linux-cifs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 01/11] cifs: fix tcon status change after tree connect
@ 2023-03-10 15:32 Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

After cifs_tree_connect, tcon status should not be
set to TID_GOOD. There could still be files that need
reopen. The status should instead be changed to
TID_NEED_FILES_INVALIDATE. That way, after reopen of
files, the status can be changed to TID_GOOD.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/cifsglob.h |  2 +-
 fs/cifs/connect.c  | 14 ++++++++++----
 fs/cifs/dfs.c      | 16 +++++++++++-----
 fs/cifs/file.c     | 10 +++++-----
 4 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index a99883f16d94..8a37b1553dc6 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -137,7 +137,7 @@ enum tid_status_enum {
 	TID_NEED_RECON,
 	TID_NEED_TCON,
 	TID_IN_TCON,
-	TID_NEED_FILES_INVALIDATE, /* currently unused */
+	TID_NEED_FILES_INVALIDATE,
 	TID_IN_FILES_INVALIDATE
 };
 
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5233f14f0636..3d07729c91a1 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4038,9 +4038,15 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
-	if (tcon->ses->ses_status != SES_GOOD ||
-	    (tcon->status != TID_NEW &&
-	    tcon->status != TID_NEED_TCON)) {
+	if (tcon->status != TID_GOOD &&
+	    tcon->status != TID_NEW &&
+	    tcon->status != TID_NEED_RECON) {
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (tcon->status == TID_NEED_FILES_INVALIDATE ||
+	    tcon->status == TID_GOOD) {
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
@@ -4051,7 +4057,7 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 	if (rc) {
 		spin_lock(&tcon->tc_lock);
 		if (tcon->status == TID_IN_TCON)
-			tcon->status = TID_NEED_TCON;
+			tcon->status = TID_NEED_RECON;
 		spin_unlock(&tcon->tc_lock);
 	} else {
 		spin_lock(&tcon->tc_lock);
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index b64d20374b9c..d37af02902c5 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -479,9 +479,15 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
-	if (tcon->ses->ses_status != SES_GOOD ||
-	    (tcon->status != TID_NEW &&
-	    tcon->status != TID_NEED_TCON)) {
+	if (tcon->status != TID_GOOD &&
+	    tcon->status != TID_NEW &&
+	    tcon->status != TID_NEED_RECON) {
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (tcon->status == TID_NEED_FILES_INVALIDATE ||
+	    tcon->status == TID_GOOD) {
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
@@ -529,12 +535,12 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 	if (rc) {
 		spin_lock(&tcon->tc_lock);
 		if (tcon->status == TID_IN_TCON)
-			tcon->status = TID_NEED_TCON;
+			tcon->status = TID_NEED_RECON;
 		spin_unlock(&tcon->tc_lock);
 	} else {
 		spin_lock(&tcon->tc_lock);
 		if (tcon->status == TID_IN_TCON)
-			tcon->status = TID_GOOD;
+			tcon->status = TID_NEED_FILES_INVALIDATE;
 		spin_unlock(&tcon->tc_lock);
 		tcon->need_reconnect = false;
 	}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4d4a2d82636d..96d865e108f4 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,13 +174,13 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
 	struct list_head *tmp1;
 
 	/* only send once per connect */
-	spin_lock(&tcon->ses->ses_lock);
-	if ((tcon->ses->ses_status != SES_GOOD) || (tcon->status != TID_NEED_RECON)) {
-		spin_unlock(&tcon->ses->ses_lock);
+	spin_lock(&tcon->tc_lock);
+	if (tcon->status != TID_NEED_FILES_INVALIDATE) {
+		spin_unlock(&tcon->tc_lock);
 		return;
 	}
 	tcon->status = TID_IN_FILES_INVALIDATE;
-	spin_unlock(&tcon->ses->ses_lock);
+	spin_unlock(&tcon->tc_lock);
 
 	/* list all files open on tree connection and mark them invalid */
 	spin_lock(&tcon->open_file_lock);
@@ -194,7 +194,7 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
 	invalidate_all_cached_dirs(tcon);
 	spin_lock(&tcon->tc_lock);
 	if (tcon->status == TID_IN_FILES_INVALIDATE)
-		tcon->status = TID_NEED_TCON;
+		tcon->status = TID_GOOD;
 	spin_unlock(&tcon->tc_lock);
 
 	/*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 03/11] cifs: avoid race conditions with parallel reconnects Shyam Prasad N
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

Before my changes to how multichannel reconnects work, the
primary channel was always used to do a non-binding session
setup. With my changes, that is not the case anymore.
Missed this place where channel at index 0 was forcibly
updated with the signing key.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/smb2transport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c
index 381babc1212c..d827b7547ffa 100644
--- a/fs/cifs/smb2transport.c
+++ b/fs/cifs/smb2transport.c
@@ -425,7 +425,7 @@ generate_smb3signingkey(struct cifs_ses *ses,
 
 		/* safe to access primary channel, since it will never go away */
 		spin_lock(&ses->chan_lock);
-		memcpy(ses->chans[0].signkey, ses->smb3signingkey,
+		memcpy(ses->chans[chan_index].signkey, ses->smb3signingkey,
 		       SMB3_SIGN_KEY_SIZE);
 		spin_unlock(&ses->chan_lock);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 03/11] cifs: avoid race conditions with parallel reconnects
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

When multiple processes/channels do reconnects in parallel
we used to return success immediately
negotiate/session-setup/tree-connect, causing race conditions
between processes that enter the function in parallel.
This caused several errors related to session not found to
show up during parallel reconnects.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/cifs_debug.c    |   5 ++
 fs/cifs/cifsglob.h      |  14 +++-
 fs/cifs/connect.c       | 145 ++++++++++++++++++++++++++++++++++++----
 fs/cifs/dfs.c           |  33 +++++++++
 fs/cifs/misc.c          |   2 +
 fs/cifs/sess.c          |   8 ++-
 fs/cifs/smb2pdu.c       |  35 +++++-----
 fs/cifs/smb2transport.c |  17 ++++-
 8 files changed, 225 insertions(+), 34 deletions(-)

diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c
index 1911f7016fa1..4391c7aac3cb 100644
--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -406,6 +406,11 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 				ses->capabilities, ses->ses_status);
 			}
 
+			if (ses->chan_count > 1)
+				seq_printf(m, "\n\tChannel reconnect bitmaps: 0x%lx 0x%lx",
+					   ses->chans_need_reconnect,
+					   ses->chans_in_reconnect);
+
 			seq_printf(m, "\n\tSecurity type: %s ",
 				get_security_type_str(server->ops->select_sectype(server, ses->sectype)));
 
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 8a37b1553dc6..81ff13e41f97 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -624,6 +624,7 @@ struct TCP_Server_Info {
 #ifdef CONFIG_NET_NS
 	struct net *net;
 #endif
+	wait_queue_head_t reconnect_q;	/* for handling parallel reconnects */
 	wait_queue_head_t response_q;
 	wait_queue_head_t request_q; /* if more than maxmpx to srvr must block*/
 	spinlock_t mid_lock;  /* protect mid queue and it's entries */
@@ -1002,7 +1003,6 @@ iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
 }
 
 struct cifs_chan {
-	unsigned int in_reconnect : 1; /* if session setup in progress for this channel */
 	struct TCP_Server_Info *server;
 	struct cifs_server_iface *iface; /* interface in use */
 	__u8 signkey[SMB3_SIGN_KEY_SIZE];
@@ -1017,6 +1017,7 @@ struct cifs_ses {
 	struct list_head tcon_list;
 	struct cifs_tcon *tcon_ipc;
 	spinlock_t ses_lock;  /* protect anything here that is not protected */
+	wait_queue_head_t reconnect_q;	/* for handling parallel reconnects */
 	struct mutex session_mutex;
 	struct TCP_Server_Info *server;	/* pointer to server info */
 	int ses_count;		/* reference counter */
@@ -1076,7 +1077,9 @@ struct cifs_ses {
 #define CIFS_CHAN_NEEDS_RECONNECT(ses, index)	\
 	test_bit((index), &(ses)->chans_need_reconnect)
 #define CIFS_CHAN_IN_RECONNECT(ses, index)	\
-	((ses)->chans[(index)].in_reconnect)
+	test_bit((index), &(ses)->chans_in_reconnect)
+#define CIFS_ALL_CHANS_IN_RECONNECT(ses)	\
+	((ses)->chans_in_reconnect == CIFS_ALL_CHANNELS_SET(ses))
 
 	struct cifs_chan chans[CIFS_MAX_CHANNELS];
 	size_t chan_count;
@@ -1092,8 +1095,14 @@ struct cifs_ses {
 	 * channels are marked for needing reconnection. This will
 	 * enable the sessions on top to continue to live till any
 	 * of the channels below are active.
+	 *
+	 * chans_in_reconnect is a bitmap indicating which channels
+	 * are in the process of reconnecting. This is needed
+	 * to avoid race conditions between processes which
+	 * do channel binding in parallel.
 	 */
 	unsigned long chans_need_reconnect;
+	unsigned long chans_in_reconnect;
 	/* ========= end: protected by chan_lock ======== */
 	struct cifs_ses *dfs_root_ses;
 };
@@ -1145,6 +1154,7 @@ struct cifs_tcon {
 	int tc_count;
 	struct list_head rlist; /* reconnect list */
 	spinlock_t tc_lock;  /* protect anything here that is not protected */
+	wait_queue_head_t reconnect_q;	/* for handling parallel reconnects */
 	atomic_t num_local_opens;  /* num of all opens including disconnected */
 	atomic_t num_remote_opens; /* num of all network opens on server */
 	struct list_head openFileList;
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 3d07729c91a1..7b103f69432e 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -212,8 +212,10 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
 			cifs_chan_update_iface(ses, server);
 
 		spin_lock(&ses->chan_lock);
-		if (!mark_smb_session && cifs_chan_needs_reconnect(ses, server))
-			goto next_session;
+		if (!mark_smb_session && cifs_chan_needs_reconnect(ses, server)) {
+			spin_unlock(&ses->chan_lock);
+			continue;
+		}
 
 		if (mark_smb_session)
 			CIFS_SET_ALL_CHANS_NEED_RECONNECT(ses);
@@ -221,22 +223,28 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
 			cifs_chan_set_need_reconnect(ses, server);
 
 		/* If all channels need reconnect, then tcon needs reconnect */
-		if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses))
-			goto next_session;
+		if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
+			spin_unlock(&ses->chan_lock);
+			continue;
+		}
+		spin_unlock(&ses->chan_lock);
 
+		spin_lock(&ses->ses_lock);
 		ses->ses_status = SES_NEED_RECON;
+		spin_unlock(&ses->ses_lock);
 
 		list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
 			tcon->need_reconnect = true;
+			spin_lock(&tcon->tc_lock);
 			tcon->status = TID_NEED_RECON;
+			spin_unlock(&tcon->tc_lock);
 		}
 		if (ses->tcon_ipc) {
 			ses->tcon_ipc->need_reconnect = true;
+			spin_lock(&ses->tcon_ipc->tc_lock);
 			ses->tcon_ipc->status = TID_NEED_RECON;
+			spin_unlock(&ses->tcon_ipc->tc_lock);
 		}
-
-next_session:
-		spin_unlock(&ses->chan_lock);
 	}
 	spin_unlock(&cifs_tcp_ses_lock);
 }
@@ -1596,6 +1604,7 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx,
 	}
 	init_waitqueue_head(&tcp_ses->response_q);
 	init_waitqueue_head(&tcp_ses->request_q);
+	init_waitqueue_head(&tcp_ses->reconnect_q);
 	INIT_LIST_HEAD(&tcp_ses->pending_mid_q);
 	mutex_init(&tcp_ses->_srv_mutex);
 	memcpy(tcp_ses->workstation_RFC1001_name,
@@ -3648,17 +3657,55 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
 			struct TCP_Server_Info *server)
 {
 	int rc = 0;
+	int retries = server->nr_targets;
 
 	if (!server->ops->need_neg || !server->ops->negotiate)
 		return -ENOSYS;
 
+check_again:
 	/* only send once per connect */
 	spin_lock(&server->srv_lock);
-	if (!server->ops->need_neg(server) ||
+	if (server->tcpStatus != CifsGood &&
+	    server->tcpStatus != CifsNew &&
 	    server->tcpStatus != CifsNeedNegotiate) {
+		spin_unlock(&server->srv_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (!server->ops->need_neg(server) &&
+	    server->tcpStatus == CifsGood) {
 		spin_unlock(&server->srv_lock);
 		return 0;
 	}
+
+	/* another process is in the processs of negotiating */
+	while (server->tcpStatus == CifsInNegotiate) {
+		spin_unlock(&server->srv_lock);
+		rc = wait_event_interruptible_timeout(server->reconnect_q,
+						      (server->tcpStatus != CifsInNegotiate),
+						      HZ);
+		if (rc < 0) {
+			cifs_dbg(FYI, "%s: aborting negotiate due to a received signal by the process\n",
+				 __func__);
+			return -ERESTARTSYS;
+		}
+		spin_lock(&server->srv_lock);
+
+		/* are we still waiting for others */
+		if (server->tcpStatus != CifsInNegotiate) {
+			spin_unlock(&server->srv_lock);
+			goto check_again;
+		}
+
+		if (retries && --retries)
+			continue;
+
+		cifs_dbg(FYI, "gave up waiting on CifsInNegotiate\n");
+		spin_unlock(&server->srv_lock);
+		return -EHOSTDOWN;
+	}
+
+	/* now mark the server so that others don't reach here */
 	server->tcpStatus = CifsInNegotiate;
 	spin_unlock(&server->srv_lock);
 
@@ -3676,6 +3723,7 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
 			server->tcpStatus = CifsNeedNegotiate;
 		spin_unlock(&server->srv_lock);
 	}
+	wake_up(&server->reconnect_q);
 
 	return rc;
 }
@@ -3690,25 +3738,63 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
 	struct sockaddr_in6 *addr6 = (struct sockaddr_in6 *)&pserver->dstaddr;
 	struct sockaddr_in *addr = (struct sockaddr_in *)&pserver->dstaddr;
 	bool is_binding = false;
+	int retries;
+
+check_again:
+	retries = 5;
 
 	spin_lock(&ses->ses_lock);
 	if (ses->ses_status != SES_GOOD &&
 	    ses->ses_status != SES_NEW &&
 	    ses->ses_status != SES_NEED_RECON) {
 		spin_unlock(&ses->ses_lock);
-		return 0;
+		return -EHOSTDOWN;
 	}
 
 	/* only send once per connect */
 	spin_lock(&ses->chan_lock);
-	if (CIFS_ALL_CHANS_GOOD(ses) ||
-	    cifs_chan_in_reconnect(ses, server)) {
+	if (CIFS_ALL_CHANS_GOOD(ses)) {
+		if (ses->ses_status == SES_NEED_RECON)
+			ses->ses_status = SES_GOOD;
 		spin_unlock(&ses->chan_lock);
 		spin_unlock(&ses->ses_lock);
 		return 0;
 	}
-	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
+
+	/* another process is in the processs of sess setup */
+	while (cifs_chan_in_reconnect(ses, server)) {
+		spin_unlock(&ses->chan_lock);
+		spin_unlock(&ses->ses_lock);
+		rc = wait_event_interruptible_timeout(ses->reconnect_q,
+						      (!cifs_chan_in_reconnect(ses, server)),
+						      HZ);
+		if (rc < 0) {
+			cifs_dbg(FYI, "%s: aborting sess setup due to a received signal by the process\n",
+				 __func__);
+			return -ERESTARTSYS;
+		}
+		spin_lock(&ses->ses_lock);
+		spin_lock(&ses->chan_lock);
+
+		/* are we still trying to reconnect? */
+		if (!cifs_chan_in_reconnect(ses, server)) {
+			spin_unlock(&ses->chan_lock);
+			spin_unlock(&ses->ses_lock);
+			goto check_again;
+		}
+
+		if (retries && --retries)
+			continue;
+
+		cifs_dbg(FYI, "gave up waiting on cifs_chan_in_reconnect\n");
+		spin_unlock(&ses->chan_lock);
+		spin_unlock(&ses->ses_lock);
+		return -EHOSTDOWN;
+	}
+
+	/* now mark the session so that others don't reach here */
 	cifs_chan_set_in_reconnect(ses, server);
+	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
 	spin_unlock(&ses->chan_lock);
 
 	if (!is_binding)
@@ -3762,6 +3848,7 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
 		spin_unlock(&ses->chan_lock);
 		spin_unlock(&ses->ses_lock);
 	}
+	wake_up(&ses->reconnect_q);
 
 	return rc;
 }
@@ -4035,6 +4122,10 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 {
 	int rc;
 	const struct smb_version_operations *ops = tcon->ses->server->ops;
+	int retries;
+
+check_again:
+	retries = 5;
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
@@ -4050,6 +4141,35 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
+
+	/* another process is in the processs of negotiating */
+	while (tcon->status == TID_IN_TCON) {
+		spin_unlock(&tcon->tc_lock);
+		rc = wait_event_interruptible_timeout(tcon->reconnect_q,
+						      (tcon->status != TID_IN_TCON),
+						      HZ);
+		if (rc < 0) {
+			cifs_dbg(FYI, "%s: aborting tree connect due to a received signal by the process\n",
+				 __func__);
+			return -ERESTARTSYS;
+		}
+		spin_lock(&tcon->tc_lock);
+
+		/* are we still trying to reconnect? */
+		if (tcon->status != TID_IN_TCON) {
+			spin_unlock(&tcon->tc_lock);
+			goto check_again;
+		}
+
+		if (retries && --retries)
+			continue;
+
+		cifs_dbg(FYI, "gave up waiting on TID_IN_TCON\n");
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	/* now mark the tcon so that others don't reach here */
 	tcon->status = TID_IN_TCON;
 	spin_unlock(&tcon->tc_lock);
 
@@ -4066,6 +4186,7 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 		tcon->need_reconnect = false;
 		spin_unlock(&tcon->tc_lock);
 	}
+	wake_up(&tcon->reconnect_q);
 
 	return rc;
 }
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index d37af02902c5..013a399088c3 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -476,6 +476,10 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 	struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
 	char *tree;
 	struct dfs_info3_param ref = {0};
+	int retries;
+
+check_again:
+	retries = 5;
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
@@ -491,6 +495,35 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
+
+	/* another process is in the processs of negotiating */
+	while (tcon->status == TID_IN_TCON) {
+		spin_unlock(&tcon->tc_lock);
+		rc = wait_event_interruptible_timeout(tcon->reconnect_q,
+						      (tcon->status != TID_IN_TCON),
+						      HZ);
+		if (rc < 0) {
+			cifs_dbg(FYI, "%s: aborting tree connect due to a received signal by the process\n",
+				 __func__);
+			return -ERESTARTSYS;
+		}
+		spin_lock(&tcon->tc_lock);
+
+		/* are we still trying to reconnect? */
+		if (tcon->status != TID_IN_TCON) {
+			spin_unlock(&tcon->tc_lock);
+			goto check_again;
+		}
+
+		if (retries && --retries)
+			continue;
+
+		cifs_dbg(FYI, "gave up waiting on TID_IN_TCON\n");
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	/* now mark the tcon so that others don't reach here */
 	tcon->status = TID_IN_TCON;
 	spin_unlock(&tcon->tc_lock);
 
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index a0d286ee723d..5a974689fde9 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -80,6 +80,7 @@ sesInfoAlloc(void)
 		spin_lock_init(&ret_buf->iface_lock);
 		INIT_LIST_HEAD(&ret_buf->iface_list);
 		spin_lock_init(&ret_buf->chan_lock);
+		init_waitqueue_head(&ret_buf->reconnect_q);
 	}
 	return ret_buf;
 }
@@ -134,6 +135,7 @@ tconInfoAlloc(void)
 	spin_lock_init(&ret_buf->stat_lock);
 	atomic_set(&ret_buf->num_local_opens, 0);
 	atomic_set(&ret_buf->num_remote_opens, 0);
+	init_waitqueue_head(&ret_buf->reconnect_q);
 
 	return ret_buf;
 }
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index d2cbae4b5d21..b8bfebe4498e 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -94,7 +94,9 @@ cifs_chan_set_in_reconnect(struct cifs_ses *ses,
 {
 	unsigned int chan_index = cifs_ses_get_chan_index(ses, server);
 
-	ses->chans[chan_index].in_reconnect = true;
+	set_bit(chan_index, &ses->chans_in_reconnect);
+	cifs_dbg(FYI, "Set in-reconnect bitmask for chan %u; now 0x%lx\n",
+		 chan_index, ses->chans_in_reconnect);
 }
 
 void
@@ -103,7 +105,9 @@ cifs_chan_clear_in_reconnect(struct cifs_ses *ses,
 {
 	unsigned int chan_index = cifs_ses_get_chan_index(ses, server);
 
-	ses->chans[chan_index].in_reconnect = false;
+	clear_bit(chan_index, &ses->chans_in_reconnect);
+	cifs_dbg(FYI, "Cleared in-reconnect bitmask for chan %u; now 0x%lx\n",
+		 chan_index, ses->chans_in_reconnect);
 }
 
 bool
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 0e53265e1462..52318a79c848 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -215,8 +215,9 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
 		return 0;
 	}
 	spin_unlock(&ses->chan_lock);
-	cifs_dbg(FYI, "sess reconnect mask: 0x%lx, tcon reconnect: %d",
+	cifs_dbg(FYI, "sess reconnect masks: 0x%lx 0x%lx, tcon reconnect: %d",
 		 tcon->ses->chans_need_reconnect,
+		 tcon->ses->chans_in_reconnect,
 		 tcon->need_reconnect);
 
 	nls_codepage = load_nls_default();
@@ -238,9 +239,12 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
 	 * need to prevent multiple threads trying to simultaneously
 	 * reconnect the same SMB session
 	 */
+	spin_lock(&ses->ses_lock);
 	spin_lock(&ses->chan_lock);
-	if (!cifs_chan_needs_reconnect(ses, server)) {
+	if (!cifs_chan_needs_reconnect(ses, server) &&
+	    ses->ses_status == SES_GOOD) {
 		spin_unlock(&ses->chan_lock);
+		spin_unlock(&ses->ses_lock);
 
 		/* this means that we only need to tree connect */
 		if (tcon->need_reconnect)
@@ -249,6 +253,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
 		goto out;
 	}
 	spin_unlock(&ses->chan_lock);
+	spin_unlock(&ses->ses_lock);
 
 	mutex_lock(&ses->session_mutex);
 	rc = cifs_negotiate_protocol(0, ses, server);
@@ -284,7 +289,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
 	cifs_dbg(FYI, "reconnect tcon rc = %d\n", rc);
 	if (rc) {
 		/* If sess reconnected but tcon didn't, something strange ... */
-		pr_warn_once("reconnect tcon failed rc = %d\n", rc);
+		cifs_dbg(VFS, "reconnect tcon failed rc = %d\n", rc);
 		goto out;
 	}
 
@@ -1256,9 +1261,9 @@ SMB2_sess_alloc_buffer(struct SMB2_sess_data *sess_data)
 	if (rc)
 		return rc;
 
-	spin_lock(&ses->chan_lock);
-	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
-	spin_unlock(&ses->chan_lock);
+	spin_lock(&ses->ses_lock);
+	is_binding = (ses->ses_status == SES_GOOD);
+	spin_unlock(&ses->ses_lock);
 
 	if (is_binding) {
 		req->hdr.SessionId = cpu_to_le64(ses->Suid);
@@ -1416,9 +1421,9 @@ SMB2_auth_kerberos(struct SMB2_sess_data *sess_data)
 		goto out_put_spnego_key;
 	}
 
-	spin_lock(&ses->chan_lock);
-	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
-	spin_unlock(&ses->chan_lock);
+	spin_lock(&ses->ses_lock);
+	is_binding = (ses->ses_status == SES_GOOD);
+	spin_unlock(&ses->ses_lock);
 
 	/* keep session key if binding */
 	if (!is_binding) {
@@ -1542,9 +1547,9 @@ SMB2_sess_auth_rawntlmssp_negotiate(struct SMB2_sess_data *sess_data)
 
 	cifs_dbg(FYI, "rawntlmssp session setup challenge phase\n");
 
-	spin_lock(&ses->chan_lock);
-	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
-	spin_unlock(&ses->chan_lock);
+	spin_lock(&ses->ses_lock);
+	is_binding = (ses->ses_status == SES_GOOD);
+	spin_unlock(&ses->ses_lock);
 
 	/* keep existing ses id and flags if binding */
 	if (!is_binding) {
@@ -1610,9 +1615,9 @@ SMB2_sess_auth_rawntlmssp_authenticate(struct SMB2_sess_data *sess_data)
 
 	rsp = (struct smb2_sess_setup_rsp *)sess_data->iov[0].iov_base;
 
-	spin_lock(&ses->chan_lock);
-	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
-	spin_unlock(&ses->chan_lock);
+	spin_lock(&ses->ses_lock);
+	is_binding = (ses->ses_status == SES_GOOD);
+	spin_unlock(&ses->ses_lock);
 
 	/* keep existing ses id and flags if binding */
 	if (!is_binding) {
diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c
index d827b7547ffa..790acf65a092 100644
--- a/fs/cifs/smb2transport.c
+++ b/fs/cifs/smb2transport.c
@@ -81,6 +81,7 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
 	struct cifs_ses *ses = NULL;
 	int i;
 	int rc = 0;
+	bool is_binding = false;
 
 	spin_lock(&cifs_tcp_ses_lock);
 
@@ -97,9 +98,12 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
 	goto out;
 
 found:
+	spin_lock(&ses->ses_lock);
 	spin_lock(&ses->chan_lock);
-	if (cifs_chan_needs_reconnect(ses, server) &&
-	    !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
+
+	is_binding = (cifs_chan_needs_reconnect(ses, server) &&
+		      ses->ses_status == SES_GOOD);
+	if (is_binding) {
 		/*
 		 * If we are in the process of binding a new channel
 		 * to an existing session, use the master connection
@@ -107,6 +111,7 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
 		 */
 		memcpy(key, ses->smb3signingkey, SMB3_SIGN_KEY_SIZE);
 		spin_unlock(&ses->chan_lock);
+		spin_unlock(&ses->ses_lock);
 		goto out;
 	}
 
@@ -119,10 +124,12 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
 		if (chan->server == server) {
 			memcpy(key, chan->signkey, SMB3_SIGN_KEY_SIZE);
 			spin_unlock(&ses->chan_lock);
+			spin_unlock(&ses->ses_lock);
 			goto out;
 		}
 	}
 	spin_unlock(&ses->chan_lock);
+	spin_unlock(&ses->ses_lock);
 
 	cifs_dbg(VFS,
 		 "%s: Could not find channel signing key for session 0x%llx\n",
@@ -392,11 +399,15 @@ generate_smb3signingkey(struct cifs_ses *ses,
 	bool is_binding = false;
 	int chan_index = 0;
 
+	spin_lock(&ses->ses_lock);
 	spin_lock(&ses->chan_lock);
-	is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
+	is_binding = (cifs_chan_needs_reconnect(ses, server) &&
+		      ses->ses_status == SES_GOOD);
+
 	chan_index = cifs_ses_get_chan_index(ses, server);
 	/* TODO: introduce ref counting for channels when the can be freed */
 	spin_unlock(&ses->chan_lock);
+	spin_unlock(&ses->ses_lock);
 
 	/*
 	 * All channels use the same encryption/decryption keys but
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 04/11] cifs: serialize channel reconnects
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 03/11] cifs: avoid race conditions with parallel reconnects Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-10 22:40   ` Steve French
  2023-03-10 15:32 ` [PATCH 05/11] cifs: lock chan_lock outside match_session Shyam Prasad N
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

Parallel session reconnects are prone to race conditions
that are difficult to avoid cleanly. The changes so far do
ensure that parallel reconnects eventually go through.
But that can take multiple session setups on the same channel.

Avoiding that by serializing the session setups on parallel
channels. In doing so, we should avoid such issues.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/connect.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 7b103f69432e..4ea1e51c3fa5 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -222,6 +222,11 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
 		else
 			cifs_chan_set_need_reconnect(ses, server);
 
+		cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
+			 __func__,
+			 ses->chans_need_reconnect,
+			 ses->chans_in_reconnect);
+
 		/* If all channels need reconnect, then tcon needs reconnect */
 		if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
 			spin_unlock(&ses->chan_lock);
@@ -3744,6 +3749,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
 	retries = 5;
 
 	spin_lock(&ses->ses_lock);
+	cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
+		 __func__,
+		 ses->chans_need_reconnect,
+		 ses->chans_in_reconnect);
+
 	if (ses->ses_status != SES_GOOD &&
 	    ses->ses_status != SES_NEW &&
 	    ses->ses_status != SES_NEED_RECON) {
@@ -3762,11 +3772,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
 	}
 
 	/* another process is in the processs of sess setup */
-	while (cifs_chan_in_reconnect(ses, server)) {
+	while (ses->chans_in_reconnect) {
 		spin_unlock(&ses->chan_lock);
 		spin_unlock(&ses->ses_lock);
 		rc = wait_event_interruptible_timeout(ses->reconnect_q,
-						      (!cifs_chan_in_reconnect(ses, server)),
+						      (!ses->chans_in_reconnect),
 						      HZ);
 		if (rc < 0) {
 			cifs_dbg(FYI, "%s: aborting sess setup due to a received signal by the process\n",
@@ -3776,8 +3786,8 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
 		spin_lock(&ses->ses_lock);
 		spin_lock(&ses->chan_lock);
 
-		/* are we still trying to reconnect? */
-		if (!cifs_chan_in_reconnect(ses, server)) {
+		/* did the bitmask change? */
+		if (!ses->chans_in_reconnect) {
 			spin_unlock(&ses->chan_lock);
 			spin_unlock(&ses->ses_lock);
 			goto check_again;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 05/11] cifs: lock chan_lock outside match_session
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (2 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

Coverity had rightly indicated a possible deadlock
due to chan_lock being done inside match_session.
All callers of match_* functions should pick up the
necessary locks and call them.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/connect.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 4ea1e51c3fa5..fb9d9994df09 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1735,7 +1735,7 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx,
 	return ERR_PTR(rc);
 }
 
-/* this function must be called with ses_lock held */
+/* this function must be called with ses_lock and chan_lock held */
 static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 {
 	if (ctx->sectype != Unspecified &&
@@ -1746,12 +1746,8 @@ static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
 	 * If an existing session is limited to less channels than
 	 * requested, it should not be reused
 	 */
-	spin_lock(&ses->chan_lock);
-	if (ses->chan_max < ctx->max_channels) {
-		spin_unlock(&ses->chan_lock);
+	if (ses->chan_max < ctx->max_channels)
 		return 0;
-	}
-	spin_unlock(&ses->chan_lock);
 
 	switch (ses->sectype) {
 	case Kerberos:
@@ -1879,10 +1875,13 @@ cifs_find_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
 			spin_unlock(&ses->ses_lock);
 			continue;
 		}
+		spin_lock(&ses->chan_lock);
 		if (!match_session(ses, ctx)) {
+			spin_unlock(&ses->chan_lock);
 			spin_unlock(&ses->ses_lock);
 			continue;
 		}
+		spin_unlock(&ses->chan_lock);
 		spin_unlock(&ses->ses_lock);
 
 		++ses->ses_count;
@@ -2706,6 +2705,7 @@ cifs_match_super(struct super_block *sb, void *data)
 
 	spin_lock(&tcp_srv->srv_lock);
 	spin_lock(&ses->ses_lock);
+	spin_lock(&ses->chan_lock);
 	spin_lock(&tcon->tc_lock);
 	if (!match_server(tcp_srv, ctx, dfs_super_cmp) ||
 	    !match_session(ses, ctx) ||
@@ -2718,6 +2718,7 @@ cifs_match_super(struct super_block *sb, void *data)
 	rc = compare_mount_options(sb, mnt_data);
 out:
 	spin_unlock(&tcon->tc_lock);
+	spin_unlock(&ses->chan_lock);
 	spin_unlock(&ses->ses_lock);
 	spin_unlock(&tcp_srv->srv_lock);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (3 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 05/11] cifs: lock chan_lock outside match_session Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-11  4:51   ` kernel test robot
  2023-03-10 15:32 ` [PATCH 07/11] cifs: do not poll server interfaces too regularly Shyam Prasad N
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

iface_cmp used to simply do a memcmp of the two
provided struct sockaddrs. The comparison needs to do more
based on the address family. Similar logic was already
present in cifs_match_ipaddr. Doing something similar now.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/cifsglob.h  | 37 ---------------------------------
 fs/cifs/cifsproto.h |  1 +
 fs/cifs/connect.c   | 50 +++++++++++++++++++++++++++++++++++++++++++++
 fs/cifs/smb2ops.c   | 37 +++++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 37 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 81ff13e41f97..a11e7b10f607 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -965,43 +965,6 @@ release_iface(struct kref *ref)
 	kfree(iface);
 }
 
-/*
- * compare two interfaces a and b
- * return 0 if everything matches.
- * return 1 if a has higher link speed, or rdma capable, or rss capable
- * return -1 otherwise.
- */
-static inline int
-iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
-{
-	int cmp_ret = 0;
-
-	WARN_ON(!a || !b);
-	if (a->speed == b->speed) {
-		if (a->rdma_capable == b->rdma_capable) {
-			if (a->rss_capable == b->rss_capable) {
-				cmp_ret = memcmp(&a->sockaddr, &b->sockaddr,
-						 sizeof(a->sockaddr));
-				if (!cmp_ret)
-					return 0;
-				else if (cmp_ret > 0)
-					return 1;
-				else
-					return -1;
-			} else if (a->rss_capable > b->rss_capable)
-				return 1;
-			else
-				return -1;
-		} else if (a->rdma_capable > b->rdma_capable)
-			return 1;
-		else
-			return -1;
-	} else if (a->speed > b->speed)
-		return 1;
-	else
-		return -1;
-}
-
 struct cifs_chan {
 	struct TCP_Server_Info *server;
 	struct cifs_server_iface *iface; /* interface in use */
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index e2eff66eefab..30fd81268eb7 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -86,6 +86,7 @@ extern int cifs_handle_standard(struct TCP_Server_Info *server,
 				struct mid_q_entry *mid);
 extern int smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx);
 extern int smb3_parse_opt(const char *options, const char *key, char **val);
+extern int cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs);
 extern bool cifs_match_ipaddr(struct sockaddr *srcaddr, struct sockaddr *rhs);
 extern int cifs_discard_remaining_data(struct TCP_Server_Info *server);
 extern int cifs_call_async(struct TCP_Server_Info *server,
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index fb9d9994df09..b9af60417194 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1289,6 +1289,56 @@ cifs_demultiplex_thread(void *p)
 	module_put_and_kthread_exit(0);
 }
 
+int
+cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs)
+{
+	switch (srcaddr->sa_family) {
+	case AF_UNSPEC:
+		switch (rhs->sa_family) {
+		case AF_UNSPEC:
+			return 0;
+		case AF_INET:
+		case AF_INET6:
+			return 1;
+		default:
+			return -1;
+		}
+	case AF_INET: {
+		switch (rhs->sa_family) {
+		case AF_UNSPEC:
+			return -1;
+		case AF_INET:
+			struct sockaddr_in *saddr4 = (struct sockaddr_in *)srcaddr;
+			struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
+			return memcmp(&saddr4->sin_addr.s_addr,
+			       &vaddr4->sin_addr.s_addr,
+			       sizeof(struct sockaddr_in));
+		case AF_INET6:
+			return 1;
+		default:
+			return -1;
+		}
+	}
+	case AF_INET6: {
+		switch (rhs->sa_family) {
+		case AF_UNSPEC:
+		case AF_INET:
+			return -1;
+		case AF_INET6:
+			struct sockaddr_in6 *saddr6 = (struct sockaddr_in6 *)srcaddr;
+			struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
+			return memcmp(&saddr6->sin6_addr,
+				      &vaddr6->sin6_addr,
+				      sizeof(struct sockaddr_in6));
+		default:
+			return -1;
+		}
+	}
+	default:
+		return -1; /* don't expect to be here */
+	}
+}
+
 /*
  * Returns true if srcaddr isn't specified and rhs isn't specified, or
  * if srcaddr is specified and matches the IP address of the rhs argument
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 6dfb865ee9d7..0627d5e38236 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -510,6 +510,43 @@ smb3_negotiate_rsize(struct cifs_tcon *tcon, struct smb3_fs_context *ctx)
 	return rsize;
 }
 
+/*
+ * compare two interfaces a and b
+ * return 0 if everything matches.
+ * return 1 if a is rdma capable, or rss capable, or has higher link speed
+ * return -1 otherwise.
+ */
+static int
+iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
+{
+	int cmp_ret = 0;
+
+	WARN_ON(!a || !b);
+	if (a->rdma_capable == b->rdma_capable) {
+		if (a->rss_capable == b->rss_capable) {
+			if (a->speed == b->speed) {
+				cmp_ret = cifs_ipaddr_cmp((struct sockaddr *) &a->sockaddr,
+							  (struct sockaddr *) &b->sockaddr);
+				if (!cmp_ret)
+					return 0;
+				else if (cmp_ret > 0)
+					return 1;
+				else
+					return -1;
+			} else if (a->speed > b->speed)
+				return 1;
+			else
+				return -1;
+		} else if (a->rss_capable > b->rss_capable)
+			return 1;
+		else
+			return -1;
+	} else if (a->rdma_capable > b->rdma_capable)
+		return 1;
+	else
+		return -1;
+}
+
 static int
 parse_server_interfaces(struct network_interface_info_ioctl_rsp *buf,
 			size_t buf_len, struct cifs_ses *ses, bool in_mount)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/11] cifs: do not poll server interfaces too regularly
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (4 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

We have the server interface list hanging off the tcon
structure today for reasons unknown. So each tcon which is
connected to a file server can query them separately,
which is really unnecessary. To avoid this, in the query
function, we will check the time of last update of the
interface list, and avoid querying the server if it is
within a certain range.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/smb2ops.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 0627d5e38236..c342a1db33ed 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -567,6 +567,14 @@ parse_server_interfaces(struct network_interface_info_ioctl_rsp *buf,
 	p = buf;
 
 	spin_lock(&ses->iface_lock);
+	/* do not query too frequently, this time with lock held */
+	if (ses->iface_last_update &&
+	    time_before(jiffies, ses->iface_last_update +
+			(SMB_INTERFACE_POLL_INTERVAL * HZ))) {
+		spin_unlock(&ses->iface_lock);
+		return 0;
+	}
+
 	/*
 	 * Go through iface_list and do kref_put to remove
 	 * any unused ifaces. ifaces in use will be removed
@@ -733,6 +741,12 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
 	struct network_interface_info_ioctl_rsp *out_buf = NULL;
 	struct cifs_ses *ses = tcon->ses;
 
+	/* do not query too frequently */
+	if (ses->iface_last_update &&
+	    time_before(jiffies, ses->iface_last_update +
+			(SMB_INTERFACE_POLL_INTERVAL * HZ)))
+		return 0;
+
 	rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
 			FSCTL_QUERY_NETWORK_INTERFACE_INFO,
 			NULL /* no data input */, 0 /* no data input */,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (5 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 07/11] cifs: do not poll server interfaces too regularly Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2024-02-27 11:16   ` Jan Čermák
  2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

Today, if the server interfaces RSS capable, we simply
choose the fastest interface to setup a channel. This is not
a scalable approach, and does not make a lot of attempt to
distribute the connections.

This change does a weighted distribution of channels across
all the available server interfaces, where the weight is
a function of the advertised interface speed.

Also make sure that we don't mix rdma and non-rdma for channels.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/cifs_debug.c | 16 +++++++++++
 fs/cifs/cifsglob.h   |  2 ++
 fs/cifs/sess.c       | 67 +++++++++++++++++++++++++++++++++++---------
 3 files changed, 71 insertions(+), 14 deletions(-)

diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c
index 4391c7aac3cb..cee3af02e2c3 100644
--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -219,6 +219,8 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 	struct cifs_ses *ses;
 	struct cifs_tcon *tcon;
 	struct cifs_server_iface *iface;
+	size_t iface_weight = 0, iface_min_speed = 0;
+	struct cifs_server_iface *last_iface = NULL;
 	int c, i, j;
 
 	seq_puts(m,
@@ -465,11 +467,25 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 					   "\tLast updated: %lu seconds ago",
 					   ses->iface_count,
 					   (jiffies - ses->iface_last_update) / HZ);
+
+			last_iface = list_last_entry(&ses->iface_list,
+						     struct cifs_server_iface,
+						     iface_head);
+			iface_min_speed = last_iface->speed;
+
 			j = 0;
 			list_for_each_entry(iface, &ses->iface_list,
 						 iface_head) {
 				seq_printf(m, "\n\t%d)", ++j);
 				cifs_dump_iface(m, iface);
+
+				iface_weight = iface->speed / iface_min_speed;
+				seq_printf(m, "\t\tWeight (cur,total): (%zu,%zu)"
+					   "\n\t\tAllocated channels: %u\n",
+					   iface->weight_fulfilled,
+					   iface_weight,
+					   iface->num_channels);
+
 				if (is_ses_using_iface(ses, iface))
 					seq_puts(m, "\t\t[CONNECTED]\n");
 			}
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index a11e7b10f607..e3ba5c979832 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -948,6 +948,8 @@ struct cifs_server_iface {
 	struct list_head iface_head;
 	struct kref refcount;
 	size_t speed;
+	size_t weight_fulfilled;
+	unsigned int num_channels;
 	unsigned int rdma_capable : 1;
 	unsigned int rss_capable : 1;
 	unsigned int is_active : 1; /* unset if non existent */
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index b8bfebe4498e..78a7cfa75e91 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -167,7 +167,9 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
 	int left;
 	int rc = 0;
 	int tries = 0;
+	size_t iface_weight = 0, iface_min_speed = 0;
 	struct cifs_server_iface *iface = NULL, *niface = NULL;
+	struct cifs_server_iface *last_iface = NULL;
 
 	spin_lock(&ses->chan_lock);
 
@@ -196,21 +198,11 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
 	}
 	spin_unlock(&ses->chan_lock);
 
-	/*
-	 * Keep connecting to same, fastest, iface for all channels as
-	 * long as its RSS. Try next fastest one if not RSS or channel
-	 * creation fails.
-	 */
-	spin_lock(&ses->iface_lock);
-	iface = list_first_entry(&ses->iface_list, struct cifs_server_iface,
-				 iface_head);
-	spin_unlock(&ses->iface_lock);
-
 	while (left > 0) {
 
 		tries++;
 		if (tries > 3*ses->chan_max) {
-			cifs_dbg(FYI, "too many channel open attempts (%d channels left to open)\n",
+			cifs_dbg(VFS, "too many channel open attempts (%d channels left to open)\n",
 				 left);
 			break;
 		}
@@ -218,17 +210,34 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
 		spin_lock(&ses->iface_lock);
 		if (!ses->iface_count) {
 			spin_unlock(&ses->iface_lock);
+			cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
 			break;
 		}
 
+		if (!iface)
+			iface = list_first_entry(&ses->iface_list, struct cifs_server_iface,
+						 iface_head);
+		last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface,
+					     iface_head);
+		iface_min_speed = last_iface->speed;
+
 		list_for_each_entry_safe_from(iface, niface, &ses->iface_list,
 				    iface_head) {
+			/* do not mix rdma and non-rdma interfaces */
+			if (iface->rdma_capable != ses->server->rdma)
+				continue;
+
 			/* skip ifaces that are unusable */
 			if (!iface->is_active ||
 			    (is_ses_using_iface(ses, iface) &&
-			     !iface->rss_capable)) {
+			     !iface->rss_capable))
+				continue;
+
+			/* check if we already allocated enough channels */
+			iface_weight = iface->speed / iface_min_speed;
+
+			if (iface->weight_fulfilled >= iface_weight)
 				continue;
-			}
 
 			/* take ref before unlock */
 			kref_get(&iface->refcount);
@@ -245,10 +254,17 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
 				continue;
 			}
 
-			cifs_dbg(FYI, "successfully opened new channel on iface:%pIS\n",
+			iface->num_channels++;
+			iface->weight_fulfilled++;
+			cifs_dbg(VFS, "successfully opened new channel on iface:%pIS\n",
 				 &iface->sockaddr);
 			break;
 		}
+
+		/* reached end of list. reset weight_fulfilled */
+		if (list_entry_is_head(iface, &ses->iface_list, iface_head))
+			list_for_each_entry(iface, &ses->iface_list, iface_head)
+				iface->weight_fulfilled = 0;
 		spin_unlock(&ses->iface_lock);
 
 		left--;
@@ -267,8 +283,10 @@ int
 cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
 {
 	unsigned int chan_index;
+	size_t iface_weight = 0, iface_min_speed = 0;
 	struct cifs_server_iface *iface = NULL;
 	struct cifs_server_iface *old_iface = NULL;
+	struct cifs_server_iface *last_iface = NULL;
 	int rc = 0;
 
 	spin_lock(&ses->chan_lock);
@@ -288,13 +306,34 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
 	spin_unlock(&ses->chan_lock);
 
 	spin_lock(&ses->iface_lock);
+	if (!ses->iface_count) {
+		spin_unlock(&ses->iface_lock);
+		cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
+		return 0;
+	}
+
+	last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface,
+				     iface_head);
+	iface_min_speed = last_iface->speed;
+
 	/* then look for a new one */
 	list_for_each_entry(iface, &ses->iface_list, iface_head) {
+		/* do not mix rdma and non-rdma interfaces */
+		if (iface->rdma_capable != server->rdma)
+			continue;
+
 		if (!iface->is_active ||
 		    (is_ses_using_iface(ses, iface) &&
 		     !iface->rss_capable)) {
 			continue;
 		}
+
+		/* check if we already allocated enough channels */
+		iface_weight = iface->speed / iface_min_speed;
+
+		if (iface->weight_fulfilled >= iface_weight)
+			continue;
+
 		kref_get(&iface->refcount);
 		break;
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/11] cifs: account for primary channel in the interface list
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (6 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-13  5:27   ` kernel test robot
  2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

The refcounting of server interfaces should account
for the primary channel too. Although this is not
strictly necessary, doing so will account for the primary
channel in DebugData.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/sess.c    | 40 +++++++++++++++++++++++++++++++++++-----
 fs/cifs/smb2ops.c |  6 ++++++
 2 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index 78a7cfa75e91..9b51b2309e9c 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -291,11 +291,6 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
 
 	spin_lock(&ses->chan_lock);
 	chan_index = cifs_ses_get_chan_index(ses, server);
-	if (!chan_index) {
-		spin_unlock(&ses->chan_lock);
-		return 0;
-	}
-
 	if (ses->chans[chan_index].iface) {
 		old_iface = ses->chans[chan_index].iface;
 		if (old_iface->is_active) {
@@ -318,6 +313,16 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
 
 	/* then look for a new one */
 	list_for_each_entry(iface, &ses->iface_list, iface_head) {
+		if (!chan_index) {
+			/* if we're trying to get the updated iface for primary channel */
+			if (!cifs_match_ipaddr((struct sockaddr *) &server->dstaddr,
+					       (struct sockaddr *) &iface->sockaddr))
+				continue;
+
+			kref_get(&iface->refcount);
+			break;
+		}
+
 		/* do not mix rdma and non-rdma interfaces */
 		if (iface->rdma_capable != server->rdma)
 			continue;
@@ -344,16 +349,41 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
 		cifs_dbg(FYI, "unable to find a suitable iface\n");
 	}
 
+	if (!chan_index && !iface) {
+		cifs_dbg(VFS, "unable to get the interface matching: %pIS\n",
+			 &server->dstaddr);
+		spin_unlock(&ses->iface_lock);
+		return 0;
+	}
+
 	/* now drop the ref to the current iface */
 	if (old_iface && iface) {
 		cifs_dbg(FYI, "replacing iface: %pIS with %pIS\n",
 			 &old_iface->sockaddr,
 			 &iface->sockaddr);
+
+		old_iface->num_channels--;
+		if (--old_iface->weight_fulfilled < 0)
+			old_iface->weight_fulfilled = 0;
+		iface->num_channels++;
+		iface->weight_fulfilled++;
+
 		kref_put(&old_iface->refcount, release_iface);
 	} else if (old_iface) {
 		cifs_dbg(FYI, "releasing ref to iface: %pIS\n",
 			 &old_iface->sockaddr);
+
+		old_iface->num_channels--;
+		if (--old_iface->weight_fulfilled < 0)
+			old_iface->weight_fulfilled = 0;
+
 		kref_put(&old_iface->refcount, release_iface);
+	} else if (!chan_index) {
+		/* special case: update interface for primary channel */
+		cifs_dbg(FYI, "referencing primary channel iface: %pIS\n",
+			 &iface->sockaddr);
+		iface->num_channels++;
+		iface->weight_fulfilled++;
 	} else {
 		WARN_ON(!iface);
 		cifs_dbg(FYI, "adding new iface: %pIS\n", &iface->sockaddr);
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index c342a1db33ed..a5e53cb1ac49 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -740,6 +740,7 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
 	unsigned int ret_data_len = 0;
 	struct network_interface_info_ioctl_rsp *out_buf = NULL;
 	struct cifs_ses *ses = tcon->ses;
+	struct TCP_Server_Info *pserver;
 
 	/* do not query too frequently */
 	if (ses->iface_last_update &&
@@ -764,6 +765,11 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
 	if (rc)
 		goto out;
 
+	/* check if iface is still active */
+	pserver = ses->chans[0].server;
+	if (pserver && !cifs_chan_is_iface_active(ses, pserver))
+		cifs_chan_update_iface(ses, pserver);
+
 out:
 	kfree(out_buf);
 	return rc;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/11] cifs: handle when server stops supporting multichannel
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (7 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-13  6:09   ` kernel test robot
  2023-03-10 15:32 ` [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces Shyam Prasad N
  2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
  10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

When a server stops supporting multichannel, we will
keep attempting reconnects to the secondary channels today.
Avoid this by freeing extra channels when negotiate
returns no multichannel support.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/cifsproto.h |  2 ++
 fs/cifs/connect.c   |  6 ++++++
 fs/cifs/sess.c      | 35 +++++++++++++++++++++++++++++++++++
 fs/cifs/smb2ops.c   |  8 ++++++++
 4 files changed, 51 insertions(+)

diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 30fd81268eb7..343e582672b9 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -638,6 +638,8 @@ cifs_chan_needs_reconnect(struct cifs_ses *ses,
 bool
 cifs_chan_is_iface_active(struct cifs_ses *ses,
 			  struct TCP_Server_Info *server);
+void
+cifs_disable_extra_channels(struct cifs_ses *ses);
 int
 cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server);
 int
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index b9af60417194..6375b08b9bcb 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -130,6 +130,12 @@ static void smb2_query_server_interfaces(struct work_struct *work)
 	if (rc) {
 		cifs_dbg(FYI, "%s: failed to query server interfaces: %d\n",
 				__func__, rc);
+
+		if (rc == -EOPNOTSUPP) {
+			/* cancel polling of interfaces and do not resched */
+			cancel_delayed_work_sync(&tcon->query_interfaces);
+			return;
+		}
 	}
 
 	queue_delayed_work(cifsiod_wq, &tcon->query_interfaces,
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index 9b51b2309e9c..34ae292bdff2 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -274,6 +274,41 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
 	return new_chan_count - old_chan_count;
 }
 
+/*
+ * called when multichannel is disabled by the server
+ */
+void
+cifs_disable_extra_channels(struct cifs_ses *ses)
+{
+	int i, chan_count;
+	struct cifs_server_iface *iface = NULL;
+	struct TCP_Server_Info *server = NULL;
+
+	spin_lock(&ses->chan_lock);
+	chan_count = ses->chan_count;
+	ses->chan_count = 1;
+	for (i = 1; i < chan_count; i++) {
+		iface = ses->chans[i].iface;
+		server = ses->chans[i].server;
+		spin_unlock(&ses->chan_lock);
+
+		if (iface) {
+			spin_lock(&ses->iface_lock);
+			kref_put(&iface->refcount, release_iface);
+			iface->num_channels--;
+			if (--iface->weight_fulfilled < 0)
+				iface->weight_fulfilled = 0;
+			spin_unlock(&ses->iface_lock);
+		}
+		cifs_put_tcp_session(server, 0);
+
+		spin_lock(&ses->chan_lock);
+		ses->chans[i].iface = NULL;
+		ses->chans[i].server = NULL;
+	}
+	spin_unlock(&ses->chan_lock);
+}
+
 /*
  * update the iface for the channel if necessary.
  * will return 0 when iface is updated, 1 if removed, 2 otherwise
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index a5e53cb1ac49..c7a8a6049291 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -411,6 +411,14 @@ smb2_negotiate(const unsigned int xid,
 	/* BB we probably don't need to retry with modern servers */
 	if (rc == -EAGAIN)
 		rc = -EHOSTDOWN;
+
+	if (!rc &&
+	    ses->chan_count > 1 &&
+	    !(server->capabilities & SMB2_GLOBAL_CAP_MULTI_CHANNEL)) {
+		cifs_dbg(VFS, "server %s does not support multichannel anymore\n", ses->server->hostname);
+		cifs_disable_extra_channels(ses);
+	}
+
 	return rc;
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (8 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
  2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
  10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
  To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N

When querying server interfaces returns -EOPNOTSUPP,
clear the list of interfaces. Assumption is that multichannel
would be disabled too.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/smb2ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index c7a8a6049291..9ca55038b3db 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -763,7 +763,7 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
 	if (rc == -EOPNOTSUPP) {
 		cifs_dbg(FYI,
 			 "server does not support query network interfaces\n");
-		goto out;
+		ret_data_len = 0;
 	} else if (rc != 0) {
 		cifs_tcon_dbg(VFS, "error %d on ioctl to get interface list\n", rc);
 		goto out;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/11] cifs: serialize channel reconnects
  2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
@ 2023-03-10 22:40   ` Steve French
  0 siblings, 0 replies; 31+ messages in thread
From: Steve French @ 2023-03-10 22:40 UTC (permalink / raw)
  To: Shyam Prasad N; +Cc: bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N

Paulo had a few comments on patch 3 of the series, so have applied the
first two, and will wait on patch 3 and 4, but tentatively merged the
first two:

a7c73e3d5842 (HEAD -> for-next, origin/for-next) cifs: generate
signkey for the channel that's reconnecting
d169aadd50c7 cifs: fix tcon status change after tree connect

On Fri, Mar 10, 2023 at 9:33 AM Shyam Prasad N <nspmangalore@gmail.com> wrote:
>
> Parallel session reconnects are prone to race conditions
> that are difficult to avoid cleanly. The changes so far do
> ensure that parallel reconnects eventually go through.
> But that can take multiple session setups on the same channel.
>
> Avoiding that by serializing the session setups on parallel
> channels. In doing so, we should avoid such issues.
>
> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> ---
>  fs/cifs/connect.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index 7b103f69432e..4ea1e51c3fa5 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -222,6 +222,11 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
>                 else
>                         cifs_chan_set_need_reconnect(ses, server);
>
> +               cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
> +                        __func__,
> +                        ses->chans_need_reconnect,
> +                        ses->chans_in_reconnect);
> +
>                 /* If all channels need reconnect, then tcon needs reconnect */
>                 if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
>                         spin_unlock(&ses->chan_lock);
> @@ -3744,6 +3749,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
>         retries = 5;
>
>         spin_lock(&ses->ses_lock);
> +       cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
> +                __func__,
> +                ses->chans_need_reconnect,
> +                ses->chans_in_reconnect);
> +
>         if (ses->ses_status != SES_GOOD &&
>             ses->ses_status != SES_NEW &&
>             ses->ses_status != SES_NEED_RECON) {
> @@ -3762,11 +3772,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
>         }
>
>         /* another process is in the processs of sess setup */
> -       while (cifs_chan_in_reconnect(ses, server)) {
> +       while (ses->chans_in_reconnect) {
>                 spin_unlock(&ses->chan_lock);
>                 spin_unlock(&ses->ses_lock);
>                 rc = wait_event_interruptible_timeout(ses->reconnect_q,
> -                                                     (!cifs_chan_in_reconnect(ses, server)),
> +                                                     (!ses->chans_in_reconnect),
>                                                       HZ);
>                 if (rc < 0) {
>                         cifs_dbg(FYI, "%s: aborting sess setup due to a received signal by the process\n",
> @@ -3776,8 +3786,8 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
>                 spin_lock(&ses->ses_lock);
>                 spin_lock(&ses->chan_lock);
>
> -               /* are we still trying to reconnect? */
> -               if (!cifs_chan_in_reconnect(ses, server)) {
> +               /* did the bitmask change? */
> +               if (!ses->chans_in_reconnect) {
>                         spin_unlock(&ses->chan_lock);
>                         spin_unlock(&ses->ses_lock);
>                         goto check_again;
> --
> 2.34.1
>


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp
  2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
@ 2023-03-11  4:51   ` kernel test robot
  0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-03-11  4:51 UTC (permalink / raw)
  To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
  Cc: llvm, oe-kbuild-all, Shyam Prasad N

Hi Shyam,

I love your patch! Yet something to improve:

[auto build test ERROR on cifs/for-next]
[also build test ERROR on linus/master v6.3-rc1 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
base:   git://git.samba.org/sfrench/cifs-2.6.git for-next
patch link:    https://lore.kernel.org/r/20230310153211.10982-6-sprasad%40microsoft.com
patch subject: [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp
config: x86_64-randconfig-a001 (https://download.01.org/0day-ci/archive/20230311/202303111213.VsaCo9Ff-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/799e15f02b7da48acdde0b568eef1deb23aa32ed
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
        git checkout 799e15f02b7da48acdde0b568eef1deb23aa32ed
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash fs/cifs/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303111213.VsaCo9Ff-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

>> fs/cifs/connect.c:1311:4: error: expected expression
                           struct sockaddr_in *saddr4 = (struct sockaddr_in *)srcaddr;
                           ^
>> fs/cifs/connect.c:1313:19: error: use of undeclared identifier 'saddr4'; did you mean 'vaddr4'?
                           return memcmp(&saddr4->sin_addr.s_addr,
                                          ^~~~~~
                                          vaddr4
   fs/cifs/connect.c:1312:24: note: 'vaddr4' declared here
                           struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
                                               ^
>> fs/cifs/connect.c:1312:24: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
                           struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
                                               ^
   fs/cifs/connect.c:1328:4: error: expected expression
                           struct sockaddr_in6 *saddr6 = (struct sockaddr_in6 *)srcaddr;
                           ^
>> fs/cifs/connect.c:1330:19: error: use of undeclared identifier 'saddr6'; did you mean 'vaddr6'?
                           return memcmp(&saddr6->sin6_addr,
                                          ^~~~~~
                                          vaddr6
   fs/cifs/connect.c:1329:25: note: 'vaddr6' declared here
                           struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
                                                ^
   fs/cifs/connect.c:1329:25: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
                           struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
                                                ^
   2 warnings and 4 errors generated.


vim +1311 fs/cifs/connect.c

  1291	
  1292	int
  1293	cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs)
  1294	{
  1295		switch (srcaddr->sa_family) {
  1296		case AF_UNSPEC:
  1297			switch (rhs->sa_family) {
  1298			case AF_UNSPEC:
  1299				return 0;
  1300			case AF_INET:
  1301			case AF_INET6:
  1302				return 1;
  1303			default:
  1304				return -1;
  1305			}
  1306		case AF_INET: {
  1307			switch (rhs->sa_family) {
  1308			case AF_UNSPEC:
  1309				return -1;
  1310			case AF_INET:
> 1311				struct sockaddr_in *saddr4 = (struct sockaddr_in *)srcaddr;
> 1312				struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
> 1313				return memcmp(&saddr4->sin_addr.s_addr,
  1314				       &vaddr4->sin_addr.s_addr,
  1315				       sizeof(struct sockaddr_in));
  1316			case AF_INET6:
  1317				return 1;
  1318			default:
  1319				return -1;
  1320			}
  1321		}
  1322		case AF_INET6: {
  1323			switch (rhs->sa_family) {
  1324			case AF_UNSPEC:
  1325			case AF_INET:
  1326				return -1;
  1327			case AF_INET6:
  1328				struct sockaddr_in6 *saddr6 = (struct sockaddr_in6 *)srcaddr;
  1329				struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
> 1330				return memcmp(&saddr6->sin6_addr,
  1331					      &vaddr6->sin6_addr,
  1332					      sizeof(struct sockaddr_in6));
  1333			default:
  1334				return -1;
  1335			}
  1336		}
  1337		default:
  1338			return -1; /* don't expect to be here */
  1339		}
  1340	}
  1341	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 09/11] cifs: account for primary channel in the interface list
  2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
@ 2023-03-13  5:27   ` kernel test robot
  0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-03-13  5:27 UTC (permalink / raw)
  To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
  Cc: oe-kbuild-all, Shyam Prasad N

Hi Shyam,

I love your patch! Perhaps something to improve:

[auto build test WARNING on cifs/for-next]
[also build test WARNING on linus/master v6.3-rc2 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
base:   git://git.samba.org/sfrench/cifs-2.6.git for-next
patch link:    https://lore.kernel.org/r/20230310153211.10982-9-sprasad%40microsoft.com
patch subject: [PATCH 09/11] cifs: account for primary channel in the interface list
config: i386-randconfig-m021 (https://download.01.org/0day-ci/archive/20230313/202303131349.VOlTuw2U-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303131349.VOlTuw2U-lkp@intel.com/

New smatch warnings:
fs/cifs/sess.c:366 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.

Old smatch warnings:
fs/cifs/sess.c:377 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.

vim +366 fs/cifs/sess.c

   276	
   277	/*
   278	 * update the iface for the channel if necessary.
   279	 * will return 0 when iface is updated, 1 if removed, 2 otherwise
   280	 * Must be called with chan_lock held.
   281	 */
   282	int
   283	cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
   284	{
   285		unsigned int chan_index;
   286		size_t iface_weight = 0, iface_min_speed = 0;
   287		struct cifs_server_iface *iface = NULL;
   288		struct cifs_server_iface *old_iface = NULL;
   289		struct cifs_server_iface *last_iface = NULL;
   290		int rc = 0;
   291	
   292		spin_lock(&ses->chan_lock);
   293		chan_index = cifs_ses_get_chan_index(ses, server);
   294		if (ses->chans[chan_index].iface) {
   295			old_iface = ses->chans[chan_index].iface;
   296			if (old_iface->is_active) {
   297				spin_unlock(&ses->chan_lock);
   298				return 1;
   299			}
   300		}
   301		spin_unlock(&ses->chan_lock);
   302	
   303		spin_lock(&ses->iface_lock);
   304		if (!ses->iface_count) {
   305			spin_unlock(&ses->iface_lock);
   306			cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
   307			return 0;
   308		}
   309	
   310		last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface,
   311					     iface_head);
   312		iface_min_speed = last_iface->speed;
   313	
   314		/* then look for a new one */
   315		list_for_each_entry(iface, &ses->iface_list, iface_head) {
   316			if (!chan_index) {
   317				/* if we're trying to get the updated iface for primary channel */
   318				if (!cifs_match_ipaddr((struct sockaddr *) &server->dstaddr,
   319						       (struct sockaddr *) &iface->sockaddr))
   320					continue;
   321	
   322				kref_get(&iface->refcount);
   323				break;
   324			}
   325	
   326			/* do not mix rdma and non-rdma interfaces */
   327			if (iface->rdma_capable != server->rdma)
   328				continue;
   329	
   330			if (!iface->is_active ||
   331			    (is_ses_using_iface(ses, iface) &&
   332			     !iface->rss_capable)) {
   333				continue;
   334			}
   335	
   336			/* check if we already allocated enough channels */
   337			iface_weight = iface->speed / iface_min_speed;
   338	
   339			if (iface->weight_fulfilled >= iface_weight)
   340				continue;
   341	
   342			kref_get(&iface->refcount);
   343			break;
   344		}
   345	
   346		if (list_entry_is_head(iface, &ses->iface_list, iface_head)) {
   347			rc = 1;
   348			iface = NULL;
   349			cifs_dbg(FYI, "unable to find a suitable iface\n");
   350		}
   351	
   352		if (!chan_index && !iface) {
   353			cifs_dbg(VFS, "unable to get the interface matching: %pIS\n",
   354				 &server->dstaddr);
   355			spin_unlock(&ses->iface_lock);
   356			return 0;
   357		}
   358	
   359		/* now drop the ref to the current iface */
   360		if (old_iface && iface) {
   361			cifs_dbg(FYI, "replacing iface: %pIS with %pIS\n",
   362				 &old_iface->sockaddr,
   363				 &iface->sockaddr);
   364	
   365			old_iface->num_channels--;
 > 366			if (--old_iface->weight_fulfilled < 0)
   367				old_iface->weight_fulfilled = 0;
   368			iface->num_channels++;
   369			iface->weight_fulfilled++;
   370	
   371			kref_put(&old_iface->refcount, release_iface);
   372		} else if (old_iface) {
   373			cifs_dbg(FYI, "releasing ref to iface: %pIS\n",
   374				 &old_iface->sockaddr);
   375	
   376			old_iface->num_channels--;
   377			if (--old_iface->weight_fulfilled < 0)
   378				old_iface->weight_fulfilled = 0;
   379	
   380			kref_put(&old_iface->refcount, release_iface);
   381		} else if (!chan_index) {
   382			/* special case: update interface for primary channel */
   383			cifs_dbg(FYI, "referencing primary channel iface: %pIS\n",
   384				 &iface->sockaddr);
   385			iface->num_channels++;
   386			iface->weight_fulfilled++;
   387		} else {
   388			WARN_ON(!iface);
   389			cifs_dbg(FYI, "adding new iface: %pIS\n", &iface->sockaddr);
   390		}
   391		spin_unlock(&ses->iface_lock);
   392	
   393		spin_lock(&ses->chan_lock);
   394		chan_index = cifs_ses_get_chan_index(ses, server);
   395		ses->chans[chan_index].iface = iface;
   396	
   397		/* No iface is found. if secondary chan, drop connection */
   398		if (!iface && CIFS_SERVER_IS_CHAN(server))
   399			ses->chans[chan_index].server = NULL;
   400	
   401		spin_unlock(&ses->chan_lock);
   402	
   403		if (!iface && CIFS_SERVER_IS_CHAN(server))
   404			cifs_put_tcp_session(server, false);
   405	
   406		return rc;
   407	}
   408	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 10/11] cifs: handle when server stops supporting multichannel
  2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
@ 2023-03-13  6:09   ` kernel test robot
  0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-03-13  6:09 UTC (permalink / raw)
  To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
  Cc: oe-kbuild-all, Shyam Prasad N

Hi Shyam,

I love your patch! Perhaps something to improve:

[auto build test WARNING on cifs/for-next]
[also build test WARNING on linus/master v6.3-rc2 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
base:   git://git.samba.org/sfrench/cifs-2.6.git for-next
patch link:    https://lore.kernel.org/r/20230310153211.10982-10-sprasad%40microsoft.com
patch subject: [PATCH 10/11] cifs: handle when server stops supporting multichannel
config: i386-randconfig-m021 (https://download.01.org/0day-ci/archive/20230313/202303131424.W2jbbLmc-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303131424.W2jbbLmc-lkp@intel.com/

New smatch warnings:
fs/cifs/sess.c:299 cifs_disable_extra_channels() warn: unsigned '--iface->weight_fulfilled' is never less than zero.

Old smatch warnings:
fs/cifs/sess.c:401 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.
fs/cifs/sess.c:412 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.

vim +299 fs/cifs/sess.c

   276	
   277	/*
   278	 * called when multichannel is disabled by the server
   279	 */
   280	void
   281	cifs_disable_extra_channels(struct cifs_ses *ses)
   282	{
   283		int i, chan_count;
   284		struct cifs_server_iface *iface = NULL;
   285		struct TCP_Server_Info *server = NULL;
   286	
   287		spin_lock(&ses->chan_lock);
   288		chan_count = ses->chan_count;
   289		ses->chan_count = 1;
   290		for (i = 1; i < chan_count; i++) {
   291			iface = ses->chans[i].iface;
   292			server = ses->chans[i].server;
   293			spin_unlock(&ses->chan_lock);
   294	
   295			if (iface) {
   296				spin_lock(&ses->iface_lock);
   297				kref_put(&iface->refcount, release_iface);
   298				iface->num_channels--;
 > 299				if (--iface->weight_fulfilled < 0)
   300					iface->weight_fulfilled = 0;
   301				spin_unlock(&ses->iface_lock);
   302			}
   303			cifs_put_tcp_session(server, 0);
   304	
   305			spin_lock(&ses->chan_lock);
   306			ses->chans[i].iface = NULL;
   307			ses->chans[i].server = NULL;
   308		}
   309		spin_unlock(&ses->chan_lock);
   310	}
   311	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
  2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
                   ` (9 preceding siblings ...)
  2023-03-10 15:32 ` [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces Shyam Prasad N
@ 2023-03-14 22:19 ` Paulo Alcantara
  2023-03-16 10:57   ` Shyam Prasad N
  10 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2023-03-14 22:19 UTC (permalink / raw)
  To: Shyam Prasad N, smfrench, bharathsm.hsk, tom, linux-cifs; +Cc: Shyam Prasad N

Hi Shyam,

Shyam Prasad N <nspmangalore@gmail.com> writes:

> After cifs_tree_connect, tcon status should not be
> set to TID_GOOD. There could still be files that need
> reopen. The status should instead be changed to
> TID_NEED_FILES_INVALIDATE. That way, after reopen of
> files, the status can be changed to TID_GOOD.
>
> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> ---
>  fs/cifs/cifsglob.h |  2 +-
>  fs/cifs/connect.c  | 14 ++++++++++----
>  fs/cifs/dfs.c      | 16 +++++++++++-----
>  fs/cifs/file.c     | 10 +++++-----
>  4 files changed, 27 insertions(+), 15 deletions(-)

With this patch, the status of TID_GOOD is no longer set after
reconnecting tcons.  I've noticed that when the DFS cache refresher
attempted to get a new referral for updating a cached entry but the IPC
tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
skipping the I/O as it thought the IPC tcon was disconnected.

IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
tcons after reconnect.

Besides, could you please split this patch into two?  The first one
would fix the checking of tcon statuses by using the appropriate spin
lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
non-IPC tcons and set the TID_GOOD status afterwards.

Thanks.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
  2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
@ 2023-03-16 10:57   ` Shyam Prasad N
  2023-03-16 20:59     ` Paulo Alcantara
  0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-16 10:57 UTC (permalink / raw)
  To: Paulo Alcantara; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N

[-- Attachment #1: Type: text/plain, Size: 1739 bytes --]

On Wed, Mar 15, 2023 at 3:49 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Hi Shyam,
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > After cifs_tree_connect, tcon status should not be
> > set to TID_GOOD. There could still be files that need
> > reopen. The status should instead be changed to
> > TID_NEED_FILES_INVALIDATE. That way, after reopen of
> > files, the status can be changed to TID_GOOD.
> >
> > Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> > ---
> >  fs/cifs/cifsglob.h |  2 +-
> >  fs/cifs/connect.c  | 14 ++++++++++----
> >  fs/cifs/dfs.c      | 16 +++++++++++-----
> >  fs/cifs/file.c     | 10 +++++-----
> >  4 files changed, 27 insertions(+), 15 deletions(-)
>
> With this patch, the status of TID_GOOD is no longer set after
> reconnecting tcons.  I've noticed that when the DFS cache refresher
> attempted to get a new referral for updating a cached entry but the IPC
> tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
> skipping the I/O as it thought the IPC tcon was disconnected.
>
> IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
> tcons after reconnect.
>
> Besides, could you please split this patch into two?  The first one
> would fix the checking of tcon statuses by using the appropriate spin
> lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
> non-IPC tcons and set the TID_GOOD status afterwards.
>
> Thanks.

Hi Paulo,

Good point. The tcon status were indeed messed up after this patch.
Should have checked it earlier. My bad.

Let me revert this one and include only the checking of tcon status
with the right lock.

Attached the patch for that.

-- 
Regards,
Shyam

[-- Attachment #2: 0001-cifs-check-only-tcon-status-on-tcon-related-function.patch --]
[-- Type: application/octet-stream, Size: 2655 bytes --]

From d41e84175d7956948320bab648bcea625c35fd3d Mon Sep 17 00:00:00 2001
From: Shyam Prasad N <sprasad@microsoft.com>
Date: Thu, 16 Mar 2023 10:45:12 +0000
Subject: [PATCH 01/13] cifs: check only tcon status on tcon related functions

We had a couple of checks for session in cifs_tree_connect
and cifs_mark_open_files_invalid, which were unnecessary.
And that was done with ses_lock. Changed that to tc_lock too.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/connect.c | 10 +++++++---
 fs/cifs/dfs.c     | 10 +++++++---
 fs/cifs/file.c    |  8 ++++----
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5233f14f0636..e249f6fecd26 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4038,9 +4038,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
-	if (tcon->ses->ses_status != SES_GOOD ||
-	    (tcon->status != TID_NEW &&
-	    tcon->status != TID_NEED_TCON)) {
+	if (tcon->status != TID_NEW &&
+	    tcon->status != TID_NEED_TCON) {
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (tcon->status == TID_GOOD) {
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index b64d20374b9c..83bc7e16f3a3 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -479,9 +479,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
-	if (tcon->ses->ses_status != SES_GOOD ||
-	    (tcon->status != TID_NEW &&
-	    tcon->status != TID_NEED_TCON)) {
+	if (tcon->status != TID_NEW &&
+	    tcon->status != TID_NEED_TCON) {
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (tcon->status == TID_GOOD) {
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4d4a2d82636d..6831a9949c43 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,13 +174,13 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
 	struct list_head *tmp1;
 
 	/* only send once per connect */
-	spin_lock(&tcon->ses->ses_lock);
-	if ((tcon->ses->ses_status != SES_GOOD) || (tcon->status != TID_NEED_RECON)) {
-		spin_unlock(&tcon->ses->ses_lock);
+	spin_lock(&tcon->tc_lock);
+	if (tcon->status != TID_NEED_RECON) {
+		spin_unlock(&tcon->tc_lock);
 		return;
 	}
 	tcon->status = TID_IN_FILES_INVALIDATE;
-	spin_unlock(&tcon->ses->ses_lock);
+	spin_unlock(&tcon->tc_lock);
 
 	/* list all files open on tree connection and mark them invalid */
 	spin_lock(&tcon->open_file_lock);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
  2023-03-16 10:57   ` Shyam Prasad N
@ 2023-03-16 20:59     ` Paulo Alcantara
  2023-03-17 10:48       ` Shyam Prasad N
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2023-03-16 20:59 UTC (permalink / raw)
  To: Shyam Prasad N; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N

Shyam Prasad N <nspmangalore@gmail.com> writes:

> On Wed, Mar 15, 2023 at 3:49 AM Paulo Alcantara <pc@manguebit.com> wrote:
>> Shyam Prasad N <nspmangalore@gmail.com> writes:
>>
>> > After cifs_tree_connect, tcon status should not be
>> > set to TID_GOOD. There could still be files that need
>> > reopen. The status should instead be changed to
>> > TID_NEED_FILES_INVALIDATE. That way, after reopen of
>> > files, the status can be changed to TID_GOOD.
>> >
>> > Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
>> > ---
>> >  fs/cifs/cifsglob.h |  2 +-
>> >  fs/cifs/connect.c  | 14 ++++++++++----
>> >  fs/cifs/dfs.c      | 16 +++++++++++-----
>> >  fs/cifs/file.c     | 10 +++++-----
>> >  4 files changed, 27 insertions(+), 15 deletions(-)
>>
>> With this patch, the status of TID_GOOD is no longer set after
>> reconnecting tcons.  I've noticed that when the DFS cache refresher
>> attempted to get a new referral for updating a cached entry but the IPC
>> tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
>> skipping the I/O as it thought the IPC tcon was disconnected.
>>
>> IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
>> tcons after reconnect.
>>
>> Besides, could you please split this patch into two?  The first one
>> would fix the checking of tcon statuses by using the appropriate spin
>> lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
>> non-IPC tcons and set the TID_GOOD status afterwards.
>
> Good point. The tcon status were indeed messed up after this patch.
> Should have checked it earlier. My bad.
>
> Let me revert this one and include only the checking of tcon status
> with the right lock.
>
> Attached the patch for that.

Thanks for the patch!  Looks good to me.

BTW, don't you think we need to get rid of unnecessary ses status check
in __refresh_tcon() as well

        ...
	spin_lock(&ipc->tc_lock);
	if (ses->ses_status != SES_GOOD || ipc->status != TID_GOOD) {

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
  2023-03-16 20:59     ` Paulo Alcantara
@ 2023-03-17 10:48       ` Shyam Prasad N
  2023-03-17 12:35         ` Paulo Alcantara
  0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-17 10:48 UTC (permalink / raw)
  To: Paulo Alcantara; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N

[-- Attachment #1: Type: text/plain, Size: 2273 bytes --]

On Fri, Mar 17, 2023 at 2:29 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > On Wed, Mar 15, 2023 at 3:49 AM Paulo Alcantara <pc@manguebit.com> wrote:
> >> Shyam Prasad N <nspmangalore@gmail.com> writes:
> >>
> >> > After cifs_tree_connect, tcon status should not be
> >> > set to TID_GOOD. There could still be files that need
> >> > reopen. The status should instead be changed to
> >> > TID_NEED_FILES_INVALIDATE. That way, after reopen of
> >> > files, the status can be changed to TID_GOOD.
> >> >
> >> > Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> >> > ---
> >> >  fs/cifs/cifsglob.h |  2 +-
> >> >  fs/cifs/connect.c  | 14 ++++++++++----
> >> >  fs/cifs/dfs.c      | 16 +++++++++++-----
> >> >  fs/cifs/file.c     | 10 +++++-----
> >> >  4 files changed, 27 insertions(+), 15 deletions(-)
> >>
> >> With this patch, the status of TID_GOOD is no longer set after
> >> reconnecting tcons.  I've noticed that when the DFS cache refresher
> >> attempted to get a new referral for updating a cached entry but the IPC
> >> tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
> >> skipping the I/O as it thought the IPC tcon was disconnected.
> >>
> >> IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
> >> tcons after reconnect.
> >>
> >> Besides, could you please split this patch into two?  The first one
> >> would fix the checking of tcon statuses by using the appropriate spin
> >> lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
> >> non-IPC tcons and set the TID_GOOD status afterwards.
> >
> > Good point. The tcon status were indeed messed up after this patch.
> > Should have checked it earlier. My bad.
> >
> > Let me revert this one and include only the checking of tcon status
> > with the right lock.
> >
> > Attached the patch for that.
>
> Thanks for the patch!  Looks good to me.
>
> BTW, don't you think we need to get rid of unnecessary ses status check
> in __refresh_tcon() as well
>
>         ...
>         spin_lock(&ipc->tc_lock);
>         if (ses->ses_status != SES_GOOD || ipc->status != TID_GOOD) {

Good point.
Here's the updated patch.

-- 
Regards,
Shyam

[-- Attachment #2: 0001-cifs-check-only-tcon-status-on-tcon-related-function.patch --]
[-- Type: application/octet-stream, Size: 3195 bytes --]

From 787f4e2afd54c7b5865939b61f661029a4f5f6b9 Mon Sep 17 00:00:00 2001
From: Shyam Prasad N <sprasad@microsoft.com>
Date: Thu, 16 Mar 2023 10:45:12 +0000
Subject: [PATCH 01/13] cifs: check only tcon status on tcon related functions

We had a couple of checks for session in cifs_tree_connect
and cifs_mark_open_files_invalid, which were unnecessary.
And that was done with ses_lock. Changed that to tc_lock too.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
 fs/cifs/connect.c   | 10 +++++++---
 fs/cifs/dfs.c       | 10 +++++++---
 fs/cifs/dfs_cache.c |  2 +-
 fs/cifs/file.c      |  8 ++++----
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5233f14f0636..e249f6fecd26 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4038,9 +4038,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
-	if (tcon->ses->ses_status != SES_GOOD ||
-	    (tcon->status != TID_NEW &&
-	    tcon->status != TID_NEED_TCON)) {
+	if (tcon->status != TID_NEW &&
+	    tcon->status != TID_NEED_TCON) {
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (tcon->status == TID_GOOD) {
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index b64d20374b9c..83bc7e16f3a3 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -479,9 +479,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
 
 	/* only send once per connect */
 	spin_lock(&tcon->tc_lock);
-	if (tcon->ses->ses_status != SES_GOOD ||
-	    (tcon->status != TID_NEW &&
-	    tcon->status != TID_NEED_TCON)) {
+	if (tcon->status != TID_NEW &&
+	    tcon->status != TID_NEED_TCON) {
+		spin_unlock(&tcon->tc_lock);
+		return -EHOSTDOWN;
+	}
+
+	if (tcon->status == TID_GOOD) {
 		spin_unlock(&tcon->tc_lock);
 		return 0;
 	}
diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index ac86bd0ebd63..a03d0209720e 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -1326,7 +1326,7 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	}
 
 	spin_lock(&ipc->tc_lock);
-	if (ses->ses_status != SES_GOOD || ipc->status != TID_GOOD) {
+	if (ipc->status != TID_GOOD) {
 		spin_unlock(&ipc->tc_lock);
 		cifs_dbg(FYI, "%s: skip cache refresh due to disconnected ipc\n", __func__);
 		goto out;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4d4a2d82636d..6831a9949c43 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,13 +174,13 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
 	struct list_head *tmp1;
 
 	/* only send once per connect */
-	spin_lock(&tcon->ses->ses_lock);
-	if ((tcon->ses->ses_status != SES_GOOD) || (tcon->status != TID_NEED_RECON)) {
-		spin_unlock(&tcon->ses->ses_lock);
+	spin_lock(&tcon->tc_lock);
+	if (tcon->status != TID_NEED_RECON) {
+		spin_unlock(&tcon->tc_lock);
 		return;
 	}
 	tcon->status = TID_IN_FILES_INVALIDATE;
-	spin_unlock(&tcon->ses->ses_lock);
+	spin_unlock(&tcon->tc_lock);
 
 	/* list all files open on tree connection and mark them invalid */
 	spin_lock(&tcon->open_file_lock);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
  2023-03-17 10:48       ` Shyam Prasad N
@ 2023-03-17 12:35         ` Paulo Alcantara
  2023-03-17 18:25           ` Steve French
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2023-03-17 12:35 UTC (permalink / raw)
  To: Shyam Prasad N; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N

Shyam Prasad N <nspmangalore@gmail.com> writes:

> Here's the updated patch.

Thanks!

Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
  2023-03-17 12:35         ` Paulo Alcantara
@ 2023-03-17 18:25           ` Steve French
  0 siblings, 0 replies; 31+ messages in thread
From: Steve French @ 2023-03-17 18:25 UTC (permalink / raw)
  To: Paulo Alcantara
  Cc: Shyam Prasad N, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N

updated for-next with this newer version of the patch

On Fri, Mar 17, 2023 at 7:35 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > Here's the updated patch.
>
> Thanks!
>
> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>



-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
@ 2024-02-27 11:16   ` Jan Čermák
  2024-02-27 16:17     ` Shyam Prasad N
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-02-27 11:16 UTC (permalink / raw)
  To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
  Cc: Shyam Prasad N, Stefan Agner

Hi Shyam, everyone,

On 10. 03. 23 16:32, Shyam Prasad N wrote:

> +	if (!ses->iface_count) {
> +		spin_unlock(&ses->iface_lock);
> +		cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
> +		return 0;
> +	}

May I ask why this is now being logged, and what can be tweaked in the 
case that a server does not advertise interfaces? After updating the 
kernel from 6.1 to 6.6 in Home Assistant OS, we got a report [1] of 
these messages appearing, yet so far only from a single attentive user. 
I wonder if we are going to see them more often, and it that case a 
suggestion to users would come handy. If there's not a simple 
resolution, could the verbosity be lowered to FYI, maybe?

Thanks,
Jan


[1] https://github.com/home-assistant/operating-system/issues/3201

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-02-27 11:16   ` Jan Čermák
@ 2024-02-27 16:17     ` Shyam Prasad N
  2024-02-28  9:22       ` Jan Čermák
  0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-02-27 16:17 UTC (permalink / raw)
  To: Jan Čermák
  Cc: smfrench, bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N,
	Stefan Agner

On Tue, Feb 27, 2024 at 4:46 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> Hi Shyam, everyone,
>
> On 10. 03. 23 16:32, Shyam Prasad N wrote:
>
> > +     if (!ses->iface_count) {
> > +             spin_unlock(&ses->iface_lock);
> > +             cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
> > +             return 0;
> > +     }
>
> May I ask why this is now being logged, and what can be tweaked in the
> case that a server does not advertise interfaces? After updating the
> kernel from 6.1 to 6.6 in Home Assistant OS, we got a report [1] of
> these messages appearing, yet so far only from a single attentive user.
> I wonder if we are going to see them more often, and it that case a
> suggestion to users would come handy. If there's not a simple
> resolution, could the verbosity be lowered to FYI, maybe?
>
> Thanks,
> Jan
>
>
> [1] https://github.com/home-assistant/operating-system/issues/3201

Hi Jan,

These messages (in theory) should not show up if either multichannel
or max_channels are not specified mount options.
And if multichannel is enabled, then the server should support the
ioctl to return the server interface list.
These messages are meant to help the user/developer understand why
multiple channels are not getting established to the server, even
after specifying multichannel as a mount option.

The repeating nature of these messages here leads me to also believe
that there's something fishy going on here.
Either the network health is not good, or that there's some bug at play here.

-- 
Regards,
Shyam

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-02-27 16:17     ` Shyam Prasad N
@ 2024-02-28  9:22       ` Jan Čermák
  2024-03-05 14:56         ` Shyam Prasad N
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-02-28  9:22 UTC (permalink / raw)
  To: Shyam Prasad N
  Cc: smfrench, bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N,
	Stefan Agner

Hi Shyam,

On 27. 02. 24 17:17, Shyam Prasad N wrote:
> These messages (in theory) should not show up if either multichannel
> or max_channels are not specified mount options.

That shouldn't be the case here, I checked with the user and he's not 
doing anything fishy himself (like interfering with the standard mount 
utilities), and the userspace tools creating the mounts should not be 
setting any of these options, which I confirmed by asking for his mounts 
list:

//192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs 
(rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
//192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs 
(rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)


Or am I missing anything here?

> The repeating nature of these messages here leads me to also believe
> that there's something fishy going on here.
> Either the network health is not good, or that there's some bug at play here.

Maybe, however I'm not able to reproduce the above behavior yet. But 
there's been so far one more report of this happening, so it's not a 
single isolated case. I appreciate any advice what to look at further.

Cheers,
Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-02-28  9:22       ` Jan Čermák
@ 2024-03-05 14:56         ` Shyam Prasad N
  2024-03-06 15:43           ` Paulo Alcantara
  0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-03-05 14:56 UTC (permalink / raw)
  To: Jan Čermák
  Cc: smfrench, bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N,
	Stefan Agner

On Wed, Feb 28, 2024 at 2:52 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> Hi Shyam,
>

Hi Jan,
Apologies for the delay.

> On 27. 02. 24 17:17, Shyam Prasad N wrote:
> > These messages (in theory) should not show up if either multichannel
> > or max_channels are not specified mount options.
>
> That shouldn't be the case here, I checked with the user and he's not
> doing anything fishy himself (like interfering with the standard mount
> utilities), and the userspace tools creating the mounts should not be
> setting any of these options, which I confirmed by asking for his mounts
> list:
>
> //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs
> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
> //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs
> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)

Hmmm.. That seems like a bug.
Is there any chance that the user is willing to try out if the same
bug reproduces with the latest mainline kernel?

The other option is for us to try with the 6.6 kernel. But without the
steps to repro, it'll just be shots in the dark.
Let me try to go through the code and see if I can spot anything here.

>
>
> Or am I missing anything here?
>
> > The repeating nature of these messages here leads me to also believe
> > that there's something fishy going on here.
> > Either the network health is not good, or that there's some bug at play here.
>
> Maybe, however I'm not able to reproduce the above behavior yet. But
> there's been so far one more report of this happening, so it's not a
> single isolated case. I appreciate any advice what to look at further.

If a user has reproduced this issue, the one thing they can send us is
the ftrace output of cifs events when the issue is being seen.
i.e. something like this:
# trace-cmd start -e cifs
# <now wait for the issue to reproduce>
# trace-cmd stop
# trace-cmd extract > /tmp/outputs.txt
# uname -r >> /tmp/outputs,txt
# cat /proc/fs/cifs/DebugData >> /tmp/outputs,txt
# cat /proc/fs/cifs/Stats >> /tmp/outputs,txt

And then provide the outputs.txt file to us.
Going through that capture can help us understand this better.

>
> Cheers,
> Jan



-- 
Regards,
Shyam

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-03-05 14:56         ` Shyam Prasad N
@ 2024-03-06 15:43           ` Paulo Alcantara
  2024-03-11 10:01             ` Jan Čermák
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2024-03-06 15:43 UTC (permalink / raw)
  To: Shyam Prasad N, Jan Čermák
  Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N, Stefan Agner

Shyam Prasad N <nspmangalore@gmail.com> writes:

> On Wed, Feb 28, 2024 at 2:52 PM Jan Čermák <sairon@sairon.cz> wrote:
>>
>> Hi Shyam,
>>
>
> Hi Jan,
> Apologies for the delay.
>
>> On 27. 02. 24 17:17, Shyam Prasad N wrote:
>> > These messages (in theory) should not show up if either multichannel
>> > or max_channels are not specified mount options.
>>
>> That shouldn't be the case here, I checked with the user and he's not
>> doing anything fishy himself (like interfering with the standard mount
>> utilities), and the userspace tools creating the mounts should not be
>> setting any of these options, which I confirmed by asking for his mounts
>> list:
>>
>> //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs
>> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
>> //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs
>> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
>
> Hmmm.. That seems like a bug.

Yes.  I see a couple of issues here:

(1) cifs_chan_update_iface() seems to be called over reconnect for all
dialect versions and servers that do not support
FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in
some cases and then VFS message will start being flooded on dmesg.

(2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1
mounts, so smb2_reconnect() ends up being called and then call
SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed
with CAP_LARGE_FILES.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-03-06 15:43           ` Paulo Alcantara
@ 2024-03-11 10:01             ` Jan Čermák
  2024-03-11 11:14               ` Shyam Prasad N
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-03-11 10:01 UTC (permalink / raw)
  To: Paulo Alcantara, Shyam Prasad N
  Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N, Stefan Agner

Hi Paulo, Shyam,

On 06. 03. 24 16:43, Paulo Alcantara wrote:
> 
> Yes.  I see a couple of issues here:
> 
> (1) cifs_chan_update_iface() seems to be called over reconnect for all
> dialect versions and servers that do not support
> FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in
> some cases and then VFS message will start being flooded on dmesg.
> 
> (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1
> mounts, so smb2_reconnect() ends up being called and then call
> SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed
> with CAP_LARGE_FILES.

thanks for looking into this! Is there anything else you'll need from me 
or to test by the users who are observing the issue, or do you have 
enough information for a fix?

Regards,
Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-03-11 10:01             ` Jan Čermák
@ 2024-03-11 11:14               ` Shyam Prasad N
  2024-03-12 14:20                 ` Jan Čermák
  0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-03-11 11:14 UTC (permalink / raw)
  To: Jan Čermák
  Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
	Shyam Prasad N, Stefan Agner

On Mon, Mar 11, 2024 at 3:31 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> Hi Paulo, Shyam,
>
> On 06. 03. 24 16:43, Paulo Alcantara wrote:
> >
> > Yes.  I see a couple of issues here:
> >

Thanks for the review, Paulo.

> > (1) cifs_chan_update_iface() seems to be called over reconnect for all
> > dialect versions and servers that do not support
> > FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in
> > some cases and then VFS message will start being flooded on dmesg.

Valid point.

> >
> > (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1
> > mounts, so smb2_reconnect() ends up being called and then call
> > SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed
> > with CAP_LARGE_FILES.

Excellent catch. I did not take this into account.
It is very likely that this is going on here.

Let me submit a patch to check the exact dialect before calling these
functions to make sure we only do it for SMB3+.

>
> thanks for looking into this! Is there anything else you'll need from me
> or to test by the users who are observing the issue, or do you have
> enough information for a fix?
>
> Regards,
> Jan

Thanks for bringing this to our notice, Jan.

-- 
Regards,
Shyam

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-03-11 11:14               ` Shyam Prasad N
@ 2024-03-12 14:20                 ` Jan Čermák
  2024-03-13 10:45                   ` Shyam Prasad N
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-03-12 14:20 UTC (permalink / raw)
  To: Shyam Prasad N
  Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
	Shyam Prasad N, Stefan Agner

On 11. 03. 24 12:14, Shyam Prasad N wrote:
> Let me submit a patch to check the exact dialect before calling these
> functions to make sure we only do it for SMB3+.
> 

FWIW, most of the reports (except the first one with macOS) are users 
running their SMB server on QNAP NAS's (more details are in the GH 
ticket I linked earlier).

Please CC me in the e-mail with the patch as well, I can create a build 
with that patch applied to 6.6.y and ask if someone could try if it 
resolves the issue in their environment.

Cheers,
Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-03-12 14:20                 ` Jan Čermák
@ 2024-03-13 10:45                   ` Shyam Prasad N
  2024-03-26 14:10                     ` Jan Čermák
  0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-03-13 10:45 UTC (permalink / raw)
  To: Jan Čermák
  Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
	Shyam Prasad N, Stefan Agner

On Tue, Mar 12, 2024 at 7:50 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> On 11. 03. 24 12:14, Shyam Prasad N wrote:
> > Let me submit a patch to check the exact dialect before calling these
> > functions to make sure we only do it for SMB3+.
> >
>
> FWIW, most of the reports (except the first one with macOS) are users
> running their SMB server on QNAP NAS's (more details are in the GH
> ticket I linked earlier).
>
> Please CC me in the e-mail with the patch as well, I can create a build
> with that patch applied to 6.6.y and ask if someone could try if it
> resolves the issue in their environment.
>
> Cheers,
> Jan

Just sent two patches for this.
One patch is just to change the log level for this log from VFS -> FYI.
The other one is for the suggestions made by Paulo.
#1 will fix this anyway. I'm curious to know if #2 alone would fix this problem.
Also, please ask for the output of the following command while testing this out:
# cat /proc/fs/cifs/DebugData

-- 
Regards,
Shyam

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
  2024-03-13 10:45                   ` Shyam Prasad N
@ 2024-03-26 14:10                     ` Jan Čermák
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Čermák @ 2024-03-26 14:10 UTC (permalink / raw)
  To: Shyam Prasad N
  Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
	Shyam Prasad N, Stefan Agner

On 13. 03. 24 11:45, Shyam Prasad N wrote:
> Just sent two patches for this.
> One patch is just to change the log level for this log from VFS -> FYI.
> The other one is for the suggestions made by Paulo.
> #1 will fix this anyway. I'm curious to know if #2 alone would fix this problem.
> Also, please ask for the output of the following command while testing this out:
> # cat /proc/fs/cifs/DebugData

One of the users tried testing on the same system with only the second 
patch applied and reports the issue still persists :( He supplied the 
DebugData output only as a screenshot [1], not sure what to look there 
for, but definitely the "Server interfaces" section is missing there 
compared to my setup.

Cheers,
Jan

[1] 
https://github.com/home-assistant/operating-system/issues/3201#issuecomment-2019357433

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2024-03-26 14:10 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
2023-03-10 15:32 ` [PATCH 03/11] cifs: avoid race conditions with parallel reconnects Shyam Prasad N
2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
2023-03-10 22:40   ` Steve French
2023-03-10 15:32 ` [PATCH 05/11] cifs: lock chan_lock outside match_session Shyam Prasad N
2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
2023-03-11  4:51   ` kernel test robot
2023-03-10 15:32 ` [PATCH 07/11] cifs: do not poll server interfaces too regularly Shyam Prasad N
2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
2024-02-27 11:16   ` Jan Čermák
2024-02-27 16:17     ` Shyam Prasad N
2024-02-28  9:22       ` Jan Čermák
2024-03-05 14:56         ` Shyam Prasad N
2024-03-06 15:43           ` Paulo Alcantara
2024-03-11 10:01             ` Jan Čermák
2024-03-11 11:14               ` Shyam Prasad N
2024-03-12 14:20                 ` Jan Čermák
2024-03-13 10:45                   ` Shyam Prasad N
2024-03-26 14:10                     ` Jan Čermák
2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
2023-03-13  5:27   ` kernel test robot
2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
2023-03-13  6:09   ` kernel test robot
2023-03-10 15:32 ` [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces Shyam Prasad N
2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
2023-03-16 10:57   ` Shyam Prasad N
2023-03-16 20:59     ` Paulo Alcantara
2023-03-17 10:48       ` Shyam Prasad N
2023-03-17 12:35         ` Paulo Alcantara
2023-03-17 18:25           ` Steve French

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).