* [PATCH 01/11] cifs: fix tcon status change after tree connect
@ 2023-03-10 15:32 Shyam Prasad N
2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
` (10 more replies)
0 siblings, 11 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
After cifs_tree_connect, tcon status should not be
set to TID_GOOD. There could still be files that need
reopen. The status should instead be changed to
TID_NEED_FILES_INVALIDATE. That way, after reopen of
files, the status can be changed to TID_GOOD.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/cifsglob.h | 2 +-
fs/cifs/connect.c | 14 ++++++++++----
fs/cifs/dfs.c | 16 +++++++++++-----
fs/cifs/file.c | 10 +++++-----
4 files changed, 27 insertions(+), 15 deletions(-)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index a99883f16d94..8a37b1553dc6 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -137,7 +137,7 @@ enum tid_status_enum {
TID_NEED_RECON,
TID_NEED_TCON,
TID_IN_TCON,
- TID_NEED_FILES_INVALIDATE, /* currently unused */
+ TID_NEED_FILES_INVALIDATE,
TID_IN_FILES_INVALIDATE
};
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5233f14f0636..3d07729c91a1 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4038,9 +4038,15 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
/* only send once per connect */
spin_lock(&tcon->tc_lock);
- if (tcon->ses->ses_status != SES_GOOD ||
- (tcon->status != TID_NEW &&
- tcon->status != TID_NEED_TCON)) {
+ if (tcon->status != TID_GOOD &&
+ tcon->status != TID_NEW &&
+ tcon->status != TID_NEED_RECON) {
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (tcon->status == TID_NEED_FILES_INVALIDATE ||
+ tcon->status == TID_GOOD) {
spin_unlock(&tcon->tc_lock);
return 0;
}
@@ -4051,7 +4057,7 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
if (rc) {
spin_lock(&tcon->tc_lock);
if (tcon->status == TID_IN_TCON)
- tcon->status = TID_NEED_TCON;
+ tcon->status = TID_NEED_RECON;
spin_unlock(&tcon->tc_lock);
} else {
spin_lock(&tcon->tc_lock);
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index b64d20374b9c..d37af02902c5 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -479,9 +479,15 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
/* only send once per connect */
spin_lock(&tcon->tc_lock);
- if (tcon->ses->ses_status != SES_GOOD ||
- (tcon->status != TID_NEW &&
- tcon->status != TID_NEED_TCON)) {
+ if (tcon->status != TID_GOOD &&
+ tcon->status != TID_NEW &&
+ tcon->status != TID_NEED_RECON) {
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (tcon->status == TID_NEED_FILES_INVALIDATE ||
+ tcon->status == TID_GOOD) {
spin_unlock(&tcon->tc_lock);
return 0;
}
@@ -529,12 +535,12 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
if (rc) {
spin_lock(&tcon->tc_lock);
if (tcon->status == TID_IN_TCON)
- tcon->status = TID_NEED_TCON;
+ tcon->status = TID_NEED_RECON;
spin_unlock(&tcon->tc_lock);
} else {
spin_lock(&tcon->tc_lock);
if (tcon->status == TID_IN_TCON)
- tcon->status = TID_GOOD;
+ tcon->status = TID_NEED_FILES_INVALIDATE;
spin_unlock(&tcon->tc_lock);
tcon->need_reconnect = false;
}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4d4a2d82636d..96d865e108f4 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,13 +174,13 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
struct list_head *tmp1;
/* only send once per connect */
- spin_lock(&tcon->ses->ses_lock);
- if ((tcon->ses->ses_status != SES_GOOD) || (tcon->status != TID_NEED_RECON)) {
- spin_unlock(&tcon->ses->ses_lock);
+ spin_lock(&tcon->tc_lock);
+ if (tcon->status != TID_NEED_FILES_INVALIDATE) {
+ spin_unlock(&tcon->tc_lock);
return;
}
tcon->status = TID_IN_FILES_INVALIDATE;
- spin_unlock(&tcon->ses->ses_lock);
+ spin_unlock(&tcon->tc_lock);
/* list all files open on tree connection and mark them invalid */
spin_lock(&tcon->open_file_lock);
@@ -194,7 +194,7 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
invalidate_all_cached_dirs(tcon);
spin_lock(&tcon->tc_lock);
if (tcon->status == TID_IN_FILES_INVALIDATE)
- tcon->status = TID_NEED_TCON;
+ tcon->status = TID_GOOD;
spin_unlock(&tcon->tc_lock);
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-10 15:32 ` [PATCH 03/11] cifs: avoid race conditions with parallel reconnects Shyam Prasad N
` (9 subsequent siblings)
10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
Before my changes to how multichannel reconnects work, the
primary channel was always used to do a non-binding session
setup. With my changes, that is not the case anymore.
Missed this place where channel at index 0 was forcibly
updated with the signing key.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/smb2transport.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c
index 381babc1212c..d827b7547ffa 100644
--- a/fs/cifs/smb2transport.c
+++ b/fs/cifs/smb2transport.c
@@ -425,7 +425,7 @@ generate_smb3signingkey(struct cifs_ses *ses,
/* safe to access primary channel, since it will never go away */
spin_lock(&ses->chan_lock);
- memcpy(ses->chans[0].signkey, ses->smb3signingkey,
+ memcpy(ses->chans[chan_index].signkey, ses->smb3signingkey,
SMB3_SIGN_KEY_SIZE);
spin_unlock(&ses->chan_lock);
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 03/11] cifs: avoid race conditions with parallel reconnects
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
` (8 subsequent siblings)
10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
When multiple processes/channels do reconnects in parallel
we used to return success immediately
negotiate/session-setup/tree-connect, causing race conditions
between processes that enter the function in parallel.
This caused several errors related to session not found to
show up during parallel reconnects.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/cifs_debug.c | 5 ++
fs/cifs/cifsglob.h | 14 +++-
fs/cifs/connect.c | 145 ++++++++++++++++++++++++++++++++++++----
fs/cifs/dfs.c | 33 +++++++++
fs/cifs/misc.c | 2 +
fs/cifs/sess.c | 8 ++-
fs/cifs/smb2pdu.c | 35 +++++-----
fs/cifs/smb2transport.c | 17 ++++-
8 files changed, 225 insertions(+), 34 deletions(-)
diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c
index 1911f7016fa1..4391c7aac3cb 100644
--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -406,6 +406,11 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
ses->capabilities, ses->ses_status);
}
+ if (ses->chan_count > 1)
+ seq_printf(m, "\n\tChannel reconnect bitmaps: 0x%lx 0x%lx",
+ ses->chans_need_reconnect,
+ ses->chans_in_reconnect);
+
seq_printf(m, "\n\tSecurity type: %s ",
get_security_type_str(server->ops->select_sectype(server, ses->sectype)));
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 8a37b1553dc6..81ff13e41f97 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -624,6 +624,7 @@ struct TCP_Server_Info {
#ifdef CONFIG_NET_NS
struct net *net;
#endif
+ wait_queue_head_t reconnect_q; /* for handling parallel reconnects */
wait_queue_head_t response_q;
wait_queue_head_t request_q; /* if more than maxmpx to srvr must block*/
spinlock_t mid_lock; /* protect mid queue and it's entries */
@@ -1002,7 +1003,6 @@ iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
}
struct cifs_chan {
- unsigned int in_reconnect : 1; /* if session setup in progress for this channel */
struct TCP_Server_Info *server;
struct cifs_server_iface *iface; /* interface in use */
__u8 signkey[SMB3_SIGN_KEY_SIZE];
@@ -1017,6 +1017,7 @@ struct cifs_ses {
struct list_head tcon_list;
struct cifs_tcon *tcon_ipc;
spinlock_t ses_lock; /* protect anything here that is not protected */
+ wait_queue_head_t reconnect_q; /* for handling parallel reconnects */
struct mutex session_mutex;
struct TCP_Server_Info *server; /* pointer to server info */
int ses_count; /* reference counter */
@@ -1076,7 +1077,9 @@ struct cifs_ses {
#define CIFS_CHAN_NEEDS_RECONNECT(ses, index) \
test_bit((index), &(ses)->chans_need_reconnect)
#define CIFS_CHAN_IN_RECONNECT(ses, index) \
- ((ses)->chans[(index)].in_reconnect)
+ test_bit((index), &(ses)->chans_in_reconnect)
+#define CIFS_ALL_CHANS_IN_RECONNECT(ses) \
+ ((ses)->chans_in_reconnect == CIFS_ALL_CHANNELS_SET(ses))
struct cifs_chan chans[CIFS_MAX_CHANNELS];
size_t chan_count;
@@ -1092,8 +1095,14 @@ struct cifs_ses {
* channels are marked for needing reconnection. This will
* enable the sessions on top to continue to live till any
* of the channels below are active.
+ *
+ * chans_in_reconnect is a bitmap indicating which channels
+ * are in the process of reconnecting. This is needed
+ * to avoid race conditions between processes which
+ * do channel binding in parallel.
*/
unsigned long chans_need_reconnect;
+ unsigned long chans_in_reconnect;
/* ========= end: protected by chan_lock ======== */
struct cifs_ses *dfs_root_ses;
};
@@ -1145,6 +1154,7 @@ struct cifs_tcon {
int tc_count;
struct list_head rlist; /* reconnect list */
spinlock_t tc_lock; /* protect anything here that is not protected */
+ wait_queue_head_t reconnect_q; /* for handling parallel reconnects */
atomic_t num_local_opens; /* num of all opens including disconnected */
atomic_t num_remote_opens; /* num of all network opens on server */
struct list_head openFileList;
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 3d07729c91a1..7b103f69432e 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -212,8 +212,10 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
cifs_chan_update_iface(ses, server);
spin_lock(&ses->chan_lock);
- if (!mark_smb_session && cifs_chan_needs_reconnect(ses, server))
- goto next_session;
+ if (!mark_smb_session && cifs_chan_needs_reconnect(ses, server)) {
+ spin_unlock(&ses->chan_lock);
+ continue;
+ }
if (mark_smb_session)
CIFS_SET_ALL_CHANS_NEED_RECONNECT(ses);
@@ -221,22 +223,28 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
cifs_chan_set_need_reconnect(ses, server);
/* If all channels need reconnect, then tcon needs reconnect */
- if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses))
- goto next_session;
+ if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
+ spin_unlock(&ses->chan_lock);
+ continue;
+ }
+ spin_unlock(&ses->chan_lock);
+ spin_lock(&ses->ses_lock);
ses->ses_status = SES_NEED_RECON;
+ spin_unlock(&ses->ses_lock);
list_for_each_entry(tcon, &ses->tcon_list, tcon_list) {
tcon->need_reconnect = true;
+ spin_lock(&tcon->tc_lock);
tcon->status = TID_NEED_RECON;
+ spin_unlock(&tcon->tc_lock);
}
if (ses->tcon_ipc) {
ses->tcon_ipc->need_reconnect = true;
+ spin_lock(&ses->tcon_ipc->tc_lock);
ses->tcon_ipc->status = TID_NEED_RECON;
+ spin_unlock(&ses->tcon_ipc->tc_lock);
}
-
-next_session:
- spin_unlock(&ses->chan_lock);
}
spin_unlock(&cifs_tcp_ses_lock);
}
@@ -1596,6 +1604,7 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx,
}
init_waitqueue_head(&tcp_ses->response_q);
init_waitqueue_head(&tcp_ses->request_q);
+ init_waitqueue_head(&tcp_ses->reconnect_q);
INIT_LIST_HEAD(&tcp_ses->pending_mid_q);
mutex_init(&tcp_ses->_srv_mutex);
memcpy(tcp_ses->workstation_RFC1001_name,
@@ -3648,17 +3657,55 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
struct TCP_Server_Info *server)
{
int rc = 0;
+ int retries = server->nr_targets;
if (!server->ops->need_neg || !server->ops->negotiate)
return -ENOSYS;
+check_again:
/* only send once per connect */
spin_lock(&server->srv_lock);
- if (!server->ops->need_neg(server) ||
+ if (server->tcpStatus != CifsGood &&
+ server->tcpStatus != CifsNew &&
server->tcpStatus != CifsNeedNegotiate) {
+ spin_unlock(&server->srv_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (!server->ops->need_neg(server) &&
+ server->tcpStatus == CifsGood) {
spin_unlock(&server->srv_lock);
return 0;
}
+
+ /* another process is in the processs of negotiating */
+ while (server->tcpStatus == CifsInNegotiate) {
+ spin_unlock(&server->srv_lock);
+ rc = wait_event_interruptible_timeout(server->reconnect_q,
+ (server->tcpStatus != CifsInNegotiate),
+ HZ);
+ if (rc < 0) {
+ cifs_dbg(FYI, "%s: aborting negotiate due to a received signal by the process\n",
+ __func__);
+ return -ERESTARTSYS;
+ }
+ spin_lock(&server->srv_lock);
+
+ /* are we still waiting for others */
+ if (server->tcpStatus != CifsInNegotiate) {
+ spin_unlock(&server->srv_lock);
+ goto check_again;
+ }
+
+ if (retries && --retries)
+ continue;
+
+ cifs_dbg(FYI, "gave up waiting on CifsInNegotiate\n");
+ spin_unlock(&server->srv_lock);
+ return -EHOSTDOWN;
+ }
+
+ /* now mark the server so that others don't reach here */
server->tcpStatus = CifsInNegotiate;
spin_unlock(&server->srv_lock);
@@ -3676,6 +3723,7 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
server->tcpStatus = CifsNeedNegotiate;
spin_unlock(&server->srv_lock);
}
+ wake_up(&server->reconnect_q);
return rc;
}
@@ -3690,25 +3738,63 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
struct sockaddr_in6 *addr6 = (struct sockaddr_in6 *)&pserver->dstaddr;
struct sockaddr_in *addr = (struct sockaddr_in *)&pserver->dstaddr;
bool is_binding = false;
+ int retries;
+
+check_again:
+ retries = 5;
spin_lock(&ses->ses_lock);
if (ses->ses_status != SES_GOOD &&
ses->ses_status != SES_NEW &&
ses->ses_status != SES_NEED_RECON) {
spin_unlock(&ses->ses_lock);
- return 0;
+ return -EHOSTDOWN;
}
/* only send once per connect */
spin_lock(&ses->chan_lock);
- if (CIFS_ALL_CHANS_GOOD(ses) ||
- cifs_chan_in_reconnect(ses, server)) {
+ if (CIFS_ALL_CHANS_GOOD(ses)) {
+ if (ses->ses_status == SES_NEED_RECON)
+ ses->ses_status = SES_GOOD;
spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
return 0;
}
- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
+
+ /* another process is in the processs of sess setup */
+ while (cifs_chan_in_reconnect(ses, server)) {
+ spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
+ rc = wait_event_interruptible_timeout(ses->reconnect_q,
+ (!cifs_chan_in_reconnect(ses, server)),
+ HZ);
+ if (rc < 0) {
+ cifs_dbg(FYI, "%s: aborting sess setup due to a received signal by the process\n",
+ __func__);
+ return -ERESTARTSYS;
+ }
+ spin_lock(&ses->ses_lock);
+ spin_lock(&ses->chan_lock);
+
+ /* are we still trying to reconnect? */
+ if (!cifs_chan_in_reconnect(ses, server)) {
+ spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
+ goto check_again;
+ }
+
+ if (retries && --retries)
+ continue;
+
+ cifs_dbg(FYI, "gave up waiting on cifs_chan_in_reconnect\n");
+ spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
+ return -EHOSTDOWN;
+ }
+
+ /* now mark the session so that others don't reach here */
cifs_chan_set_in_reconnect(ses, server);
+ is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
spin_unlock(&ses->chan_lock);
if (!is_binding)
@@ -3762,6 +3848,7 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
}
+ wake_up(&ses->reconnect_q);
return rc;
}
@@ -4035,6 +4122,10 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
{
int rc;
const struct smb_version_operations *ops = tcon->ses->server->ops;
+ int retries;
+
+check_again:
+ retries = 5;
/* only send once per connect */
spin_lock(&tcon->tc_lock);
@@ -4050,6 +4141,35 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
spin_unlock(&tcon->tc_lock);
return 0;
}
+
+ /* another process is in the processs of negotiating */
+ while (tcon->status == TID_IN_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ rc = wait_event_interruptible_timeout(tcon->reconnect_q,
+ (tcon->status != TID_IN_TCON),
+ HZ);
+ if (rc < 0) {
+ cifs_dbg(FYI, "%s: aborting tree connect due to a received signal by the process\n",
+ __func__);
+ return -ERESTARTSYS;
+ }
+ spin_lock(&tcon->tc_lock);
+
+ /* are we still trying to reconnect? */
+ if (tcon->status != TID_IN_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ goto check_again;
+ }
+
+ if (retries && --retries)
+ continue;
+
+ cifs_dbg(FYI, "gave up waiting on TID_IN_TCON\n");
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ /* now mark the tcon so that others don't reach here */
tcon->status = TID_IN_TCON;
spin_unlock(&tcon->tc_lock);
@@ -4066,6 +4186,7 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
tcon->need_reconnect = false;
spin_unlock(&tcon->tc_lock);
}
+ wake_up(&tcon->reconnect_q);
return rc;
}
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index d37af02902c5..013a399088c3 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -476,6 +476,10 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
char *tree;
struct dfs_info3_param ref = {0};
+ int retries;
+
+check_again:
+ retries = 5;
/* only send once per connect */
spin_lock(&tcon->tc_lock);
@@ -491,6 +495,35 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
spin_unlock(&tcon->tc_lock);
return 0;
}
+
+ /* another process is in the processs of negotiating */
+ while (tcon->status == TID_IN_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ rc = wait_event_interruptible_timeout(tcon->reconnect_q,
+ (tcon->status != TID_IN_TCON),
+ HZ);
+ if (rc < 0) {
+ cifs_dbg(FYI, "%s: aborting tree connect due to a received signal by the process\n",
+ __func__);
+ return -ERESTARTSYS;
+ }
+ spin_lock(&tcon->tc_lock);
+
+ /* are we still trying to reconnect? */
+ if (tcon->status != TID_IN_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ goto check_again;
+ }
+
+ if (retries && --retries)
+ continue;
+
+ cifs_dbg(FYI, "gave up waiting on TID_IN_TCON\n");
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ /* now mark the tcon so that others don't reach here */
tcon->status = TID_IN_TCON;
spin_unlock(&tcon->tc_lock);
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index a0d286ee723d..5a974689fde9 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -80,6 +80,7 @@ sesInfoAlloc(void)
spin_lock_init(&ret_buf->iface_lock);
INIT_LIST_HEAD(&ret_buf->iface_list);
spin_lock_init(&ret_buf->chan_lock);
+ init_waitqueue_head(&ret_buf->reconnect_q);
}
return ret_buf;
}
@@ -134,6 +135,7 @@ tconInfoAlloc(void)
spin_lock_init(&ret_buf->stat_lock);
atomic_set(&ret_buf->num_local_opens, 0);
atomic_set(&ret_buf->num_remote_opens, 0);
+ init_waitqueue_head(&ret_buf->reconnect_q);
return ret_buf;
}
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index d2cbae4b5d21..b8bfebe4498e 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -94,7 +94,9 @@ cifs_chan_set_in_reconnect(struct cifs_ses *ses,
{
unsigned int chan_index = cifs_ses_get_chan_index(ses, server);
- ses->chans[chan_index].in_reconnect = true;
+ set_bit(chan_index, &ses->chans_in_reconnect);
+ cifs_dbg(FYI, "Set in-reconnect bitmask for chan %u; now 0x%lx\n",
+ chan_index, ses->chans_in_reconnect);
}
void
@@ -103,7 +105,9 @@ cifs_chan_clear_in_reconnect(struct cifs_ses *ses,
{
unsigned int chan_index = cifs_ses_get_chan_index(ses, server);
- ses->chans[chan_index].in_reconnect = false;
+ clear_bit(chan_index, &ses->chans_in_reconnect);
+ cifs_dbg(FYI, "Cleared in-reconnect bitmask for chan %u; now 0x%lx\n",
+ chan_index, ses->chans_in_reconnect);
}
bool
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 0e53265e1462..52318a79c848 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -215,8 +215,9 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
return 0;
}
spin_unlock(&ses->chan_lock);
- cifs_dbg(FYI, "sess reconnect mask: 0x%lx, tcon reconnect: %d",
+ cifs_dbg(FYI, "sess reconnect masks: 0x%lx 0x%lx, tcon reconnect: %d",
tcon->ses->chans_need_reconnect,
+ tcon->ses->chans_in_reconnect,
tcon->need_reconnect);
nls_codepage = load_nls_default();
@@ -238,9 +239,12 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
* need to prevent multiple threads trying to simultaneously
* reconnect the same SMB session
*/
+ spin_lock(&ses->ses_lock);
spin_lock(&ses->chan_lock);
- if (!cifs_chan_needs_reconnect(ses, server)) {
+ if (!cifs_chan_needs_reconnect(ses, server) &&
+ ses->ses_status == SES_GOOD) {
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
/* this means that we only need to tree connect */
if (tcon->need_reconnect)
@@ -249,6 +253,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
goto out;
}
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
mutex_lock(&ses->session_mutex);
rc = cifs_negotiate_protocol(0, ses, server);
@@ -284,7 +289,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
cifs_dbg(FYI, "reconnect tcon rc = %d\n", rc);
if (rc) {
/* If sess reconnected but tcon didn't, something strange ... */
- pr_warn_once("reconnect tcon failed rc = %d\n", rc);
+ cifs_dbg(VFS, "reconnect tcon failed rc = %d\n", rc);
goto out;
}
@@ -1256,9 +1261,9 @@ SMB2_sess_alloc_buffer(struct SMB2_sess_data *sess_data)
if (rc)
return rc;
- spin_lock(&ses->chan_lock);
- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
- spin_unlock(&ses->chan_lock);
+ spin_lock(&ses->ses_lock);
+ is_binding = (ses->ses_status == SES_GOOD);
+ spin_unlock(&ses->ses_lock);
if (is_binding) {
req->hdr.SessionId = cpu_to_le64(ses->Suid);
@@ -1416,9 +1421,9 @@ SMB2_auth_kerberos(struct SMB2_sess_data *sess_data)
goto out_put_spnego_key;
}
- spin_lock(&ses->chan_lock);
- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
- spin_unlock(&ses->chan_lock);
+ spin_lock(&ses->ses_lock);
+ is_binding = (ses->ses_status == SES_GOOD);
+ spin_unlock(&ses->ses_lock);
/* keep session key if binding */
if (!is_binding) {
@@ -1542,9 +1547,9 @@ SMB2_sess_auth_rawntlmssp_negotiate(struct SMB2_sess_data *sess_data)
cifs_dbg(FYI, "rawntlmssp session setup challenge phase\n");
- spin_lock(&ses->chan_lock);
- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
- spin_unlock(&ses->chan_lock);
+ spin_lock(&ses->ses_lock);
+ is_binding = (ses->ses_status == SES_GOOD);
+ spin_unlock(&ses->ses_lock);
/* keep existing ses id and flags if binding */
if (!is_binding) {
@@ -1610,9 +1615,9 @@ SMB2_sess_auth_rawntlmssp_authenticate(struct SMB2_sess_data *sess_data)
rsp = (struct smb2_sess_setup_rsp *)sess_data->iov[0].iov_base;
- spin_lock(&ses->chan_lock);
- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
- spin_unlock(&ses->chan_lock);
+ spin_lock(&ses->ses_lock);
+ is_binding = (ses->ses_status == SES_GOOD);
+ spin_unlock(&ses->ses_lock);
/* keep existing ses id and flags if binding */
if (!is_binding) {
diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c
index d827b7547ffa..790acf65a092 100644
--- a/fs/cifs/smb2transport.c
+++ b/fs/cifs/smb2transport.c
@@ -81,6 +81,7 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
struct cifs_ses *ses = NULL;
int i;
int rc = 0;
+ bool is_binding = false;
spin_lock(&cifs_tcp_ses_lock);
@@ -97,9 +98,12 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
goto out;
found:
+ spin_lock(&ses->ses_lock);
spin_lock(&ses->chan_lock);
- if (cifs_chan_needs_reconnect(ses, server) &&
- !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
+
+ is_binding = (cifs_chan_needs_reconnect(ses, server) &&
+ ses->ses_status == SES_GOOD);
+ if (is_binding) {
/*
* If we are in the process of binding a new channel
* to an existing session, use the master connection
@@ -107,6 +111,7 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
*/
memcpy(key, ses->smb3signingkey, SMB3_SIGN_KEY_SIZE);
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
goto out;
}
@@ -119,10 +124,12 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key)
if (chan->server == server) {
memcpy(key, chan->signkey, SMB3_SIGN_KEY_SIZE);
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
goto out;
}
}
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
cifs_dbg(VFS,
"%s: Could not find channel signing key for session 0x%llx\n",
@@ -392,11 +399,15 @@ generate_smb3signingkey(struct cifs_ses *ses,
bool is_binding = false;
int chan_index = 0;
+ spin_lock(&ses->ses_lock);
spin_lock(&ses->chan_lock);
- is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses);
+ is_binding = (cifs_chan_needs_reconnect(ses, server) &&
+ ses->ses_status == SES_GOOD);
+
chan_index = cifs_ses_get_chan_index(ses, server);
/* TODO: introduce ref counting for channels when the can be freed */
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
/*
* All channels use the same encryption/decryption keys but
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 04/11] cifs: serialize channel reconnects
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
2023-03-10 15:32 ` [PATCH 03/11] cifs: avoid race conditions with parallel reconnects Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-10 22:40 ` Steve French
2023-03-10 15:32 ` [PATCH 05/11] cifs: lock chan_lock outside match_session Shyam Prasad N
` (7 subsequent siblings)
10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
Parallel session reconnects are prone to race conditions
that are difficult to avoid cleanly. The changes so far do
ensure that parallel reconnects eventually go through.
But that can take multiple session setups on the same channel.
Avoiding that by serializing the session setups on parallel
channels. In doing so, we should avoid such issues.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/connect.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 7b103f69432e..4ea1e51c3fa5 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -222,6 +222,11 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
else
cifs_chan_set_need_reconnect(ses, server);
+ cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
+ __func__,
+ ses->chans_need_reconnect,
+ ses->chans_in_reconnect);
+
/* If all channels need reconnect, then tcon needs reconnect */
if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
spin_unlock(&ses->chan_lock);
@@ -3744,6 +3749,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
retries = 5;
spin_lock(&ses->ses_lock);
+ cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
+ __func__,
+ ses->chans_need_reconnect,
+ ses->chans_in_reconnect);
+
if (ses->ses_status != SES_GOOD &&
ses->ses_status != SES_NEW &&
ses->ses_status != SES_NEED_RECON) {
@@ -3762,11 +3772,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
}
/* another process is in the processs of sess setup */
- while (cifs_chan_in_reconnect(ses, server)) {
+ while (ses->chans_in_reconnect) {
spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
rc = wait_event_interruptible_timeout(ses->reconnect_q,
- (!cifs_chan_in_reconnect(ses, server)),
+ (!ses->chans_in_reconnect),
HZ);
if (rc < 0) {
cifs_dbg(FYI, "%s: aborting sess setup due to a received signal by the process\n",
@@ -3776,8 +3786,8 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
spin_lock(&ses->ses_lock);
spin_lock(&ses->chan_lock);
- /* are we still trying to reconnect? */
- if (!cifs_chan_in_reconnect(ses, server)) {
+ /* did the bitmask change? */
+ if (!ses->chans_in_reconnect) {
spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
goto check_again;
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 05/11] cifs: lock chan_lock outside match_session
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (2 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
` (6 subsequent siblings)
10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
Coverity had rightly indicated a possible deadlock
due to chan_lock being done inside match_session.
All callers of match_* functions should pick up the
necessary locks and call them.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/connect.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 4ea1e51c3fa5..fb9d9994df09 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1735,7 +1735,7 @@ cifs_get_tcp_session(struct smb3_fs_context *ctx,
return ERR_PTR(rc);
}
-/* this function must be called with ses_lock held */
+/* this function must be called with ses_lock and chan_lock held */
static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
{
if (ctx->sectype != Unspecified &&
@@ -1746,12 +1746,8 @@ static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
* If an existing session is limited to less channels than
* requested, it should not be reused
*/
- spin_lock(&ses->chan_lock);
- if (ses->chan_max < ctx->max_channels) {
- spin_unlock(&ses->chan_lock);
+ if (ses->chan_max < ctx->max_channels)
return 0;
- }
- spin_unlock(&ses->chan_lock);
switch (ses->sectype) {
case Kerberos:
@@ -1879,10 +1875,13 @@ cifs_find_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
spin_unlock(&ses->ses_lock);
continue;
}
+ spin_lock(&ses->chan_lock);
if (!match_session(ses, ctx)) {
+ spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
continue;
}
+ spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
++ses->ses_count;
@@ -2706,6 +2705,7 @@ cifs_match_super(struct super_block *sb, void *data)
spin_lock(&tcp_srv->srv_lock);
spin_lock(&ses->ses_lock);
+ spin_lock(&ses->chan_lock);
spin_lock(&tcon->tc_lock);
if (!match_server(tcp_srv, ctx, dfs_super_cmp) ||
!match_session(ses, ctx) ||
@@ -2718,6 +2718,7 @@ cifs_match_super(struct super_block *sb, void *data)
rc = compare_mount_options(sb, mnt_data);
out:
spin_unlock(&tcon->tc_lock);
+ spin_unlock(&ses->chan_lock);
spin_unlock(&ses->ses_lock);
spin_unlock(&tcp_srv->srv_lock);
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (3 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 05/11] cifs: lock chan_lock outside match_session Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-11 4:51 ` kernel test robot
2023-03-10 15:32 ` [PATCH 07/11] cifs: do not poll server interfaces too regularly Shyam Prasad N
` (5 subsequent siblings)
10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
iface_cmp used to simply do a memcmp of the two
provided struct sockaddrs. The comparison needs to do more
based on the address family. Similar logic was already
present in cifs_match_ipaddr. Doing something similar now.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/cifsglob.h | 37 ---------------------------------
fs/cifs/cifsproto.h | 1 +
fs/cifs/connect.c | 50 +++++++++++++++++++++++++++++++++++++++++++++
fs/cifs/smb2ops.c | 37 +++++++++++++++++++++++++++++++++
4 files changed, 88 insertions(+), 37 deletions(-)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index 81ff13e41f97..a11e7b10f607 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -965,43 +965,6 @@ release_iface(struct kref *ref)
kfree(iface);
}
-/*
- * compare two interfaces a and b
- * return 0 if everything matches.
- * return 1 if a has higher link speed, or rdma capable, or rss capable
- * return -1 otherwise.
- */
-static inline int
-iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
-{
- int cmp_ret = 0;
-
- WARN_ON(!a || !b);
- if (a->speed == b->speed) {
- if (a->rdma_capable == b->rdma_capable) {
- if (a->rss_capable == b->rss_capable) {
- cmp_ret = memcmp(&a->sockaddr, &b->sockaddr,
- sizeof(a->sockaddr));
- if (!cmp_ret)
- return 0;
- else if (cmp_ret > 0)
- return 1;
- else
- return -1;
- } else if (a->rss_capable > b->rss_capable)
- return 1;
- else
- return -1;
- } else if (a->rdma_capable > b->rdma_capable)
- return 1;
- else
- return -1;
- } else if (a->speed > b->speed)
- return 1;
- else
- return -1;
-}
-
struct cifs_chan {
struct TCP_Server_Info *server;
struct cifs_server_iface *iface; /* interface in use */
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index e2eff66eefab..30fd81268eb7 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -86,6 +86,7 @@ extern int cifs_handle_standard(struct TCP_Server_Info *server,
struct mid_q_entry *mid);
extern int smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx);
extern int smb3_parse_opt(const char *options, const char *key, char **val);
+extern int cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs);
extern bool cifs_match_ipaddr(struct sockaddr *srcaddr, struct sockaddr *rhs);
extern int cifs_discard_remaining_data(struct TCP_Server_Info *server);
extern int cifs_call_async(struct TCP_Server_Info *server,
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index fb9d9994df09..b9af60417194 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1289,6 +1289,56 @@ cifs_demultiplex_thread(void *p)
module_put_and_kthread_exit(0);
}
+int
+cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs)
+{
+ switch (srcaddr->sa_family) {
+ case AF_UNSPEC:
+ switch (rhs->sa_family) {
+ case AF_UNSPEC:
+ return 0;
+ case AF_INET:
+ case AF_INET6:
+ return 1;
+ default:
+ return -1;
+ }
+ case AF_INET: {
+ switch (rhs->sa_family) {
+ case AF_UNSPEC:
+ return -1;
+ case AF_INET:
+ struct sockaddr_in *saddr4 = (struct sockaddr_in *)srcaddr;
+ struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
+ return memcmp(&saddr4->sin_addr.s_addr,
+ &vaddr4->sin_addr.s_addr,
+ sizeof(struct sockaddr_in));
+ case AF_INET6:
+ return 1;
+ default:
+ return -1;
+ }
+ }
+ case AF_INET6: {
+ switch (rhs->sa_family) {
+ case AF_UNSPEC:
+ case AF_INET:
+ return -1;
+ case AF_INET6:
+ struct sockaddr_in6 *saddr6 = (struct sockaddr_in6 *)srcaddr;
+ struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
+ return memcmp(&saddr6->sin6_addr,
+ &vaddr6->sin6_addr,
+ sizeof(struct sockaddr_in6));
+ default:
+ return -1;
+ }
+ }
+ default:
+ return -1; /* don't expect to be here */
+ }
+}
+
/*
* Returns true if srcaddr isn't specified and rhs isn't specified, or
* if srcaddr is specified and matches the IP address of the rhs argument
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 6dfb865ee9d7..0627d5e38236 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -510,6 +510,43 @@ smb3_negotiate_rsize(struct cifs_tcon *tcon, struct smb3_fs_context *ctx)
return rsize;
}
+/*
+ * compare two interfaces a and b
+ * return 0 if everything matches.
+ * return 1 if a is rdma capable, or rss capable, or has higher link speed
+ * return -1 otherwise.
+ */
+static int
+iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
+{
+ int cmp_ret = 0;
+
+ WARN_ON(!a || !b);
+ if (a->rdma_capable == b->rdma_capable) {
+ if (a->rss_capable == b->rss_capable) {
+ if (a->speed == b->speed) {
+ cmp_ret = cifs_ipaddr_cmp((struct sockaddr *) &a->sockaddr,
+ (struct sockaddr *) &b->sockaddr);
+ if (!cmp_ret)
+ return 0;
+ else if (cmp_ret > 0)
+ return 1;
+ else
+ return -1;
+ } else if (a->speed > b->speed)
+ return 1;
+ else
+ return -1;
+ } else if (a->rss_capable > b->rss_capable)
+ return 1;
+ else
+ return -1;
+ } else if (a->rdma_capable > b->rdma_capable)
+ return 1;
+ else
+ return -1;
+}
+
static int
parse_server_interfaces(struct network_interface_info_ioctl_rsp *buf,
size_t buf_len, struct cifs_ses *ses, bool in_mount)
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 07/11] cifs: do not poll server interfaces too regularly
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (4 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
` (4 subsequent siblings)
10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
We have the server interface list hanging off the tcon
structure today for reasons unknown. So each tcon which is
connected to a file server can query them separately,
which is really unnecessary. To avoid this, in the query
function, we will check the time of last update of the
interface list, and avoid querying the server if it is
within a certain range.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/smb2ops.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index 0627d5e38236..c342a1db33ed 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -567,6 +567,14 @@ parse_server_interfaces(struct network_interface_info_ioctl_rsp *buf,
p = buf;
spin_lock(&ses->iface_lock);
+ /* do not query too frequently, this time with lock held */
+ if (ses->iface_last_update &&
+ time_before(jiffies, ses->iface_last_update +
+ (SMB_INTERFACE_POLL_INTERVAL * HZ))) {
+ spin_unlock(&ses->iface_lock);
+ return 0;
+ }
+
/*
* Go through iface_list and do kref_put to remove
* any unused ifaces. ifaces in use will be removed
@@ -733,6 +741,12 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
struct network_interface_info_ioctl_rsp *out_buf = NULL;
struct cifs_ses *ses = tcon->ses;
+ /* do not query too frequently */
+ if (ses->iface_last_update &&
+ time_before(jiffies, ses->iface_last_update +
+ (SMB_INTERFACE_POLL_INTERVAL * HZ)))
+ return 0;
+
rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
FSCTL_QUERY_NETWORK_INTERFACE_INFO,
NULL /* no data input */, 0 /* no data input */,
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (5 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 07/11] cifs: do not poll server interfaces too regularly Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2024-02-27 11:16 ` Jan Čermák
2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
` (3 subsequent siblings)
10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
Today, if the server interfaces RSS capable, we simply
choose the fastest interface to setup a channel. This is not
a scalable approach, and does not make a lot of attempt to
distribute the connections.
This change does a weighted distribution of channels across
all the available server interfaces, where the weight is
a function of the advertised interface speed.
Also make sure that we don't mix rdma and non-rdma for channels.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/cifs_debug.c | 16 +++++++++++
fs/cifs/cifsglob.h | 2 ++
fs/cifs/sess.c | 67 +++++++++++++++++++++++++++++++++++---------
3 files changed, 71 insertions(+), 14 deletions(-)
diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c
index 4391c7aac3cb..cee3af02e2c3 100644
--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -219,6 +219,8 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
struct cifs_ses *ses;
struct cifs_tcon *tcon;
struct cifs_server_iface *iface;
+ size_t iface_weight = 0, iface_min_speed = 0;
+ struct cifs_server_iface *last_iface = NULL;
int c, i, j;
seq_puts(m,
@@ -465,11 +467,25 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
"\tLast updated: %lu seconds ago",
ses->iface_count,
(jiffies - ses->iface_last_update) / HZ);
+
+ last_iface = list_last_entry(&ses->iface_list,
+ struct cifs_server_iface,
+ iface_head);
+ iface_min_speed = last_iface->speed;
+
j = 0;
list_for_each_entry(iface, &ses->iface_list,
iface_head) {
seq_printf(m, "\n\t%d)", ++j);
cifs_dump_iface(m, iface);
+
+ iface_weight = iface->speed / iface_min_speed;
+ seq_printf(m, "\t\tWeight (cur,total): (%zu,%zu)"
+ "\n\t\tAllocated channels: %u\n",
+ iface->weight_fulfilled,
+ iface_weight,
+ iface->num_channels);
+
if (is_ses_using_iface(ses, iface))
seq_puts(m, "\t\t[CONNECTED]\n");
}
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index a11e7b10f607..e3ba5c979832 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -948,6 +948,8 @@ struct cifs_server_iface {
struct list_head iface_head;
struct kref refcount;
size_t speed;
+ size_t weight_fulfilled;
+ unsigned int num_channels;
unsigned int rdma_capable : 1;
unsigned int rss_capable : 1;
unsigned int is_active : 1; /* unset if non existent */
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index b8bfebe4498e..78a7cfa75e91 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -167,7 +167,9 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
int left;
int rc = 0;
int tries = 0;
+ size_t iface_weight = 0, iface_min_speed = 0;
struct cifs_server_iface *iface = NULL, *niface = NULL;
+ struct cifs_server_iface *last_iface = NULL;
spin_lock(&ses->chan_lock);
@@ -196,21 +198,11 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
}
spin_unlock(&ses->chan_lock);
- /*
- * Keep connecting to same, fastest, iface for all channels as
- * long as its RSS. Try next fastest one if not RSS or channel
- * creation fails.
- */
- spin_lock(&ses->iface_lock);
- iface = list_first_entry(&ses->iface_list, struct cifs_server_iface,
- iface_head);
- spin_unlock(&ses->iface_lock);
-
while (left > 0) {
tries++;
if (tries > 3*ses->chan_max) {
- cifs_dbg(FYI, "too many channel open attempts (%d channels left to open)\n",
+ cifs_dbg(VFS, "too many channel open attempts (%d channels left to open)\n",
left);
break;
}
@@ -218,17 +210,34 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
spin_lock(&ses->iface_lock);
if (!ses->iface_count) {
spin_unlock(&ses->iface_lock);
+ cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
break;
}
+ if (!iface)
+ iface = list_first_entry(&ses->iface_list, struct cifs_server_iface,
+ iface_head);
+ last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface,
+ iface_head);
+ iface_min_speed = last_iface->speed;
+
list_for_each_entry_safe_from(iface, niface, &ses->iface_list,
iface_head) {
+ /* do not mix rdma and non-rdma interfaces */
+ if (iface->rdma_capable != ses->server->rdma)
+ continue;
+
/* skip ifaces that are unusable */
if (!iface->is_active ||
(is_ses_using_iface(ses, iface) &&
- !iface->rss_capable)) {
+ !iface->rss_capable))
+ continue;
+
+ /* check if we already allocated enough channels */
+ iface_weight = iface->speed / iface_min_speed;
+
+ if (iface->weight_fulfilled >= iface_weight)
continue;
- }
/* take ref before unlock */
kref_get(&iface->refcount);
@@ -245,10 +254,17 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
continue;
}
- cifs_dbg(FYI, "successfully opened new channel on iface:%pIS\n",
+ iface->num_channels++;
+ iface->weight_fulfilled++;
+ cifs_dbg(VFS, "successfully opened new channel on iface:%pIS\n",
&iface->sockaddr);
break;
}
+
+ /* reached end of list. reset weight_fulfilled */
+ if (list_entry_is_head(iface, &ses->iface_list, iface_head))
+ list_for_each_entry(iface, &ses->iface_list, iface_head)
+ iface->weight_fulfilled = 0;
spin_unlock(&ses->iface_lock);
left--;
@@ -267,8 +283,10 @@ int
cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
{
unsigned int chan_index;
+ size_t iface_weight = 0, iface_min_speed = 0;
struct cifs_server_iface *iface = NULL;
struct cifs_server_iface *old_iface = NULL;
+ struct cifs_server_iface *last_iface = NULL;
int rc = 0;
spin_lock(&ses->chan_lock);
@@ -288,13 +306,34 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
spin_unlock(&ses->chan_lock);
spin_lock(&ses->iface_lock);
+ if (!ses->iface_count) {
+ spin_unlock(&ses->iface_lock);
+ cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
+ return 0;
+ }
+
+ last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface,
+ iface_head);
+ iface_min_speed = last_iface->speed;
+
/* then look for a new one */
list_for_each_entry(iface, &ses->iface_list, iface_head) {
+ /* do not mix rdma and non-rdma interfaces */
+ if (iface->rdma_capable != server->rdma)
+ continue;
+
if (!iface->is_active ||
(is_ses_using_iface(ses, iface) &&
!iface->rss_capable)) {
continue;
}
+
+ /* check if we already allocated enough channels */
+ iface_weight = iface->speed / iface_min_speed;
+
+ if (iface->weight_fulfilled >= iface_weight)
+ continue;
+
kref_get(&iface->refcount);
break;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 09/11] cifs: account for primary channel in the interface list
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (6 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-13 5:27 ` kernel test robot
2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
` (2 subsequent siblings)
10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
The refcounting of server interfaces should account
for the primary channel too. Although this is not
strictly necessary, doing so will account for the primary
channel in DebugData.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/sess.c | 40 +++++++++++++++++++++++++++++++++++-----
fs/cifs/smb2ops.c | 6 ++++++
2 files changed, 41 insertions(+), 5 deletions(-)
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index 78a7cfa75e91..9b51b2309e9c 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -291,11 +291,6 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
spin_lock(&ses->chan_lock);
chan_index = cifs_ses_get_chan_index(ses, server);
- if (!chan_index) {
- spin_unlock(&ses->chan_lock);
- return 0;
- }
-
if (ses->chans[chan_index].iface) {
old_iface = ses->chans[chan_index].iface;
if (old_iface->is_active) {
@@ -318,6 +313,16 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
/* then look for a new one */
list_for_each_entry(iface, &ses->iface_list, iface_head) {
+ if (!chan_index) {
+ /* if we're trying to get the updated iface for primary channel */
+ if (!cifs_match_ipaddr((struct sockaddr *) &server->dstaddr,
+ (struct sockaddr *) &iface->sockaddr))
+ continue;
+
+ kref_get(&iface->refcount);
+ break;
+ }
+
/* do not mix rdma and non-rdma interfaces */
if (iface->rdma_capable != server->rdma)
continue;
@@ -344,16 +349,41 @@ cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
cifs_dbg(FYI, "unable to find a suitable iface\n");
}
+ if (!chan_index && !iface) {
+ cifs_dbg(VFS, "unable to get the interface matching: %pIS\n",
+ &server->dstaddr);
+ spin_unlock(&ses->iface_lock);
+ return 0;
+ }
+
/* now drop the ref to the current iface */
if (old_iface && iface) {
cifs_dbg(FYI, "replacing iface: %pIS with %pIS\n",
&old_iface->sockaddr,
&iface->sockaddr);
+
+ old_iface->num_channels--;
+ if (--old_iface->weight_fulfilled < 0)
+ old_iface->weight_fulfilled = 0;
+ iface->num_channels++;
+ iface->weight_fulfilled++;
+
kref_put(&old_iface->refcount, release_iface);
} else if (old_iface) {
cifs_dbg(FYI, "releasing ref to iface: %pIS\n",
&old_iface->sockaddr);
+
+ old_iface->num_channels--;
+ if (--old_iface->weight_fulfilled < 0)
+ old_iface->weight_fulfilled = 0;
+
kref_put(&old_iface->refcount, release_iface);
+ } else if (!chan_index) {
+ /* special case: update interface for primary channel */
+ cifs_dbg(FYI, "referencing primary channel iface: %pIS\n",
+ &iface->sockaddr);
+ iface->num_channels++;
+ iface->weight_fulfilled++;
} else {
WARN_ON(!iface);
cifs_dbg(FYI, "adding new iface: %pIS\n", &iface->sockaddr);
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index c342a1db33ed..a5e53cb1ac49 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -740,6 +740,7 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
unsigned int ret_data_len = 0;
struct network_interface_info_ioctl_rsp *out_buf = NULL;
struct cifs_ses *ses = tcon->ses;
+ struct TCP_Server_Info *pserver;
/* do not query too frequently */
if (ses->iface_last_update &&
@@ -764,6 +765,11 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
if (rc)
goto out;
+ /* check if iface is still active */
+ pserver = ses->chans[0].server;
+ if (pserver && !cifs_chan_is_iface_active(ses, pserver))
+ cifs_chan_update_iface(ses, pserver);
+
out:
kfree(out_buf);
return rc;
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 10/11] cifs: handle when server stops supporting multichannel
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (7 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-13 6:09 ` kernel test robot
2023-03-10 15:32 ` [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces Shyam Prasad N
2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
10 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
When a server stops supporting multichannel, we will
keep attempting reconnects to the secondary channels today.
Avoid this by freeing extra channels when negotiate
returns no multichannel support.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/cifsproto.h | 2 ++
fs/cifs/connect.c | 6 ++++++
fs/cifs/sess.c | 35 +++++++++++++++++++++++++++++++++++
fs/cifs/smb2ops.c | 8 ++++++++
4 files changed, 51 insertions(+)
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 30fd81268eb7..343e582672b9 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -638,6 +638,8 @@ cifs_chan_needs_reconnect(struct cifs_ses *ses,
bool
cifs_chan_is_iface_active(struct cifs_ses *ses,
struct TCP_Server_Info *server);
+void
+cifs_disable_extra_channels(struct cifs_ses *ses);
int
cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server);
int
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index b9af60417194..6375b08b9bcb 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -130,6 +130,12 @@ static void smb2_query_server_interfaces(struct work_struct *work)
if (rc) {
cifs_dbg(FYI, "%s: failed to query server interfaces: %d\n",
__func__, rc);
+
+ if (rc == -EOPNOTSUPP) {
+ /* cancel polling of interfaces and do not resched */
+ cancel_delayed_work_sync(&tcon->query_interfaces);
+ return;
+ }
}
queue_delayed_work(cifsiod_wq, &tcon->query_interfaces,
diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
index 9b51b2309e9c..34ae292bdff2 100644
--- a/fs/cifs/sess.c
+++ b/fs/cifs/sess.c
@@ -274,6 +274,41 @@ int cifs_try_adding_channels(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses)
return new_chan_count - old_chan_count;
}
+/*
+ * called when multichannel is disabled by the server
+ */
+void
+cifs_disable_extra_channels(struct cifs_ses *ses)
+{
+ int i, chan_count;
+ struct cifs_server_iface *iface = NULL;
+ struct TCP_Server_Info *server = NULL;
+
+ spin_lock(&ses->chan_lock);
+ chan_count = ses->chan_count;
+ ses->chan_count = 1;
+ for (i = 1; i < chan_count; i++) {
+ iface = ses->chans[i].iface;
+ server = ses->chans[i].server;
+ spin_unlock(&ses->chan_lock);
+
+ if (iface) {
+ spin_lock(&ses->iface_lock);
+ kref_put(&iface->refcount, release_iface);
+ iface->num_channels--;
+ if (--iface->weight_fulfilled < 0)
+ iface->weight_fulfilled = 0;
+ spin_unlock(&ses->iface_lock);
+ }
+ cifs_put_tcp_session(server, 0);
+
+ spin_lock(&ses->chan_lock);
+ ses->chans[i].iface = NULL;
+ ses->chans[i].server = NULL;
+ }
+ spin_unlock(&ses->chan_lock);
+}
+
/*
* update the iface for the channel if necessary.
* will return 0 when iface is updated, 1 if removed, 2 otherwise
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index a5e53cb1ac49..c7a8a6049291 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -411,6 +411,14 @@ smb2_negotiate(const unsigned int xid,
/* BB we probably don't need to retry with modern servers */
if (rc == -EAGAIN)
rc = -EHOSTDOWN;
+
+ if (!rc &&
+ ses->chan_count > 1 &&
+ !(server->capabilities & SMB2_GLOBAL_CAP_MULTI_CHANNEL)) {
+ cifs_dbg(VFS, "server %s does not support multichannel anymore\n", ses->server->hostname);
+ cifs_disable_extra_channels(ses);
+ }
+
return rc;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (8 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
@ 2023-03-10 15:32 ` Shyam Prasad N
2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
10 siblings, 0 replies; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-10 15:32 UTC (permalink / raw)
To: smfrench, bharathsm.hsk, pc, tom, linux-cifs; +Cc: Shyam Prasad N
When querying server interfaces returns -EOPNOTSUPP,
clear the list of interfaces. Assumption is that multichannel
would be disabled too.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/smb2ops.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index c7a8a6049291..9ca55038b3db 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -763,7 +763,7 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon, bool in_
if (rc == -EOPNOTSUPP) {
cifs_dbg(FYI,
"server does not support query network interfaces\n");
- goto out;
+ ret_data_len = 0;
} else if (rc != 0) {
cifs_tcon_dbg(VFS, "error %d on ioctl to get interface list\n", rc);
goto out;
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 04/11] cifs: serialize channel reconnects
2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
@ 2023-03-10 22:40 ` Steve French
0 siblings, 0 replies; 31+ messages in thread
From: Steve French @ 2023-03-10 22:40 UTC (permalink / raw)
To: Shyam Prasad N; +Cc: bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N
Paulo had a few comments on patch 3 of the series, so have applied the
first two, and will wait on patch 3 and 4, but tentatively merged the
first two:
a7c73e3d5842 (HEAD -> for-next, origin/for-next) cifs: generate
signkey for the channel that's reconnecting
d169aadd50c7 cifs: fix tcon status change after tree connect
On Fri, Mar 10, 2023 at 9:33 AM Shyam Prasad N <nspmangalore@gmail.com> wrote:
>
> Parallel session reconnects are prone to race conditions
> that are difficult to avoid cleanly. The changes so far do
> ensure that parallel reconnects eventually go through.
> But that can take multiple session setups on the same channel.
>
> Avoiding that by serializing the session setups on parallel
> channels. In doing so, we should avoid such issues.
>
> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> ---
> fs/cifs/connect.c | 18 ++++++++++++++----
> 1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index 7b103f69432e..4ea1e51c3fa5 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -222,6 +222,11 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server,
> else
> cifs_chan_set_need_reconnect(ses, server);
>
> + cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
> + __func__,
> + ses->chans_need_reconnect,
> + ses->chans_in_reconnect);
> +
> /* If all channels need reconnect, then tcon needs reconnect */
> if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) {
> spin_unlock(&ses->chan_lock);
> @@ -3744,6 +3749,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
> retries = 5;
>
> spin_lock(&ses->ses_lock);
> + cifs_dbg(FYI, "%s: channel connect bitmaps: 0x%lx 0x%lx\n",
> + __func__,
> + ses->chans_need_reconnect,
> + ses->chans_in_reconnect);
> +
> if (ses->ses_status != SES_GOOD &&
> ses->ses_status != SES_NEW &&
> ses->ses_status != SES_NEED_RECON) {
> @@ -3762,11 +3772,11 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
> }
>
> /* another process is in the processs of sess setup */
> - while (cifs_chan_in_reconnect(ses, server)) {
> + while (ses->chans_in_reconnect) {
> spin_unlock(&ses->chan_lock);
> spin_unlock(&ses->ses_lock);
> rc = wait_event_interruptible_timeout(ses->reconnect_q,
> - (!cifs_chan_in_reconnect(ses, server)),
> + (!ses->chans_in_reconnect),
> HZ);
> if (rc < 0) {
> cifs_dbg(FYI, "%s: aborting sess setup due to a received signal by the process\n",
> @@ -3776,8 +3786,8 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses,
> spin_lock(&ses->ses_lock);
> spin_lock(&ses->chan_lock);
>
> - /* are we still trying to reconnect? */
> - if (!cifs_chan_in_reconnect(ses, server)) {
> + /* did the bitmask change? */
> + if (!ses->chans_in_reconnect) {
> spin_unlock(&ses->chan_lock);
> spin_unlock(&ses->ses_lock);
> goto check_again;
> --
> 2.34.1
>
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp
2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
@ 2023-03-11 4:51 ` kernel test robot
0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-03-11 4:51 UTC (permalink / raw)
To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
Cc: llvm, oe-kbuild-all, Shyam Prasad N
Hi Shyam,
I love your patch! Yet something to improve:
[auto build test ERROR on cifs/for-next]
[also build test ERROR on linus/master v6.3-rc1 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
base: git://git.samba.org/sfrench/cifs-2.6.git for-next
patch link: https://lore.kernel.org/r/20230310153211.10982-6-sprasad%40microsoft.com
patch subject: [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp
config: x86_64-randconfig-a001 (https://download.01.org/0day-ci/archive/20230311/202303111213.VsaCo9Ff-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/799e15f02b7da48acdde0b568eef1deb23aa32ed
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
git checkout 799e15f02b7da48acdde0b568eef1deb23aa32ed
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash fs/cifs/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303111213.VsaCo9Ff-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
>> fs/cifs/connect.c:1311:4: error: expected expression
struct sockaddr_in *saddr4 = (struct sockaddr_in *)srcaddr;
^
>> fs/cifs/connect.c:1313:19: error: use of undeclared identifier 'saddr4'; did you mean 'vaddr4'?
return memcmp(&saddr4->sin_addr.s_addr,
^~~~~~
vaddr4
fs/cifs/connect.c:1312:24: note: 'vaddr4' declared here
struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
^
>> fs/cifs/connect.c:1312:24: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
^
fs/cifs/connect.c:1328:4: error: expected expression
struct sockaddr_in6 *saddr6 = (struct sockaddr_in6 *)srcaddr;
^
>> fs/cifs/connect.c:1330:19: error: use of undeclared identifier 'saddr6'; did you mean 'vaddr6'?
return memcmp(&saddr6->sin6_addr,
^~~~~~
vaddr6
fs/cifs/connect.c:1329:25: note: 'vaddr6' declared here
struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
^
fs/cifs/connect.c:1329:25: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
^
2 warnings and 4 errors generated.
vim +1311 fs/cifs/connect.c
1291
1292 int
1293 cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs)
1294 {
1295 switch (srcaddr->sa_family) {
1296 case AF_UNSPEC:
1297 switch (rhs->sa_family) {
1298 case AF_UNSPEC:
1299 return 0;
1300 case AF_INET:
1301 case AF_INET6:
1302 return 1;
1303 default:
1304 return -1;
1305 }
1306 case AF_INET: {
1307 switch (rhs->sa_family) {
1308 case AF_UNSPEC:
1309 return -1;
1310 case AF_INET:
> 1311 struct sockaddr_in *saddr4 = (struct sockaddr_in *)srcaddr;
> 1312 struct sockaddr_in *vaddr4 = (struct sockaddr_in *)rhs;
> 1313 return memcmp(&saddr4->sin_addr.s_addr,
1314 &vaddr4->sin_addr.s_addr,
1315 sizeof(struct sockaddr_in));
1316 case AF_INET6:
1317 return 1;
1318 default:
1319 return -1;
1320 }
1321 }
1322 case AF_INET6: {
1323 switch (rhs->sa_family) {
1324 case AF_UNSPEC:
1325 case AF_INET:
1326 return -1;
1327 case AF_INET6:
1328 struct sockaddr_in6 *saddr6 = (struct sockaddr_in6 *)srcaddr;
1329 struct sockaddr_in6 *vaddr6 = (struct sockaddr_in6 *)rhs;
> 1330 return memcmp(&saddr6->sin6_addr,
1331 &vaddr6->sin6_addr,
1332 sizeof(struct sockaddr_in6));
1333 default:
1334 return -1;
1335 }
1336 }
1337 default:
1338 return -1; /* don't expect to be here */
1339 }
1340 }
1341
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 09/11] cifs: account for primary channel in the interface list
2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
@ 2023-03-13 5:27 ` kernel test robot
0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-03-13 5:27 UTC (permalink / raw)
To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
Cc: oe-kbuild-all, Shyam Prasad N
Hi Shyam,
I love your patch! Perhaps something to improve:
[auto build test WARNING on cifs/for-next]
[also build test WARNING on linus/master v6.3-rc2 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
base: git://git.samba.org/sfrench/cifs-2.6.git for-next
patch link: https://lore.kernel.org/r/20230310153211.10982-9-sprasad%40microsoft.com
patch subject: [PATCH 09/11] cifs: account for primary channel in the interface list
config: i386-randconfig-m021 (https://download.01.org/0day-ci/archive/20230313/202303131349.VOlTuw2U-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303131349.VOlTuw2U-lkp@intel.com/
New smatch warnings:
fs/cifs/sess.c:366 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.
Old smatch warnings:
fs/cifs/sess.c:377 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.
vim +366 fs/cifs/sess.c
276
277 /*
278 * update the iface for the channel if necessary.
279 * will return 0 when iface is updated, 1 if removed, 2 otherwise
280 * Must be called with chan_lock held.
281 */
282 int
283 cifs_chan_update_iface(struct cifs_ses *ses, struct TCP_Server_Info *server)
284 {
285 unsigned int chan_index;
286 size_t iface_weight = 0, iface_min_speed = 0;
287 struct cifs_server_iface *iface = NULL;
288 struct cifs_server_iface *old_iface = NULL;
289 struct cifs_server_iface *last_iface = NULL;
290 int rc = 0;
291
292 spin_lock(&ses->chan_lock);
293 chan_index = cifs_ses_get_chan_index(ses, server);
294 if (ses->chans[chan_index].iface) {
295 old_iface = ses->chans[chan_index].iface;
296 if (old_iface->is_active) {
297 spin_unlock(&ses->chan_lock);
298 return 1;
299 }
300 }
301 spin_unlock(&ses->chan_lock);
302
303 spin_lock(&ses->iface_lock);
304 if (!ses->iface_count) {
305 spin_unlock(&ses->iface_lock);
306 cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
307 return 0;
308 }
309
310 last_iface = list_last_entry(&ses->iface_list, struct cifs_server_iface,
311 iface_head);
312 iface_min_speed = last_iface->speed;
313
314 /* then look for a new one */
315 list_for_each_entry(iface, &ses->iface_list, iface_head) {
316 if (!chan_index) {
317 /* if we're trying to get the updated iface for primary channel */
318 if (!cifs_match_ipaddr((struct sockaddr *) &server->dstaddr,
319 (struct sockaddr *) &iface->sockaddr))
320 continue;
321
322 kref_get(&iface->refcount);
323 break;
324 }
325
326 /* do not mix rdma and non-rdma interfaces */
327 if (iface->rdma_capable != server->rdma)
328 continue;
329
330 if (!iface->is_active ||
331 (is_ses_using_iface(ses, iface) &&
332 !iface->rss_capable)) {
333 continue;
334 }
335
336 /* check if we already allocated enough channels */
337 iface_weight = iface->speed / iface_min_speed;
338
339 if (iface->weight_fulfilled >= iface_weight)
340 continue;
341
342 kref_get(&iface->refcount);
343 break;
344 }
345
346 if (list_entry_is_head(iface, &ses->iface_list, iface_head)) {
347 rc = 1;
348 iface = NULL;
349 cifs_dbg(FYI, "unable to find a suitable iface\n");
350 }
351
352 if (!chan_index && !iface) {
353 cifs_dbg(VFS, "unable to get the interface matching: %pIS\n",
354 &server->dstaddr);
355 spin_unlock(&ses->iface_lock);
356 return 0;
357 }
358
359 /* now drop the ref to the current iface */
360 if (old_iface && iface) {
361 cifs_dbg(FYI, "replacing iface: %pIS with %pIS\n",
362 &old_iface->sockaddr,
363 &iface->sockaddr);
364
365 old_iface->num_channels--;
> 366 if (--old_iface->weight_fulfilled < 0)
367 old_iface->weight_fulfilled = 0;
368 iface->num_channels++;
369 iface->weight_fulfilled++;
370
371 kref_put(&old_iface->refcount, release_iface);
372 } else if (old_iface) {
373 cifs_dbg(FYI, "releasing ref to iface: %pIS\n",
374 &old_iface->sockaddr);
375
376 old_iface->num_channels--;
377 if (--old_iface->weight_fulfilled < 0)
378 old_iface->weight_fulfilled = 0;
379
380 kref_put(&old_iface->refcount, release_iface);
381 } else if (!chan_index) {
382 /* special case: update interface for primary channel */
383 cifs_dbg(FYI, "referencing primary channel iface: %pIS\n",
384 &iface->sockaddr);
385 iface->num_channels++;
386 iface->weight_fulfilled++;
387 } else {
388 WARN_ON(!iface);
389 cifs_dbg(FYI, "adding new iface: %pIS\n", &iface->sockaddr);
390 }
391 spin_unlock(&ses->iface_lock);
392
393 spin_lock(&ses->chan_lock);
394 chan_index = cifs_ses_get_chan_index(ses, server);
395 ses->chans[chan_index].iface = iface;
396
397 /* No iface is found. if secondary chan, drop connection */
398 if (!iface && CIFS_SERVER_IS_CHAN(server))
399 ses->chans[chan_index].server = NULL;
400
401 spin_unlock(&ses->chan_lock);
402
403 if (!iface && CIFS_SERVER_IS_CHAN(server))
404 cifs_put_tcp_session(server, false);
405
406 return rc;
407 }
408
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 10/11] cifs: handle when server stops supporting multichannel
2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
@ 2023-03-13 6:09 ` kernel test robot
0 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2023-03-13 6:09 UTC (permalink / raw)
To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
Cc: oe-kbuild-all, Shyam Prasad N
Hi Shyam,
I love your patch! Perhaps something to improve:
[auto build test WARNING on cifs/for-next]
[also build test WARNING on linus/master v6.3-rc2 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Shyam-Prasad-N/cifs-generate-signkey-for-the-channel-that-s-reconnecting/20230310-234711
base: git://git.samba.org/sfrench/cifs-2.6.git for-next
patch link: https://lore.kernel.org/r/20230310153211.10982-10-sprasad%40microsoft.com
patch subject: [PATCH 10/11] cifs: handle when server stops supporting multichannel
config: i386-randconfig-m021 (https://download.01.org/0day-ci/archive/20230313/202303131424.W2jbbLmc-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303131424.W2jbbLmc-lkp@intel.com/
New smatch warnings:
fs/cifs/sess.c:299 cifs_disable_extra_channels() warn: unsigned '--iface->weight_fulfilled' is never less than zero.
Old smatch warnings:
fs/cifs/sess.c:401 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.
fs/cifs/sess.c:412 cifs_chan_update_iface() warn: unsigned '--old_iface->weight_fulfilled' is never less than zero.
vim +299 fs/cifs/sess.c
276
277 /*
278 * called when multichannel is disabled by the server
279 */
280 void
281 cifs_disable_extra_channels(struct cifs_ses *ses)
282 {
283 int i, chan_count;
284 struct cifs_server_iface *iface = NULL;
285 struct TCP_Server_Info *server = NULL;
286
287 spin_lock(&ses->chan_lock);
288 chan_count = ses->chan_count;
289 ses->chan_count = 1;
290 for (i = 1; i < chan_count; i++) {
291 iface = ses->chans[i].iface;
292 server = ses->chans[i].server;
293 spin_unlock(&ses->chan_lock);
294
295 if (iface) {
296 spin_lock(&ses->iface_lock);
297 kref_put(&iface->refcount, release_iface);
298 iface->num_channels--;
> 299 if (--iface->weight_fulfilled < 0)
300 iface->weight_fulfilled = 0;
301 spin_unlock(&ses->iface_lock);
302 }
303 cifs_put_tcp_session(server, 0);
304
305 spin_lock(&ses->chan_lock);
306 ses->chans[i].iface = NULL;
307 ses->chans[i].server = NULL;
308 }
309 spin_unlock(&ses->chan_lock);
310 }
311
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
` (9 preceding siblings ...)
2023-03-10 15:32 ` [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces Shyam Prasad N
@ 2023-03-14 22:19 ` Paulo Alcantara
2023-03-16 10:57 ` Shyam Prasad N
10 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2023-03-14 22:19 UTC (permalink / raw)
To: Shyam Prasad N, smfrench, bharathsm.hsk, tom, linux-cifs; +Cc: Shyam Prasad N
Hi Shyam,
Shyam Prasad N <nspmangalore@gmail.com> writes:
> After cifs_tree_connect, tcon status should not be
> set to TID_GOOD. There could still be files that need
> reopen. The status should instead be changed to
> TID_NEED_FILES_INVALIDATE. That way, after reopen of
> files, the status can be changed to TID_GOOD.
>
> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> ---
> fs/cifs/cifsglob.h | 2 +-
> fs/cifs/connect.c | 14 ++++++++++----
> fs/cifs/dfs.c | 16 +++++++++++-----
> fs/cifs/file.c | 10 +++++-----
> 4 files changed, 27 insertions(+), 15 deletions(-)
With this patch, the status of TID_GOOD is no longer set after
reconnecting tcons. I've noticed that when the DFS cache refresher
attempted to get a new referral for updating a cached entry but the IPC
tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
skipping the I/O as it thought the IPC tcon was disconnected.
IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
tcons after reconnect.
Besides, could you please split this patch into two? The first one
would fix the checking of tcon statuses by using the appropriate spin
lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
non-IPC tcons and set the TID_GOOD status afterwards.
Thanks.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
@ 2023-03-16 10:57 ` Shyam Prasad N
2023-03-16 20:59 ` Paulo Alcantara
0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-16 10:57 UTC (permalink / raw)
To: Paulo Alcantara; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N
[-- Attachment #1: Type: text/plain, Size: 1739 bytes --]
On Wed, Mar 15, 2023 at 3:49 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Hi Shyam,
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > After cifs_tree_connect, tcon status should not be
> > set to TID_GOOD. There could still be files that need
> > reopen. The status should instead be changed to
> > TID_NEED_FILES_INVALIDATE. That way, after reopen of
> > files, the status can be changed to TID_GOOD.
> >
> > Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> > ---
> > fs/cifs/cifsglob.h | 2 +-
> > fs/cifs/connect.c | 14 ++++++++++----
> > fs/cifs/dfs.c | 16 +++++++++++-----
> > fs/cifs/file.c | 10 +++++-----
> > 4 files changed, 27 insertions(+), 15 deletions(-)
>
> With this patch, the status of TID_GOOD is no longer set after
> reconnecting tcons. I've noticed that when the DFS cache refresher
> attempted to get a new referral for updating a cached entry but the IPC
> tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
> skipping the I/O as it thought the IPC tcon was disconnected.
>
> IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
> tcons after reconnect.
>
> Besides, could you please split this patch into two? The first one
> would fix the checking of tcon statuses by using the appropriate spin
> lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
> non-IPC tcons and set the TID_GOOD status afterwards.
>
> Thanks.
Hi Paulo,
Good point. The tcon status were indeed messed up after this patch.
Should have checked it earlier. My bad.
Let me revert this one and include only the checking of tcon status
with the right lock.
Attached the patch for that.
--
Regards,
Shyam
[-- Attachment #2: 0001-cifs-check-only-tcon-status-on-tcon-related-function.patch --]
[-- Type: application/octet-stream, Size: 2655 bytes --]
From d41e84175d7956948320bab648bcea625c35fd3d Mon Sep 17 00:00:00 2001
From: Shyam Prasad N <sprasad@microsoft.com>
Date: Thu, 16 Mar 2023 10:45:12 +0000
Subject: [PATCH 01/13] cifs: check only tcon status on tcon related functions
We had a couple of checks for session in cifs_tree_connect
and cifs_mark_open_files_invalid, which were unnecessary.
And that was done with ses_lock. Changed that to tc_lock too.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/connect.c | 10 +++++++---
fs/cifs/dfs.c | 10 +++++++---
fs/cifs/file.c | 8 ++++----
3 files changed, 18 insertions(+), 10 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5233f14f0636..e249f6fecd26 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4038,9 +4038,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
/* only send once per connect */
spin_lock(&tcon->tc_lock);
- if (tcon->ses->ses_status != SES_GOOD ||
- (tcon->status != TID_NEW &&
- tcon->status != TID_NEED_TCON)) {
+ if (tcon->status != TID_NEW &&
+ tcon->status != TID_NEED_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (tcon->status == TID_GOOD) {
spin_unlock(&tcon->tc_lock);
return 0;
}
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index b64d20374b9c..83bc7e16f3a3 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -479,9 +479,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
/* only send once per connect */
spin_lock(&tcon->tc_lock);
- if (tcon->ses->ses_status != SES_GOOD ||
- (tcon->status != TID_NEW &&
- tcon->status != TID_NEED_TCON)) {
+ if (tcon->status != TID_NEW &&
+ tcon->status != TID_NEED_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (tcon->status == TID_GOOD) {
spin_unlock(&tcon->tc_lock);
return 0;
}
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4d4a2d82636d..6831a9949c43 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,13 +174,13 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
struct list_head *tmp1;
/* only send once per connect */
- spin_lock(&tcon->ses->ses_lock);
- if ((tcon->ses->ses_status != SES_GOOD) || (tcon->status != TID_NEED_RECON)) {
- spin_unlock(&tcon->ses->ses_lock);
+ spin_lock(&tcon->tc_lock);
+ if (tcon->status != TID_NEED_RECON) {
+ spin_unlock(&tcon->tc_lock);
return;
}
tcon->status = TID_IN_FILES_INVALIDATE;
- spin_unlock(&tcon->ses->ses_lock);
+ spin_unlock(&tcon->tc_lock);
/* list all files open on tree connection and mark them invalid */
spin_lock(&tcon->open_file_lock);
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
2023-03-16 10:57 ` Shyam Prasad N
@ 2023-03-16 20:59 ` Paulo Alcantara
2023-03-17 10:48 ` Shyam Prasad N
0 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2023-03-16 20:59 UTC (permalink / raw)
To: Shyam Prasad N; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N
Shyam Prasad N <nspmangalore@gmail.com> writes:
> On Wed, Mar 15, 2023 at 3:49 AM Paulo Alcantara <pc@manguebit.com> wrote:
>> Shyam Prasad N <nspmangalore@gmail.com> writes:
>>
>> > After cifs_tree_connect, tcon status should not be
>> > set to TID_GOOD. There could still be files that need
>> > reopen. The status should instead be changed to
>> > TID_NEED_FILES_INVALIDATE. That way, after reopen of
>> > files, the status can be changed to TID_GOOD.
>> >
>> > Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
>> > ---
>> > fs/cifs/cifsglob.h | 2 +-
>> > fs/cifs/connect.c | 14 ++++++++++----
>> > fs/cifs/dfs.c | 16 +++++++++++-----
>> > fs/cifs/file.c | 10 +++++-----
>> > 4 files changed, 27 insertions(+), 15 deletions(-)
>>
>> With this patch, the status of TID_GOOD is no longer set after
>> reconnecting tcons. I've noticed that when the DFS cache refresher
>> attempted to get a new referral for updating a cached entry but the IPC
>> tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
>> skipping the I/O as it thought the IPC tcon was disconnected.
>>
>> IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
>> tcons after reconnect.
>>
>> Besides, could you please split this patch into two? The first one
>> would fix the checking of tcon statuses by using the appropriate spin
>> lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
>> non-IPC tcons and set the TID_GOOD status afterwards.
>
> Good point. The tcon status were indeed messed up after this patch.
> Should have checked it earlier. My bad.
>
> Let me revert this one and include only the checking of tcon status
> with the right lock.
>
> Attached the patch for that.
Thanks for the patch! Looks good to me.
BTW, don't you think we need to get rid of unnecessary ses status check
in __refresh_tcon() as well
...
spin_lock(&ipc->tc_lock);
if (ses->ses_status != SES_GOOD || ipc->status != TID_GOOD) {
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
2023-03-16 20:59 ` Paulo Alcantara
@ 2023-03-17 10:48 ` Shyam Prasad N
2023-03-17 12:35 ` Paulo Alcantara
0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2023-03-17 10:48 UTC (permalink / raw)
To: Paulo Alcantara; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N
[-- Attachment #1: Type: text/plain, Size: 2273 bytes --]
On Fri, Mar 17, 2023 at 2:29 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > On Wed, Mar 15, 2023 at 3:49 AM Paulo Alcantara <pc@manguebit.com> wrote:
> >> Shyam Prasad N <nspmangalore@gmail.com> writes:
> >>
> >> > After cifs_tree_connect, tcon status should not be
> >> > set to TID_GOOD. There could still be files that need
> >> > reopen. The status should instead be changed to
> >> > TID_NEED_FILES_INVALIDATE. That way, after reopen of
> >> > files, the status can be changed to TID_GOOD.
> >> >
> >> > Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
> >> > ---
> >> > fs/cifs/cifsglob.h | 2 +-
> >> > fs/cifs/connect.c | 14 ++++++++++----
> >> > fs/cifs/dfs.c | 16 +++++++++++-----
> >> > fs/cifs/file.c | 10 +++++-----
> >> > 4 files changed, 27 insertions(+), 15 deletions(-)
> >>
> >> With this patch, the status of TID_GOOD is no longer set after
> >> reconnecting tcons. I've noticed that when the DFS cache refresher
> >> attempted to get a new referral for updating a cached entry but the IPC
> >> tcon status was still set to TID_NEED_FILES_INVALIDATE, therefore
> >> skipping the I/O as it thought the IPC tcon was disconnected.
> >>
> >> IOW, the TID_NEED_FILES_INVALIDATE status remains set for both types of
> >> tcons after reconnect.
> >>
> >> Besides, could you please split this patch into two? The first one
> >> would fix the checking of tcon statuses by using the appropriate spin
> >> lock, and the second would make use of TID_NEED_FILES_INVALIDATE on
> >> non-IPC tcons and set the TID_GOOD status afterwards.
> >
> > Good point. The tcon status were indeed messed up after this patch.
> > Should have checked it earlier. My bad.
> >
> > Let me revert this one and include only the checking of tcon status
> > with the right lock.
> >
> > Attached the patch for that.
>
> Thanks for the patch! Looks good to me.
>
> BTW, don't you think we need to get rid of unnecessary ses status check
> in __refresh_tcon() as well
>
> ...
> spin_lock(&ipc->tc_lock);
> if (ses->ses_status != SES_GOOD || ipc->status != TID_GOOD) {
Good point.
Here's the updated patch.
--
Regards,
Shyam
[-- Attachment #2: 0001-cifs-check-only-tcon-status-on-tcon-related-function.patch --]
[-- Type: application/octet-stream, Size: 3195 bytes --]
From 787f4e2afd54c7b5865939b61f661029a4f5f6b9 Mon Sep 17 00:00:00 2001
From: Shyam Prasad N <sprasad@microsoft.com>
Date: Thu, 16 Mar 2023 10:45:12 +0000
Subject: [PATCH 01/13] cifs: check only tcon status on tcon related functions
We had a couple of checks for session in cifs_tree_connect
and cifs_mark_open_files_invalid, which were unnecessary.
And that was done with ses_lock. Changed that to tc_lock too.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/cifs/connect.c | 10 +++++++---
fs/cifs/dfs.c | 10 +++++++---
fs/cifs/dfs_cache.c | 2 +-
fs/cifs/file.c | 8 ++++----
4 files changed, 19 insertions(+), 11 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 5233f14f0636..e249f6fecd26 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4038,9 +4038,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
/* only send once per connect */
spin_lock(&tcon->tc_lock);
- if (tcon->ses->ses_status != SES_GOOD ||
- (tcon->status != TID_NEW &&
- tcon->status != TID_NEED_TCON)) {
+ if (tcon->status != TID_NEW &&
+ tcon->status != TID_NEED_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (tcon->status == TID_GOOD) {
spin_unlock(&tcon->tc_lock);
return 0;
}
diff --git a/fs/cifs/dfs.c b/fs/cifs/dfs.c
index b64d20374b9c..83bc7e16f3a3 100644
--- a/fs/cifs/dfs.c
+++ b/fs/cifs/dfs.c
@@ -479,9 +479,13 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
/* only send once per connect */
spin_lock(&tcon->tc_lock);
- if (tcon->ses->ses_status != SES_GOOD ||
- (tcon->status != TID_NEW &&
- tcon->status != TID_NEED_TCON)) {
+ if (tcon->status != TID_NEW &&
+ tcon->status != TID_NEED_TCON) {
+ spin_unlock(&tcon->tc_lock);
+ return -EHOSTDOWN;
+ }
+
+ if (tcon->status == TID_GOOD) {
spin_unlock(&tcon->tc_lock);
return 0;
}
diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index ac86bd0ebd63..a03d0209720e 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -1326,7 +1326,7 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
}
spin_lock(&ipc->tc_lock);
- if (ses->ses_status != SES_GOOD || ipc->status != TID_GOOD) {
+ if (ipc->status != TID_GOOD) {
spin_unlock(&ipc->tc_lock);
cifs_dbg(FYI, "%s: skip cache refresh due to disconnected ipc\n", __func__);
goto out;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 4d4a2d82636d..6831a9949c43 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,13 +174,13 @@ cifs_mark_open_files_invalid(struct cifs_tcon *tcon)
struct list_head *tmp1;
/* only send once per connect */
- spin_lock(&tcon->ses->ses_lock);
- if ((tcon->ses->ses_status != SES_GOOD) || (tcon->status != TID_NEED_RECON)) {
- spin_unlock(&tcon->ses->ses_lock);
+ spin_lock(&tcon->tc_lock);
+ if (tcon->status != TID_NEED_RECON) {
+ spin_unlock(&tcon->tc_lock);
return;
}
tcon->status = TID_IN_FILES_INVALIDATE;
- spin_unlock(&tcon->ses->ses_lock);
+ spin_unlock(&tcon->tc_lock);
/* list all files open on tree connection and mark them invalid */
spin_lock(&tcon->open_file_lock);
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
2023-03-17 10:48 ` Shyam Prasad N
@ 2023-03-17 12:35 ` Paulo Alcantara
2023-03-17 18:25 ` Steve French
0 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2023-03-17 12:35 UTC (permalink / raw)
To: Shyam Prasad N; +Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N
Shyam Prasad N <nspmangalore@gmail.com> writes:
> Here's the updated patch.
Thanks!
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 01/11] cifs: fix tcon status change after tree connect
2023-03-17 12:35 ` Paulo Alcantara
@ 2023-03-17 18:25 ` Steve French
0 siblings, 0 replies; 31+ messages in thread
From: Steve French @ 2023-03-17 18:25 UTC (permalink / raw)
To: Paulo Alcantara
Cc: Shyam Prasad N, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N
updated for-next with this newer version of the patch
On Fri, Mar 17, 2023 at 7:35 AM Paulo Alcantara <pc@manguebit.com> wrote:
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > Here's the updated patch.
>
> Thanks!
>
> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
@ 2024-02-27 11:16 ` Jan Čermák
2024-02-27 16:17 ` Shyam Prasad N
0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-02-27 11:16 UTC (permalink / raw)
To: Shyam Prasad N, smfrench, bharathsm.hsk, pc, tom, linux-cifs
Cc: Shyam Prasad N, Stefan Agner
Hi Shyam, everyone,
On 10. 03. 23 16:32, Shyam Prasad N wrote:
> + if (!ses->iface_count) {
> + spin_unlock(&ses->iface_lock);
> + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
> + return 0;
> + }
May I ask why this is now being logged, and what can be tweaked in the
case that a server does not advertise interfaces? After updating the
kernel from 6.1 to 6.6 in Home Assistant OS, we got a report [1] of
these messages appearing, yet so far only from a single attentive user.
I wonder if we are going to see them more often, and it that case a
suggestion to users would come handy. If there's not a simple
resolution, could the verbosity be lowered to FYI, maybe?
Thanks,
Jan
[1] https://github.com/home-assistant/operating-system/issues/3201
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-02-27 11:16 ` Jan Čermák
@ 2024-02-27 16:17 ` Shyam Prasad N
2024-02-28 9:22 ` Jan Čermák
0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-02-27 16:17 UTC (permalink / raw)
To: Jan Čermák
Cc: smfrench, bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N,
Stefan Agner
On Tue, Feb 27, 2024 at 4:46 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> Hi Shyam, everyone,
>
> On 10. 03. 23 16:32, Shyam Prasad N wrote:
>
> > + if (!ses->iface_count) {
> > + spin_unlock(&ses->iface_lock);
> > + cifs_dbg(VFS, "server %s does not advertise interfaces\n", ses->server->hostname);
> > + return 0;
> > + }
>
> May I ask why this is now being logged, and what can be tweaked in the
> case that a server does not advertise interfaces? After updating the
> kernel from 6.1 to 6.6 in Home Assistant OS, we got a report [1] of
> these messages appearing, yet so far only from a single attentive user.
> I wonder if we are going to see them more often, and it that case a
> suggestion to users would come handy. If there's not a simple
> resolution, could the verbosity be lowered to FYI, maybe?
>
> Thanks,
> Jan
>
>
> [1] https://github.com/home-assistant/operating-system/issues/3201
Hi Jan,
These messages (in theory) should not show up if either multichannel
or max_channels are not specified mount options.
And if multichannel is enabled, then the server should support the
ioctl to return the server interface list.
These messages are meant to help the user/developer understand why
multiple channels are not getting established to the server, even
after specifying multichannel as a mount option.
The repeating nature of these messages here leads me to also believe
that there's something fishy going on here.
Either the network health is not good, or that there's some bug at play here.
--
Regards,
Shyam
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-02-27 16:17 ` Shyam Prasad N
@ 2024-02-28 9:22 ` Jan Čermák
2024-03-05 14:56 ` Shyam Prasad N
0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-02-28 9:22 UTC (permalink / raw)
To: Shyam Prasad N
Cc: smfrench, bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N,
Stefan Agner
Hi Shyam,
On 27. 02. 24 17:17, Shyam Prasad N wrote:
> These messages (in theory) should not show up if either multichannel
> or max_channels are not specified mount options.
That shouldn't be the case here, I checked with the user and he's not
doing anything fishy himself (like interfering with the standard mount
utilities), and the userspace tools creating the mounts should not be
setting any of these options, which I confirmed by asking for his mounts
list:
//192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs
(rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
//192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs
(rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
Or am I missing anything here?
> The repeating nature of these messages here leads me to also believe
> that there's something fishy going on here.
> Either the network health is not good, or that there's some bug at play here.
Maybe, however I'm not able to reproduce the above behavior yet. But
there's been so far one more report of this happening, so it's not a
single isolated case. I appreciate any advice what to look at further.
Cheers,
Jan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-02-28 9:22 ` Jan Čermák
@ 2024-03-05 14:56 ` Shyam Prasad N
2024-03-06 15:43 ` Paulo Alcantara
0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-03-05 14:56 UTC (permalink / raw)
To: Jan Čermák
Cc: smfrench, bharathsm.hsk, pc, tom, linux-cifs, Shyam Prasad N,
Stefan Agner
On Wed, Feb 28, 2024 at 2:52 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> Hi Shyam,
>
Hi Jan,
Apologies for the delay.
> On 27. 02. 24 17:17, Shyam Prasad N wrote:
> > These messages (in theory) should not show up if either multichannel
> > or max_channels are not specified mount options.
>
> That shouldn't be the case here, I checked with the user and he's not
> doing anything fishy himself (like interfering with the standard mount
> utilities), and the userspace tools creating the mounts should not be
> setting any of these options, which I confirmed by asking for his mounts
> list:
>
> //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs
> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
> //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs
> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
Hmmm.. That seems like a bug.
Is there any chance that the user is willing to try out if the same
bug reproduces with the latest mainline kernel?
The other option is for us to try with the 6.6 kernel. But without the
steps to repro, it'll just be shots in the dark.
Let me try to go through the code and see if I can spot anything here.
>
>
> Or am I missing anything here?
>
> > The repeating nature of these messages here leads me to also believe
> > that there's something fishy going on here.
> > Either the network health is not good, or that there's some bug at play here.
>
> Maybe, however I'm not able to reproduce the above behavior yet. But
> there's been so far one more report of this happening, so it's not a
> single isolated case. I appreciate any advice what to look at further.
If a user has reproduced this issue, the one thing they can send us is
the ftrace output of cifs events when the issue is being seen.
i.e. something like this:
# trace-cmd start -e cifs
# <now wait for the issue to reproduce>
# trace-cmd stop
# trace-cmd extract > /tmp/outputs.txt
# uname -r >> /tmp/outputs,txt
# cat /proc/fs/cifs/DebugData >> /tmp/outputs,txt
# cat /proc/fs/cifs/Stats >> /tmp/outputs,txt
And then provide the outputs.txt file to us.
Going through that capture can help us understand this better.
>
> Cheers,
> Jan
--
Regards,
Shyam
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-03-05 14:56 ` Shyam Prasad N
@ 2024-03-06 15:43 ` Paulo Alcantara
2024-03-11 10:01 ` Jan Čermák
0 siblings, 1 reply; 31+ messages in thread
From: Paulo Alcantara @ 2024-03-06 15:43 UTC (permalink / raw)
To: Shyam Prasad N, Jan Čermák
Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N, Stefan Agner
Shyam Prasad N <nspmangalore@gmail.com> writes:
> On Wed, Feb 28, 2024 at 2:52 PM Jan Čermák <sairon@sairon.cz> wrote:
>>
>> Hi Shyam,
>>
>
> Hi Jan,
> Apologies for the delay.
>
>> On 27. 02. 24 17:17, Shyam Prasad N wrote:
>> > These messages (in theory) should not show up if either multichannel
>> > or max_channels are not specified mount options.
>>
>> That shouldn't be the case here, I checked with the user and he's not
>> doing anything fishy himself (like interfering with the standard mount
>> utilities), and the userspace tools creating the mounts should not be
>> setting any of these options, which I confirmed by asking for his mounts
>> list:
>>
>> //192.168.1.12/folder on /mnt/data/supervisor/mounts/folder type cifs
>> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
>> //192.168.1.12/folder on /mnt/data/supervisor/media/folder type cifs
>> (rw,relatime,vers=default,cache=strict,username=user,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.12,file_mode=0755,dir_mode=0755,soft,nounix,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,closetimeo=1)
>
> Hmmm.. That seems like a bug.
Yes. I see a couple of issues here:
(1) cifs_chan_update_iface() seems to be called over reconnect for all
dialect versions and servers that do not support
FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in
some cases and then VFS message will start being flooded on dmesg.
(2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1
mounts, so smb2_reconnect() ends up being called and then call
SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed
with CAP_LARGE_FILES.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-03-06 15:43 ` Paulo Alcantara
@ 2024-03-11 10:01 ` Jan Čermák
2024-03-11 11:14 ` Shyam Prasad N
0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-03-11 10:01 UTC (permalink / raw)
To: Paulo Alcantara, Shyam Prasad N
Cc: smfrench, bharathsm.hsk, tom, linux-cifs, Shyam Prasad N, Stefan Agner
Hi Paulo, Shyam,
On 06. 03. 24 16:43, Paulo Alcantara wrote:
>
> Yes. I see a couple of issues here:
>
> (1) cifs_chan_update_iface() seems to be called over reconnect for all
> dialect versions and servers that do not support
> FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in
> some cases and then VFS message will start being flooded on dmesg.
>
> (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1
> mounts, so smb2_reconnect() ends up being called and then call
> SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed
> with CAP_LARGE_FILES.
thanks for looking into this! Is there anything else you'll need from me
or to test by the users who are observing the issue, or do you have
enough information for a fix?
Regards,
Jan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-03-11 10:01 ` Jan Čermák
@ 2024-03-11 11:14 ` Shyam Prasad N
2024-03-12 14:20 ` Jan Čermák
0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-03-11 11:14 UTC (permalink / raw)
To: Jan Čermák
Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
Shyam Prasad N, Stefan Agner
On Mon, Mar 11, 2024 at 3:31 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> Hi Paulo, Shyam,
>
> On 06. 03. 24 16:43, Paulo Alcantara wrote:
> >
> > Yes. I see a couple of issues here:
> >
Thanks for the review, Paulo.
> > (1) cifs_chan_update_iface() seems to be called over reconnect for all
> > dialect versions and servers that do not support
> > FSCTL_QUERY_NETWORK_INTERFACE_INFO, so @ses->iface_count will be zero in
> > some cases and then VFS message will start being flooded on dmesg.
Valid point.
> >
> > (2) __cifs_reconnect() is queueing smb2_reconnect_server() even for SMB1
> > mounts, so smb2_reconnect() ends up being called and then call
> > SMB3_request_interfaces() because SMB2_GLOBAL_CAP_MULTI_CHANNEL is mixed
> > with CAP_LARGE_FILES.
Excellent catch. I did not take this into account.
It is very likely that this is going on here.
Let me submit a patch to check the exact dialect before calling these
functions to make sure we only do it for SMB3+.
>
> thanks for looking into this! Is there anything else you'll need from me
> or to test by the users who are observing the issue, or do you have
> enough information for a fix?
>
> Regards,
> Jan
Thanks for bringing this to our notice, Jan.
--
Regards,
Shyam
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-03-11 11:14 ` Shyam Prasad N
@ 2024-03-12 14:20 ` Jan Čermák
2024-03-13 10:45 ` Shyam Prasad N
0 siblings, 1 reply; 31+ messages in thread
From: Jan Čermák @ 2024-03-12 14:20 UTC (permalink / raw)
To: Shyam Prasad N
Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
Shyam Prasad N, Stefan Agner
On 11. 03. 24 12:14, Shyam Prasad N wrote:
> Let me submit a patch to check the exact dialect before calling these
> functions to make sure we only do it for SMB3+.
>
FWIW, most of the reports (except the first one with macOS) are users
running their SMB server on QNAP NAS's (more details are in the GH
ticket I linked earlier).
Please CC me in the e-mail with the patch as well, I can create a build
with that patch applied to 6.6.y and ask if someone could try if it
resolves the issue in their environment.
Cheers,
Jan
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-03-12 14:20 ` Jan Čermák
@ 2024-03-13 10:45 ` Shyam Prasad N
2024-03-26 14:10 ` Jan Čermák
0 siblings, 1 reply; 31+ messages in thread
From: Shyam Prasad N @ 2024-03-13 10:45 UTC (permalink / raw)
To: Jan Čermák
Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
Shyam Prasad N, Stefan Agner
On Tue, Mar 12, 2024 at 7:50 PM Jan Čermák <sairon@sairon.cz> wrote:
>
> On 11. 03. 24 12:14, Shyam Prasad N wrote:
> > Let me submit a patch to check the exact dialect before calling these
> > functions to make sure we only do it for SMB3+.
> >
>
> FWIW, most of the reports (except the first one with macOS) are users
> running their SMB server on QNAP NAS's (more details are in the GH
> ticket I linked earlier).
>
> Please CC me in the e-mail with the patch as well, I can create a build
> with that patch applied to 6.6.y and ask if someone could try if it
> resolves the issue in their environment.
>
> Cheers,
> Jan
Just sent two patches for this.
One patch is just to change the log level for this log from VFS -> FYI.
The other one is for the suggestions made by Paulo.
#1 will fix this anyway. I'm curious to know if #2 alone would fix this problem.
Also, please ask for the output of the following command while testing this out:
# cat /proc/fs/cifs/DebugData
--
Regards,
Shyam
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/11] cifs: distribute channels across interfaces based on speed
2024-03-13 10:45 ` Shyam Prasad N
@ 2024-03-26 14:10 ` Jan Čermák
0 siblings, 0 replies; 31+ messages in thread
From: Jan Čermák @ 2024-03-26 14:10 UTC (permalink / raw)
To: Shyam Prasad N
Cc: Paulo Alcantara, smfrench, bharathsm.hsk, tom, linux-cifs,
Shyam Prasad N, Stefan Agner
On 13. 03. 24 11:45, Shyam Prasad N wrote:
> Just sent two patches for this.
> One patch is just to change the log level for this log from VFS -> FYI.
> The other one is for the suggestions made by Paulo.
> #1 will fix this anyway. I'm curious to know if #2 alone would fix this problem.
> Also, please ask for the output of the following command while testing this out:
> # cat /proc/fs/cifs/DebugData
One of the users tried testing on the same system with only the second
patch applied and reports the issue still persists :( He supplied the
DebugData output only as a screenshot [1], not sure what to look there
for, but definitely the "Server interfaces" section is missing there
compared to my setup.
Cheers,
Jan
[1]
https://github.com/home-assistant/operating-system/issues/3201#issuecomment-2019357433
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2024-03-26 14:10 UTC | newest]
Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 15:32 [PATCH 01/11] cifs: fix tcon status change after tree connect Shyam Prasad N
2023-03-10 15:32 ` [PATCH 02/11] cifs: generate signkey for the channel that's reconnecting Shyam Prasad N
2023-03-10 15:32 ` [PATCH 03/11] cifs: avoid race conditions with parallel reconnects Shyam Prasad N
2023-03-10 15:32 ` [PATCH 04/11] cifs: serialize channel reconnects Shyam Prasad N
2023-03-10 22:40 ` Steve French
2023-03-10 15:32 ` [PATCH 05/11] cifs: lock chan_lock outside match_session Shyam Prasad N
2023-03-10 15:32 ` [PATCH 06/11] cifs: fix sockaddr comparison in iface_cmp Shyam Prasad N
2023-03-11 4:51 ` kernel test robot
2023-03-10 15:32 ` [PATCH 07/11] cifs: do not poll server interfaces too regularly Shyam Prasad N
2023-03-10 15:32 ` [PATCH 08/11] cifs: distribute channels across interfaces based on speed Shyam Prasad N
2024-02-27 11:16 ` Jan Čermák
2024-02-27 16:17 ` Shyam Prasad N
2024-02-28 9:22 ` Jan Čermák
2024-03-05 14:56 ` Shyam Prasad N
2024-03-06 15:43 ` Paulo Alcantara
2024-03-11 10:01 ` Jan Čermák
2024-03-11 11:14 ` Shyam Prasad N
2024-03-12 14:20 ` Jan Čermák
2024-03-13 10:45 ` Shyam Prasad N
2024-03-26 14:10 ` Jan Čermák
2023-03-10 15:32 ` [PATCH 09/11] cifs: account for primary channel in the interface list Shyam Prasad N
2023-03-13 5:27 ` kernel test robot
2023-03-10 15:32 ` [PATCH 10/11] cifs: handle when server stops supporting multichannel Shyam Prasad N
2023-03-13 6:09 ` kernel test robot
2023-03-10 15:32 ` [PATCH 11/11] cifs: empty interface list when server doesn't support query interfaces Shyam Prasad N
2023-03-14 22:19 ` [PATCH 01/11] cifs: fix tcon status change after tree connect Paulo Alcantara
2023-03-16 10:57 ` Shyam Prasad N
2023-03-16 20:59 ` Paulo Alcantara
2023-03-17 10:48 ` Shyam Prasad N
2023-03-17 12:35 ` Paulo Alcantara
2023-03-17 18:25 ` Steve French
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).