linux-cifs.vger.kernel.org archive mirror
* [PATCH 0/5] dfs fixes
@ 2023-01-17  0:09 Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Hi Steve,

The most important one is 1/5, which should fix those random hangs
we've observed while running dfs tests on buildbot.

I have run 50 dfs tests twice against Windows 2022 and samba 4.16 with
these mount options:

	vers=3.1.1,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,sign}
	vers=3.0,echo_interval=10,{,seal}
	vers=2.1,echo_interval=10,{,hard}
	vers=1.0,echo_interval=10,{,hard}

The only tests which failed (2%) were with SMB1 UNIX extensions
against samba.  readdir(2) was getting STATUS_INVALID_LEVEL from
QUERY_PATH_INFO after failover for some reason -- I'll look into that
when time allows.  Those failures aren't related to this series,
though.

I also did some quick tests with kerberos.

Paulo Alcantara (5):
  cifs: fix potential deadlock in cache_refresh_path()
  cifs: avoid re-lookups in dfs_cache_find()
  cifs: don't take exclusive lock for updating target hints
  cifs: remove duplicate code in __refresh_tcon()
  cifs: handle cache lookup errors different than -ENOENT

 fs/cifs/dfs_cache.c | 185 ++++++++++++++++++++++----------------------
 1 file changed, 94 insertions(+), 91 deletions(-)

-- 
2.39.0



* [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17 17:07   ` Aurélien Aptel
  2023-01-17  0:09 ` [PATCH 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Avoid getting a DFS referral while holding an exclusive lock in
cache_refresh_path(), because the IPC tcon used for getting the
referral could be disconnected and thus cause a deadlock, as shown
below:

task A
------
cifs_demultiplex_thread()
 cifs_handle_standard()
  reconnect_dfs_server()
   dfs_cache_noreq_find()
    down_read()

task B
------
dfs_cache_find()
 cache_refresh_path()
  down_write()
   get_dfs_referral()
    smb2_get_dfs_refer()
     SMB2_ioctl()
      cifs_send_recv()
       compound_send_recv()
        wait_for_response()

where task A cannot wake up task B because task A is itself blocked on
down_read() by the exclusive lock held in cache_refresh_path(), so
neither task can make progress.
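
For illustration only (not part of the patch), here is a minimal
userspace sketch of the same lock-ordering problem, with POSIX rwlocks
standing in for the kernel's rwsem API and made-up names throughout.
Running it simply hangs, which is the point:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t resp_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t resp_cond = PTHREAD_COND_INITIALIZER;
static bool resp_ready;

/* Plays the role of task A: it must take the lock for reading before
 * it can deliver the "response" the main thread is waiting for. */
static void *worker(void *arg)
{
        (void)arg;
        pthread_rwlock_rdlock(&cache_lock);     /* blocks: writer held in main() */
        pthread_rwlock_unlock(&cache_lock);

        pthread_mutex_lock(&resp_mutex);        /* never reached */
        resp_ready = true;
        pthread_cond_signal(&resp_cond);
        pthread_mutex_unlock(&resp_mutex);
        return NULL;
}

/* Plays the role of task B: it holds the write lock while waiting for
 * a response that only the worker can deliver. */
int main(void)
{
        pthread_t t;

        pthread_rwlock_wrlock(&cache_lock);
        pthread_create(&t, NULL, worker, NULL);

        /* ~ wait_for_response(): sleeps forever because the worker
         * cannot take the read lock. */
        pthread_mutex_lock(&resp_mutex);
        while (!resp_ready)
                pthread_cond_wait(&resp_cond, &resp_mutex);
        pthread_mutex_unlock(&resp_mutex);

        pthread_rwlock_unlock(&cache_lock);
        pthread_join(t, NULL);
        puts("no deadlock");                    /* never printed */
        return 0;
}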

Fixes: c9f711039905 ("cifs: keep referral server sessions alive")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index e20f8880363f..a8ddac1c054c 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -770,46 +770,45 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
 {
-	int rc;
-	struct cache_entry *ce;
 	struct dfs_info3_param *refs = NULL;
+	struct cache_entry *ce;
 	int numrefs = 0;
-	bool newent = false;
+	int rc;
 
 	cifs_dbg(FYI, "%s: search path: %s\n", __func__, path);
 
-	down_write(&htable_rw_lock);
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce)) {
-		if (!cache_entry_expired(ce)) {
-			dump_ce(ce);
-			up_write(&htable_rw_lock);
-			return 0;
-		}
-	} else {
-		newent = true;
+	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
+		up_read(&htable_rw_lock);
+		return 0;
 	}
 
+	up_read(&htable_rw_lock);
+
 	/*
 	 * Either the entry was not found, or it is expired.
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
 	if (rc)
-		goto out_unlock;
+		goto out;
 
 	dump_refs(refs, numrefs);
 
-	if (!newent) {
-		rc = update_cache_entry_locked(ce, refs, numrefs);
-		goto out_unlock;
+	down_write(&htable_rw_lock);
+	/* Re-check as another task might have it added or refreshed already */
+	ce = lookup_cache_entry(path);
+	if (!IS_ERR(ce)) {
+		if (cache_entry_expired(ce))
+			rc = update_cache_entry_locked(ce, refs, numrefs);
+	} else {
+		rc = add_cache_entry_locked(refs, numrefs);
 	}
 
-	rc = add_cache_entry_locked(refs, numrefs);
-
-out_unlock:
 	up_write(&htable_rw_lock);
+out:
 	free_dfs_info_array(refs, numrefs);
 	return rc;
 }
-- 
2.39.0



* [PATCH 2/5] cifs: avoid re-lookups in dfs_cache_find()
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Simply downgrade the write lock on cache updates in
cache_refresh_path() and avoid an unnecessary re-lookup in
dfs_cache_find().
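
For illustration only (not from the patch), below is a minimal
userspace sketch of the new contract: on success the lookup helper
returns with the shared lock still held, so the caller reads the entry
and drops the lock itself instead of looking it up a second time.
POSIX rwlocks stand in for the kernel rwsem and offer no downgrade
primitive, so the write-then-downgrade step is not shown; all names
are hypothetical:

#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;

struct entry {
        const char *path;
        const char *target;
};

static struct entry table[] = {
        { "\\\\dom\\dfs\\share", "\\\\srv1\\share" },
};

/* On success, returns the entry with table_lock held for reading;
 * on failure, returns NULL with the lock already dropped. */
static struct entry *lookup_locked(const char *path)
{
        size_t i;

        pthread_rwlock_rdlock(&table_lock);
        for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
                if (!strcmp(table[i].path, path))
                        return &table[i];       /* lock intentionally kept */
        }
        pthread_rwlock_unlock(&table_lock);
        return NULL;
}

int main(void)
{
        struct entry *e = lookup_locked("\\\\dom\\dfs\\share");

        if (!e)
                return ENOENT;
        printf("target: %s\n", e->target);      /* read under the shared lock */
        pthread_rwlock_unlock(&table_lock);     /* caller releases it */
        return 0;
}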

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 57 ++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 24 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index a8ddac1c054c..c82721b3277c 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -558,7 +558,8 @@ static void remove_oldest_entry_locked(void)
 }
 
 /* Add a new DFS cache entry */
-static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
+static struct cache_entry *add_cache_entry_locked(struct dfs_info3_param *refs,
+						  int numrefs)
 {
 	int rc;
 	struct cache_entry *ce;
@@ -573,11 +574,11 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	rc = cache_entry_hash(refs[0].path_name, strlen(refs[0].path_name), &hash);
 	if (rc)
-		return rc;
+		return ERR_PTR(rc);
 
 	ce = alloc_cache_entry(refs, numrefs);
 	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+		return ce;
 
 	spin_lock(&cache_ttl_lock);
 	if (!cache_ttl) {
@@ -594,7 +595,7 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	atomic_inc(&cache_count);
 
-	return 0;
+	return ce;
 }
 
 /* Check if two DFS paths are equal.  @s1 and @s2 are expected to be in @cache_cp's charset */
@@ -767,8 +768,12 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  *
  * For interlinks, cifs_mount() and expand_dfs_referral() are supposed to
  * handle them properly.
+ *
+ * On success, return entry with acquired lock for reading, otherwise error ptr.
  */
-static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
+static struct cache_entry *cache_refresh_path(const unsigned int xid,
+					      struct cifs_ses *ses,
+					      const char *path)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -780,10 +785,8 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
-		up_read(&htable_rw_lock);
-		return 0;
-	}
+	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+		return ce;
 
 	up_read(&htable_rw_lock);
 
@@ -792,8 +795,10 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (rc)
+	if (rc) {
+		ce = ERR_PTR(rc);
 		goto out;
+	}
 
 	dump_refs(refs, numrefs);
 
@@ -801,16 +806,24 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce))
+		if (cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
+			if (rc)
+				ce = ERR_PTR(rc);
+		}
 	} else {
-		rc = add_cache_entry_locked(refs, numrefs);
+		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-	up_write(&htable_rw_lock);
+	if (IS_ERR(ce)) {
+		up_write(&htable_rw_lock);
+		goto out;
+	}
+
+	downgrade_write(&htable_rw_lock);
 out:
 	free_dfs_info_array(refs, numrefs);
-	return rc;
+	return ce;
 }
 
 /*
@@ -930,15 +943,8 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
-		goto out_free_path;
-
-	down_read(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
+	ce = cache_refresh_path(xid, ses, npath);
 	if (IS_ERR(ce)) {
-		up_read(&htable_rw_lock);
 		rc = PTR_ERR(ce);
 		goto out_free_path;
 	}
@@ -1034,10 +1040,13 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
+	ce = cache_refresh_path(xid, ses, npath);
+	if (IS_ERR(ce)) {
+		rc = PTR_ERR(ce);
 		goto out_free_path;
+	}
 
+	up_read(&htable_rw_lock);
 	down_write(&htable_rw_lock);
 
 	ce = lookup_cache_entry(npath);
-- 
2.39.0



* [PATCH 3/5] cifs: don't take exclusive lock for updating target hints
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Avoid contention while updating dfs target hints.  It should be
perfectly fine to update them under shared locks.
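
For illustration only (not from the patch), a minimal userspace sketch
of the idea: the hint is a single pointer, so readers holding the
shared lock can tolerate it being swapped in place as long as both
sides mark the accesses.  C11 relaxed atomics stand in here for the
kernel's READ_ONCE()/WRITE_ONCE(), and all names are made up:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

struct target {
        const char *name;
};

static struct target t1 = { "\\\\srv1\\share" };
static struct target t2 = { "\\\\srv2\\share" };

static pthread_rwlock_t htable_lock = PTHREAD_RWLOCK_INITIALIZER;
static _Atomic(struct target *) tgthint = &t1;

static void *update_hint(void *arg)
{
        (void)arg;
        pthread_rwlock_rdlock(&htable_lock);    /* shared, not exclusive */
        atomic_store_explicit(&tgthint, &t2, memory_order_relaxed);
        pthread_rwlock_unlock(&htable_lock);
        return NULL;
}

int main(void)
{
        pthread_t t;
        struct target *hint;

        pthread_create(&t, NULL, update_hint, NULL);

        pthread_rwlock_rdlock(&htable_lock);
        hint = atomic_load_explicit(&tgthint, memory_order_relaxed);
        printf("hint: %s\n", hint->name);       /* either target is valid */
        pthread_rwlock_unlock(&htable_lock);

        pthread_join(t, NULL);
        return 0;
}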

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 47 +++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index c82721b3277c..49d1f390a6b8 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -269,7 +269,7 @@ static int dfscache_proc_show(struct seq_file *m, void *v)
 			list_for_each_entry(t, &ce->tlist, list) {
 				seq_printf(m, "  %s%s\n",
 					   t->name,
-					   ce->tgthint == t ? " (target hint)" : "");
+					   READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 			}
 		}
 	}
@@ -321,7 +321,7 @@ static inline void dump_tgts(const struct cache_entry *ce)
 	cifs_dbg(FYI, "target list:\n");
 	list_for_each_entry(t, &ce->tlist, list) {
 		cifs_dbg(FYI, "  %s%s\n", t->name,
-			 ce->tgthint == t ? " (target hint)" : "");
+			 READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 	}
 }
 
@@ -427,7 +427,7 @@ static int cache_entry_hash(const void *data, int size, unsigned int *hash)
 /* Return target hint of a DFS cache entry */
 static inline char *get_tgt_name(const struct cache_entry *ce)
 {
-	struct cache_dfs_tgt *t = ce->tgthint;
+	struct cache_dfs_tgt *t = READ_ONCE(ce->tgthint);
 
 	return t ? t->name : ERR_PTR(-ENOENT);
 }
@@ -470,6 +470,7 @@ static struct cache_dfs_tgt *alloc_target(const char *name, int path_consumed)
 static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 			 struct cache_entry *ce, const char *tgthint)
 {
+	struct cache_dfs_tgt *target;
 	int i;
 
 	ce->ttl = max_t(int, refs[0].ttl, CACHE_MIN_TTL);
@@ -496,8 +497,9 @@ static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 		ce->numtgts++;
 	}
 
-	ce->tgthint = list_first_entry_or_null(&ce->tlist,
-					       struct cache_dfs_tgt, list);
+	target = list_first_entry_or_null(&ce->tlist, struct cache_dfs_tgt,
+					  list);
+	WRITE_ONCE(ce->tgthint, target);
 
 	return 0;
 }
@@ -712,14 +714,15 @@ void dfs_cache_destroy(void)
 static int update_cache_entry_locked(struct cache_entry *ce, const struct dfs_info3_param *refs,
 				     int numrefs)
 {
+	struct cache_dfs_tgt *target;
+	char *th = NULL;
 	int rc;
-	char *s, *th = NULL;
 
 	WARN_ON(!rwsem_is_locked(&htable_rw_lock));
 
-	if (ce->tgthint) {
-		s = ce->tgthint->name;
-		th = kstrdup(s, GFP_ATOMIC);
+	target = READ_ONCE(ce->tgthint);
+	if (target) {
+		th = kstrdup(target->name, GFP_ATOMIC);
 		if (!th)
 			return -ENOMEM;
 	}
@@ -890,7 +893,7 @@ static int get_targets(struct cache_entry *ce, struct dfs_cache_tgt_list *tl)
 		}
 		it->it_path_consumed = t->path_consumed;
 
-		if (ce->tgthint == t)
+		if (READ_ONCE(ce->tgthint) == t)
 			list_add(&it->it_list, head);
 		else
 			list_add_tail(&it->it_list, head);
@@ -1046,23 +1049,14 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 		goto out_free_path;
 	}
 
-	up_read(&htable_rw_lock);
-	down_write(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
-	if (IS_ERR(ce)) {
-		rc = PTR_ERR(ce);
-		goto out_unlock;
-	}
-
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (likely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1070,7 +1064,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 out_free_path:
 	kfree(npath);
 	return rc;
@@ -1100,21 +1094,20 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 
 	cifs_dbg(FYI, "%s: path: %s\n", __func__, path);
 
-	if (!down_write_trylock(&htable_rw_lock))
-		return;
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
 	if (IS_ERR(ce))
 		goto out_unlock;
 
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (unlikely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1122,7 +1115,7 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 }
 
 /**
-- 
2.39.0



* [PATCH 4/5] cifs: remove duplicate code in __refresh_tcon()
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
                   ` (2 preceding siblings ...)
  2023-01-17  0:09 ` [PATCH 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

The logic for creating or updating a cache entry in __refresh_tcon()
could simply be done with cache_refresh_path(), so use it instead.
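
As a side effect, mark_for_reconnect_if_needed() now compares the
previously cached target list against the freshly fetched one.  For
illustration only, here is a simplified standalone sketch of that
comparison, using plain strings and strcasecmp() instead of
dfs_cache_tgt_iterator and target_share_equal(); the names are
hypothetical:

#include <stdbool.h>
#include <stdio.h>
#include <strings.h>

/* Returns true when no target from the old list survives in the new
 * one, i.e. the currently connected share is gone and the dfs share
 * should be marked for reconnect. */
static bool needs_reconnect(const char **old_tl, size_t nold,
                            const char **new_tl, size_t nnew)
{
        size_t i, j;

        for (i = 0; i < nold; i++)
                for (j = 0; j < nnew; j++)
                        if (!strcasecmp(old_tl[i], new_tl[j]))
                                return false;   /* still reachable, keep tcon */
        return true;
}

int main(void)
{
        const char *old_tl[] = { "\\\\srv1\\share", "\\\\srv2\\share" };
        const char *new_tl[] = { "\\\\srv3\\share" };

        if (needs_reconnect(old_tl, 2, new_tl, 1))
                puts("no cached or matched targets, mark dfs share for reconnect");
        return 0;
}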

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 69 +++++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 37 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 49d1f390a6b8..67890960c763 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -776,7 +776,8 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static struct cache_entry *cache_refresh_path(const unsigned int xid,
 					      struct cifs_ses *ses,
-					      const char *path)
+					      const char *path,
+					      bool force_refresh)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -788,13 +789,14 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
 		return ce;
 
 	up_read(&htable_rw_lock);
 
 	/*
-	 * Either the entry was not found, or it is expired.
+	 * Either the entry was not found, or it is expired, or it is a forced
+	 * refresh.
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
@@ -809,7 +811,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce)) {
+		if (force_refresh || cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
 			if (rc)
 				ce = ERR_PTR(rc);
@@ -946,7 +948,7 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1043,7 +1045,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1321,35 +1323,37 @@ static bool target_share_equal(struct TCP_Server_Info *server, const char *s1, c
  * Mark dfs tcon for reconnecting when the currently connected tcon does not match any of the new
  * target shares in @refs.
  */
-static void mark_for_reconnect_if_needed(struct cifs_tcon *tcon, struct dfs_cache_tgt_list *tl,
-					 const struct dfs_info3_param *refs, int numrefs)
+static void mark_for_reconnect_if_needed(struct TCP_Server_Info *server,
+					 struct dfs_cache_tgt_list *old_tl,
+					 struct dfs_cache_tgt_list *new_tl)
 {
-	struct dfs_cache_tgt_iterator *it;
-	int i;
+	struct dfs_cache_tgt_iterator *oit, *nit;
 
-	for (it = dfs_cache_get_tgt_iterator(tl); it; it = dfs_cache_get_next_tgt(tl, it)) {
-		for (i = 0; i < numrefs; i++) {
-			if (target_share_equal(tcon->ses->server, dfs_cache_get_tgt_name(it),
-					       refs[i].node_name))
+	for (oit = dfs_cache_get_tgt_iterator(old_tl); oit;
+	     oit = dfs_cache_get_next_tgt(old_tl, oit)) {
+		for (nit = dfs_cache_get_tgt_iterator(new_tl); nit;
+		     nit = dfs_cache_get_next_tgt(new_tl, nit)) {
+			if (target_share_equal(server,
+					       dfs_cache_get_tgt_name(oit),
+					       dfs_cache_get_tgt_name(nit)))
 				return;
 		}
 	}
 
 	cifs_dbg(FYI, "%s: no cached or matched targets. mark dfs share for reconnect.\n", __func__);
-	cifs_signal_cifsd_for_reconnect(tcon->ses->server, true);
+	cifs_signal_cifsd_for_reconnect(server, true);
 }
 
 /* Refresh dfs referral of tcon and mark it for reconnect if needed */
 static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_refresh)
 {
-	struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
+	struct dfs_cache_tgt_list old_tl = DFS_CACHE_TGT_LIST_INIT(old_tl);
+	struct dfs_cache_tgt_list new_tl = DFS_CACHE_TGT_LIST_INIT(new_tl);
 	struct cifs_ses *ses = CIFS_DFS_ROOT_SES(tcon->ses);
 	struct cifs_tcon *ipc = ses->tcon_ipc;
-	struct dfs_info3_param *refs = NULL;
 	bool needs_refresh = false;
 	struct cache_entry *ce;
 	unsigned int xid;
-	int numrefs = 0;
 	int rc = 0;
 
 	xid = get_xid();
@@ -1358,9 +1362,8 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	ce = lookup_cache_entry(path);
 	needs_refresh = force_refresh || IS_ERR(ce) || cache_entry_expired(ce);
 	if (!IS_ERR(ce)) {
-		rc = get_targets(ce, &tl);
-		if (rc)
-			cifs_dbg(FYI, "%s: could not get dfs targets: %d\n", __func__, rc);
+		rc = get_targets(ce, &old_tl);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
 	}
 	up_read(&htable_rw_lock);
 
@@ -1377,26 +1380,18 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	}
 	spin_unlock(&ipc->tc_lock);
 
-	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (!rc) {
-		/* Create or update a cache entry with the new referral */
-		dump_refs(refs, numrefs);
-
-		down_write(&htable_rw_lock);
-		ce = lookup_cache_entry(path);
-		if (IS_ERR(ce))
-			add_cache_entry_locked(refs, numrefs);
-		else if (force_refresh || cache_entry_expired(ce))
-			update_cache_entry_locked(ce, refs, numrefs);
-		up_write(&htable_rw_lock);
-
-		mark_for_reconnect_if_needed(tcon, &tl, refs, numrefs);
+	ce = cache_refresh_path(xid, ses, path, true);
+	if (!IS_ERR(ce)) {
+		rc = get_targets(ce, &new_tl);
+		up_read(&htable_rw_lock);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
+		mark_for_reconnect_if_needed(tcon->ses->server, &old_tl, &new_tl);
 	}
 
 out:
 	free_xid(xid);
-	dfs_cache_free_tgts(&tl);
-	free_dfs_info_array(refs, numrefs);
+	dfs_cache_free_tgts(&old_tl);
+	dfs_cache_free_tgts(&new_tl);
 	return rc;
 }
 
-- 
2.39.0



* [PATCH 5/5] cifs: handle cache lookup errors different than -ENOENT
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
                   ` (3 preceding siblings ...)
  2023-01-17  0:09 ` [PATCH 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

lookup_cache_entry() might return an error other than -ENOENT
(e.g. from ->char2uni), so handle such errors as well in
cache_refresh_path().
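
For illustration only, a standalone sketch of the error handling this
adds: -ENOENT means "not cached yet, so request a referral and add an
entry", while any other lookup error is returned to the caller.  The
ERR_PTR()/IS_ERR()/PTR_ERR() macros below are simplified userspace
stand-ins for the kernel helpers in <linux/err.h>, and the lookup
itself is faked:

#include <errno.h>
#include <stdio.h>

#define MAX_ERRNO     4095
#define ERR_PTR(err)  ((void *)(long)(err))
#define PTR_ERR(ptr)  ((long)(ptr))
#define IS_ERR(ptr)   ((unsigned long)(ptr) >= (unsigned long)-MAX_ERRNO)

struct cache_entry { const char *path; };

static struct cache_entry *fake_lookup(const char *path)
{
        if (!path)
                return ERR_PTR(-EINVAL);        /* e.g. a charset conversion error */
        return ERR_PTR(-ENOENT);                /* pretend nothing is cached yet */
}

int main(void)
{
        struct cache_entry *ce = fake_lookup("\\\\dom\\dfs\\share");

        if (!IS_ERR(ce))
                puts("cache hit");
        else if (PTR_ERR(ce) == -ENOENT)
                puts("not cached: request a referral and add an entry");
        else
                printf("lookup failed: %ld\n", PTR_ERR(ce));
        return 0;
}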

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 67890960c763..f426d1473bea 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -644,7 +644,9 @@ static struct cache_entry *__lookup_cache_entry(const char *path, unsigned int h
  *
  * Use whole path components in the match.  Must be called with htable_rw_lock held.
  *
+ * Return cached entry if successful.
  * Return ERR_PTR(-ENOENT) if the entry is not found.
+ * Return error ptr otherwise.
  */
 static struct cache_entry *lookup_cache_entry(const char *path)
 {
@@ -789,8 +791,13 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
+	if (!IS_ERR(ce)) {
+		if (!force_refresh && !cache_entry_expired(ce))
+			return ce;
+	} else if (PTR_ERR(ce) != -ENOENT) {
+		up_read(&htable_rw_lock);
 		return ce;
+	}
 
 	up_read(&htable_rw_lock);
 
@@ -816,7 +823,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 			if (rc)
 				ce = ERR_PTR(rc);
 		}
-	} else {
+	} else if (PTR_ERR(ce) == -ENOENT) {
 		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-- 
2.39.0



* Re: [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
@ 2023-01-17 17:07   ` Aurélien Aptel
  2023-01-17 18:03     ` Paulo Alcantara
  0 siblings, 1 reply; 15+ messages in thread
From: Aurélien Aptel @ 2023-01-17 17:07 UTC (permalink / raw)
  To: Paulo Alcantara; +Cc: smfrench, linux-cifs

On Tue, Jan 17, 2023 at 1:35 AM Paulo Alcantara <pc@cjr.nz> wrote:
> -       down_write(&htable_rw_lock);
> +       down_read(&htable_rw_lock);
>
>         ce = lookup_cache_entry(path);
> -       if (!IS_ERR(ce)) {
> -               if (!cache_entry_expired(ce)) {
> -                       dump_ce(ce);
> -                       up_write(&htable_rw_lock);
> -                       return 0;
> -               }
> -       } else {
> -               newent = true;
> +       if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
> +               up_read(&htable_rw_lock);
> +               return 0;
>         }
>
> +       up_read(&htable_rw_lock);

Please add a comment before the up_read() to explain why you do it
there and where the deadlock is.

Otherwise looks good


* Re: [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17 17:07   ` Aurélien Aptel
@ 2023-01-17 18:03     ` Paulo Alcantara
  0 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 18:03 UTC (permalink / raw)
  To: Aurélien Aptel; +Cc: smfrench, linux-cifs

Aurélien Aptel <aurelien.aptel@gmail.com> writes:

> On Tue, Jan 17, 2023 at 1:35 AM Paulo Alcantara <pc@cjr.nz> wrote:
>> -       down_write(&htable_rw_lock);
>> +       down_read(&htable_rw_lock);
>>
>>         ce = lookup_cache_entry(path);
>> -       if (!IS_ERR(ce)) {
>> -               if (!cache_entry_expired(ce)) {
>> -                       dump_ce(ce);
>> -                       up_write(&htable_rw_lock);
>> -                       return 0;
>> -               }
>> -       } else {
>> -               newent = true;
>> +       if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
>> +               up_read(&htable_rw_lock);
>> +               return 0;
>>         }
>>
>> +       up_read(&htable_rw_lock);
>
> Please add a comment before the up_read() to say why you do it here
> and where is the dead lock.

Ok, thanks.  Will send v2 with your suggestions.


* [PATCH v2 0/5] dfs fixes
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
                   ` (4 preceding siblings ...)
  2023-01-17  0:09 ` [PATCH 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
@ 2023-01-17 22:00 ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
                     ` (5 more replies)
  5 siblings, 6 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Hi Steve,

The most important one is 1/5, which should fix those random hangs
we've observed while running dfs tests on buildbot.

I have run 50 dfs tests twice against Windows 2022 and samba 4.16 with
these mount options:

	vers=3.1.1,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,sign}
	vers=3.0,echo_interval=10,{,seal}
	vers=2.1,echo_interval=10,{,hard}
	vers=1.0,echo_interval=10,{,hard}

The only tests which failed (2%) were with SMB1 UNIX extensions
against samba.  readdir(2) was getting STATUS_INVALID_LEVEL from
QUERY_PATH_INFO after failover for some reason -- I'll look into that
when time allows.  Those failures aren't related to this series,
though.

I also did some quick tests with kerberos.

---
v1 -> v2: add comments in patch 1/5 as suggested by Aurelien

Paulo Alcantara (5):
  cifs: fix potential deadlock in cache_refresh_path()
  cifs: avoid re-lookups in dfs_cache_find()
  cifs: don't take exclusive lock for updating target hints
  cifs: remove duplicate code in __refresh_tcon()
  cifs: handle cache lookup errors different than -ENOENT

 fs/cifs/dfs_cache.c | 191 +++++++++++++++++++++++---------------------
 1 file changed, 100 insertions(+), 91 deletions(-)

-- 
2.39.0



* [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Avoid getting a DFS referral while holding an exclusive lock in
cache_refresh_path(), because the IPC tcon used for getting the
referral could be disconnected and thus cause a deadlock, as shown
below:

task A                       task B
======                       ======
cifs_demultiplex_thread()    dfs_cache_find()
 cifs_handle_standard()       cache_refresh_path()
  reconnect_dfs_server()       down_write()
   dfs_cache_noreq_find()       get_dfs_referral()
    down_read() <- deadlock      smb2_get_dfs_refer()
                                  SMB2_ioctl()
				   cifs_send_recv()
				    compound_send_recv()
				     wait_for_response()

where task A cannot wake up task B because task A is itself blocked on
down_read() by the exclusive lock held in cache_refresh_path(), so
neither task can make progress.

Fixes: c9f711039905 ("cifs: keep referral server sessions alive")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index e20f8880363f..a6d7ae5f49a4 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -770,26 +770,27 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
 {
-	int rc;
-	struct cache_entry *ce;
 	struct dfs_info3_param *refs = NULL;
+	struct cache_entry *ce;
 	int numrefs = 0;
-	bool newent = false;
+	int rc;
 
 	cifs_dbg(FYI, "%s: search path: %s\n", __func__, path);
 
-	down_write(&htable_rw_lock);
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce)) {
-		if (!cache_entry_expired(ce)) {
-			dump_ce(ce);
-			up_write(&htable_rw_lock);
-			return 0;
-		}
-	} else {
-		newent = true;
+	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
+		up_read(&htable_rw_lock);
+		return 0;
 	}
+	/*
+	 * Unlock shared access as we don't want to hold any locks while getting
+	 * a new referral.  The @ses used for performing the I/O could be
+	 * reconnecting and it acquires @htable_rw_lock to look up the dfs cache
+	 * in order to failover -- if necessary.
+	 */
+	up_read(&htable_rw_lock);
 
 	/*
 	 * Either the entry was not found, or it is expired.
@@ -797,19 +798,22 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
 	if (rc)
-		goto out_unlock;
+		goto out;
 
 	dump_refs(refs, numrefs);
 
-	if (!newent) {
-		rc = update_cache_entry_locked(ce, refs, numrefs);
-		goto out_unlock;
+	down_write(&htable_rw_lock);
+	/* Re-check as another task might have it added or refreshed already */
+	ce = lookup_cache_entry(path);
+	if (!IS_ERR(ce)) {
+		if (cache_entry_expired(ce))
+			rc = update_cache_entry_locked(ce, refs, numrefs);
+	} else {
+		rc = add_cache_entry_locked(refs, numrefs);
 	}
 
-	rc = add_cache_entry_locked(refs, numrefs);
-
-out_unlock:
 	up_write(&htable_rw_lock);
+out:
 	free_dfs_info_array(refs, numrefs);
 	return rc;
 }
-- 
2.39.0



* [PATCH v2 2/5] cifs: avoid re-lookups in dfs_cache_find()
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Simply downgrade the write lock on cache updates in
cache_refresh_path() and avoid an unnecessary re-lookup in
dfs_cache_find().

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 58 ++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 24 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index a6d7ae5f49a4..755a00c4cba1 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -558,7 +558,8 @@ static void remove_oldest_entry_locked(void)
 }
 
 /* Add a new DFS cache entry */
-static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
+static struct cache_entry *add_cache_entry_locked(struct dfs_info3_param *refs,
+						  int numrefs)
 {
 	int rc;
 	struct cache_entry *ce;
@@ -573,11 +574,11 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	rc = cache_entry_hash(refs[0].path_name, strlen(refs[0].path_name), &hash);
 	if (rc)
-		return rc;
+		return ERR_PTR(rc);
 
 	ce = alloc_cache_entry(refs, numrefs);
 	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+		return ce;
 
 	spin_lock(&cache_ttl_lock);
 	if (!cache_ttl) {
@@ -594,7 +595,7 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	atomic_inc(&cache_count);
 
-	return 0;
+	return ce;
 }
 
 /* Check if two DFS paths are equal.  @s1 and @s2 are expected to be in @cache_cp's charset */
@@ -767,8 +768,12 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  *
  * For interlinks, cifs_mount() and expand_dfs_referral() are supposed to
  * handle them properly.
+ *
+ * On success, return entry with acquired lock for reading, otherwise error ptr.
  */
-static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
+static struct cache_entry *cache_refresh_path(const unsigned int xid,
+					      struct cifs_ses *ses,
+					      const char *path)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -780,10 +785,9 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
-		up_read(&htable_rw_lock);
-		return 0;
-	}
+	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+		return ce;
+
 	/*
 	 * Unlock shared access as we don't want to hold any locks while getting
 	 * a new referral.  The @ses used for performing the I/O could be
@@ -797,8 +801,10 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (rc)
+	if (rc) {
+		ce = ERR_PTR(rc);
 		goto out;
+	}
 
 	dump_refs(refs, numrefs);
 
@@ -806,16 +812,24 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce))
+		if (cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
+			if (rc)
+				ce = ERR_PTR(rc);
+		}
 	} else {
-		rc = add_cache_entry_locked(refs, numrefs);
+		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-	up_write(&htable_rw_lock);
+	if (IS_ERR(ce)) {
+		up_write(&htable_rw_lock);
+		goto out;
+	}
+
+	downgrade_write(&htable_rw_lock);
 out:
 	free_dfs_info_array(refs, numrefs);
-	return rc;
+	return ce;
 }
 
 /*
@@ -935,15 +949,8 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
-		goto out_free_path;
-
-	down_read(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
+	ce = cache_refresh_path(xid, ses, npath);
 	if (IS_ERR(ce)) {
-		up_read(&htable_rw_lock);
 		rc = PTR_ERR(ce);
 		goto out_free_path;
 	}
@@ -1039,10 +1046,13 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
+	ce = cache_refresh_path(xid, ses, npath);
+	if (IS_ERR(ce)) {
+		rc = PTR_ERR(ce);
 		goto out_free_path;
+	}
 
+	up_read(&htable_rw_lock);
 	down_write(&htable_rw_lock);
 
 	ce = lookup_cache_entry(npath);
-- 
2.39.0



* [PATCH v2 3/5] cifs: don't take exclusive lock for updating target hints
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Avoid contention while updating dfs target hints.  It should be
perfectly fine to update them under shared locks.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 47 +++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 755a00c4cba1..19847f9114ba 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -269,7 +269,7 @@ static int dfscache_proc_show(struct seq_file *m, void *v)
 			list_for_each_entry(t, &ce->tlist, list) {
 				seq_printf(m, "  %s%s\n",
 					   t->name,
-					   ce->tgthint == t ? " (target hint)" : "");
+					   READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 			}
 		}
 	}
@@ -321,7 +321,7 @@ static inline void dump_tgts(const struct cache_entry *ce)
 	cifs_dbg(FYI, "target list:\n");
 	list_for_each_entry(t, &ce->tlist, list) {
 		cifs_dbg(FYI, "  %s%s\n", t->name,
-			 ce->tgthint == t ? " (target hint)" : "");
+			 READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 	}
 }
 
@@ -427,7 +427,7 @@ static int cache_entry_hash(const void *data, int size, unsigned int *hash)
 /* Return target hint of a DFS cache entry */
 static inline char *get_tgt_name(const struct cache_entry *ce)
 {
-	struct cache_dfs_tgt *t = ce->tgthint;
+	struct cache_dfs_tgt *t = READ_ONCE(ce->tgthint);
 
 	return t ? t->name : ERR_PTR(-ENOENT);
 }
@@ -470,6 +470,7 @@ static struct cache_dfs_tgt *alloc_target(const char *name, int path_consumed)
 static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 			 struct cache_entry *ce, const char *tgthint)
 {
+	struct cache_dfs_tgt *target;
 	int i;
 
 	ce->ttl = max_t(int, refs[0].ttl, CACHE_MIN_TTL);
@@ -496,8 +497,9 @@ static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 		ce->numtgts++;
 	}
 
-	ce->tgthint = list_first_entry_or_null(&ce->tlist,
-					       struct cache_dfs_tgt, list);
+	target = list_first_entry_or_null(&ce->tlist, struct cache_dfs_tgt,
+					  list);
+	WRITE_ONCE(ce->tgthint, target);
 
 	return 0;
 }
@@ -712,14 +714,15 @@ void dfs_cache_destroy(void)
 static int update_cache_entry_locked(struct cache_entry *ce, const struct dfs_info3_param *refs,
 				     int numrefs)
 {
+	struct cache_dfs_tgt *target;
+	char *th = NULL;
 	int rc;
-	char *s, *th = NULL;
 
 	WARN_ON(!rwsem_is_locked(&htable_rw_lock));
 
-	if (ce->tgthint) {
-		s = ce->tgthint->name;
-		th = kstrdup(s, GFP_ATOMIC);
+	target = READ_ONCE(ce->tgthint);
+	if (target) {
+		th = kstrdup(target->name, GFP_ATOMIC);
 		if (!th)
 			return -ENOMEM;
 	}
@@ -896,7 +899,7 @@ static int get_targets(struct cache_entry *ce, struct dfs_cache_tgt_list *tl)
 		}
 		it->it_path_consumed = t->path_consumed;
 
-		if (ce->tgthint == t)
+		if (READ_ONCE(ce->tgthint) == t)
 			list_add(&it->it_list, head);
 		else
 			list_add_tail(&it->it_list, head);
@@ -1052,23 +1055,14 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 		goto out_free_path;
 	}
 
-	up_read(&htable_rw_lock);
-	down_write(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
-	if (IS_ERR(ce)) {
-		rc = PTR_ERR(ce);
-		goto out_unlock;
-	}
-
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (likely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1076,7 +1070,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 out_free_path:
 	kfree(npath);
 	return rc;
@@ -1106,21 +1100,20 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 
 	cifs_dbg(FYI, "%s: path: %s\n", __func__, path);
 
-	if (!down_write_trylock(&htable_rw_lock))
-		return;
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
 	if (IS_ERR(ce))
 		goto out_unlock;
 
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (unlikely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1128,7 +1121,7 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 }
 
 /**
-- 
2.39.0



* [PATCH v2 4/5] cifs: remove duplicate code in __refresh_tcon()
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
                     ` (2 preceding siblings ...)
  2023-01-17 22:00   ` [PATCH v2 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
  2023-01-18  1:33   ` [PATCH v2 0/5] dfs fixes Steve French
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

The logic for creating or updating a cache entry in __refresh_tcon()
could simply be done with cache_refresh_path(), so use it instead.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 69 +++++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 37 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 19847f9114ba..58d11be9d020 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -776,7 +776,8 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static struct cache_entry *cache_refresh_path(const unsigned int xid,
 					      struct cifs_ses *ses,
-					      const char *path)
+					      const char *path,
+					      bool force_refresh)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -788,7 +789,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
 		return ce;
 
 	/*
@@ -800,7 +801,8 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	up_read(&htable_rw_lock);
 
 	/*
-	 * Either the entry was not found, or it is expired.
+	 * Either the entry was not found, or it is expired, or it is a forced
+	 * refresh.
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
@@ -815,7 +817,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce)) {
+		if (force_refresh || cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
 			if (rc)
 				ce = ERR_PTR(rc);
@@ -952,7 +954,7 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1049,7 +1051,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1327,35 +1329,37 @@ static bool target_share_equal(struct TCP_Server_Info *server, const char *s1, c
  * Mark dfs tcon for reconnecting when the currently connected tcon does not match any of the new
  * target shares in @refs.
  */
-static void mark_for_reconnect_if_needed(struct cifs_tcon *tcon, struct dfs_cache_tgt_list *tl,
-					 const struct dfs_info3_param *refs, int numrefs)
+static void mark_for_reconnect_if_needed(struct TCP_Server_Info *server,
+					 struct dfs_cache_tgt_list *old_tl,
+					 struct dfs_cache_tgt_list *new_tl)
 {
-	struct dfs_cache_tgt_iterator *it;
-	int i;
+	struct dfs_cache_tgt_iterator *oit, *nit;
 
-	for (it = dfs_cache_get_tgt_iterator(tl); it; it = dfs_cache_get_next_tgt(tl, it)) {
-		for (i = 0; i < numrefs; i++) {
-			if (target_share_equal(tcon->ses->server, dfs_cache_get_tgt_name(it),
-					       refs[i].node_name))
+	for (oit = dfs_cache_get_tgt_iterator(old_tl); oit;
+	     oit = dfs_cache_get_next_tgt(old_tl, oit)) {
+		for (nit = dfs_cache_get_tgt_iterator(new_tl); nit;
+		     nit = dfs_cache_get_next_tgt(new_tl, nit)) {
+			if (target_share_equal(server,
+					       dfs_cache_get_tgt_name(oit),
+					       dfs_cache_get_tgt_name(nit)))
 				return;
 		}
 	}
 
 	cifs_dbg(FYI, "%s: no cached or matched targets. mark dfs share for reconnect.\n", __func__);
-	cifs_signal_cifsd_for_reconnect(tcon->ses->server, true);
+	cifs_signal_cifsd_for_reconnect(server, true);
 }
 
 /* Refresh dfs referral of tcon and mark it for reconnect if needed */
 static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_refresh)
 {
-	struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
+	struct dfs_cache_tgt_list old_tl = DFS_CACHE_TGT_LIST_INIT(old_tl);
+	struct dfs_cache_tgt_list new_tl = DFS_CACHE_TGT_LIST_INIT(new_tl);
 	struct cifs_ses *ses = CIFS_DFS_ROOT_SES(tcon->ses);
 	struct cifs_tcon *ipc = ses->tcon_ipc;
-	struct dfs_info3_param *refs = NULL;
 	bool needs_refresh = false;
 	struct cache_entry *ce;
 	unsigned int xid;
-	int numrefs = 0;
 	int rc = 0;
 
 	xid = get_xid();
@@ -1364,9 +1368,8 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	ce = lookup_cache_entry(path);
 	needs_refresh = force_refresh || IS_ERR(ce) || cache_entry_expired(ce);
 	if (!IS_ERR(ce)) {
-		rc = get_targets(ce, &tl);
-		if (rc)
-			cifs_dbg(FYI, "%s: could not get dfs targets: %d\n", __func__, rc);
+		rc = get_targets(ce, &old_tl);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
 	}
 	up_read(&htable_rw_lock);
 
@@ -1383,26 +1386,18 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	}
 	spin_unlock(&ipc->tc_lock);
 
-	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (!rc) {
-		/* Create or update a cache entry with the new referral */
-		dump_refs(refs, numrefs);
-
-		down_write(&htable_rw_lock);
-		ce = lookup_cache_entry(path);
-		if (IS_ERR(ce))
-			add_cache_entry_locked(refs, numrefs);
-		else if (force_refresh || cache_entry_expired(ce))
-			update_cache_entry_locked(ce, refs, numrefs);
-		up_write(&htable_rw_lock);
-
-		mark_for_reconnect_if_needed(tcon, &tl, refs, numrefs);
+	ce = cache_refresh_path(xid, ses, path, true);
+	if (!IS_ERR(ce)) {
+		rc = get_targets(ce, &new_tl);
+		up_read(&htable_rw_lock);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
+		mark_for_reconnect_if_needed(tcon->ses->server, &old_tl, &new_tl);
 	}
 
 out:
 	free_xid(xid);
-	dfs_cache_free_tgts(&tl);
-	free_dfs_info_array(refs, numrefs);
+	dfs_cache_free_tgts(&old_tl);
+	dfs_cache_free_tgts(&new_tl);
 	return rc;
 }
 
-- 
2.39.0



* [PATCH v2 5/5] cifs: handle cache lookup errors different than -ENOENT
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
                     ` (3 preceding siblings ...)
  2023-01-17 22:00   ` [PATCH v2 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-18  1:33   ` [PATCH v2 0/5] dfs fixes Steve French
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

lookup_cache_entry() might return an error other than -ENOENT
(e.g. from ->char2uni), so handle such errors as well in
cache_refresh_path().

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 58d11be9d020..308101d90006 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -644,7 +644,9 @@ static struct cache_entry *__lookup_cache_entry(const char *path, unsigned int h
  *
  * Use whole path components in the match.  Must be called with htable_rw_lock held.
  *
+ * Return cached entry if successful.
  * Return ERR_PTR(-ENOENT) if the entry is not found.
+ * Return error ptr otherwise.
  */
 static struct cache_entry *lookup_cache_entry(const char *path)
 {
@@ -789,8 +791,13 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
+	if (!IS_ERR(ce)) {
+		if (!force_refresh && !cache_entry_expired(ce))
+			return ce;
+	} else if (PTR_ERR(ce) != -ENOENT) {
+		up_read(&htable_rw_lock);
 		return ce;
+	}
 
 	/*
 	 * Unlock shared access as we don't want to hold any locks while getting
@@ -822,7 +829,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 			if (rc)
 				ce = ERR_PTR(rc);
 		}
-	} else {
+	} else if (PTR_ERR(ce) == -ENOENT) {
 		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-- 
2.39.0



* Re: [PATCH v2 0/5] dfs fixes
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
                     ` (4 preceding siblings ...)
  2023-01-17 22:00   ` [PATCH v2 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
@ 2023-01-18  1:33   ` Steve French
  5 siblings, 0 replies; 15+ messages in thread
From: Steve French @ 2023-01-18  1:33 UTC (permalink / raw)
  To: Paulo Alcantara; +Cc: linux-cifs, aurelien.aptel

tentatively merged into cifs-2.6.git for-next pending more review/testing

On Tue, Jan 17, 2023 at 4:00 PM Paulo Alcantara <pc@cjr.nz> wrote:
>
> Hi Steve,
>
> The most important one is 1/5, which should fix those random hangs
> we've observed while running dfs tests on buildbot.
>
> I have run 50 dfs tests twice against Windows 2022 and samba 4.16 with
> these mount options:
>
>         vers=3.1.1,echo_interval=10,{,hard}
>         vers=3.0,echo_interval=10,{,hard}
>         vers=3.0,echo_interval=10,{,sign}
>         vers=3.0,echo_interval=10,{,seal}
>         vers=2.1,echo_interval=10,{,hard}
>         vers=1.0,echo_interval=10,{,hard}
>
> The only tests which failed (2%) were with SMB1 UNIX extensions
> against samba.  readdir(2) was getting STATUS_INVALID_LEVEL from
> QUERY_PATH_INFO after failover for some reason -- I'll look into that
> when time allows.  Those failures aren't related to this series,
> though.
>
> I also did some quick tests with kerberos.
>
> ---
> v1 -> v2: add comments in patch 1/5 as suggested by Aurelien
>
> Paulo Alcantara (5):
>   cifs: fix potential deadlock in cache_refresh_path()
>   cifs: avoid re-lookups in dfs_cache_find()
>   cifs: don't take exclusive lock for updating target hints
>   cifs: remove duplicate code in __refresh_tcon()
>   cifs: handle cache lookup errors different than -ENOENT
>
>  fs/cifs/dfs_cache.c | 191 +++++++++++++++++++++++---------------------
>  1 file changed, 100 insertions(+), 91 deletions(-)
>
> --
> 2.39.0
>


-- 
Thanks,

Steve

