* [PATCH 0/5] dfs fixes
@ 2023-01-17  0:09 Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Hi Steve,

The most important fix is 1/5, which should fix the random hangs that
we've observed while running dfs tests on buildbot.

I have run 50 dfs tests twice against Windows 2022 and samba 4.16 with
these mount options:

	vers=3.1.1,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,sign}
	vers=3.0,echo_interval=10,{,seal}
	vers=2.1,echo_interval=10,{,hard}
	vers=1.0,echo_interval=10,{,hard}

The only tests which failed (2%) were with SMB1 UNIX extensions
against samba.  readdir(2) was getting STATUS_INVALID_LEVEL from
QUERY_PATH_INFO after failover for some reason -- I'll look into that
when time allows.  Those failures aren't related to this series,
though.

I also did some quick tests with kerberos.

Paulo Alcantara (5):
  cifs: fix potential deadlock in cache_refresh_path()
  cifs: avoid re-lookups in dfs_cache_find()
  cifs: don't take exclusive lock for updating target hints
  cifs: remove duplicate code in __refresh_tcon()
  cifs: handle cache lookup errors different than -ENOENT

 fs/cifs/dfs_cache.c | 185 ++++++++++++++++++++++----------------------
 1 file changed, 94 insertions(+), 91 deletions(-)

-- 
2.39.0



* [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17 17:07   ` Aurélien Aptel
  2023-01-17  0:09 ` [PATCH 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Avoid getting a DFS referral while holding an exclusive lock in
cache_refresh_path(), because the tcon IPC used for getting the
referral could be disconnected, thus causing a deadlock as shown
below:

task A
------
cifs_demultiplex_thread()
 cifs_handle_standard()
  reconnect_dfs_server()
   dfs_cache_noreq_find()
    down_read()

task B
------
dfs_cache_find()
 cache_refresh_path()
  down_write()
   get_dfs_referral()
    smb2_get_dfs_refer()
     SMB2_ioctl()
      cifs_send_recv()
       compound_send_recv()
        wait_for_response()

where task A cannot wake up task B because task A is blocked in
down_read() by the exclusive lock that task B holds in
cache_refresh_path().

Fixes: c9f711039905 ("cifs: keep referral server sessions alive")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index e20f8880363f..a8ddac1c054c 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -770,46 +770,45 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
 {
-	int rc;
-	struct cache_entry *ce;
 	struct dfs_info3_param *refs = NULL;
+	struct cache_entry *ce;
 	int numrefs = 0;
-	bool newent = false;
+	int rc;
 
 	cifs_dbg(FYI, "%s: search path: %s\n", __func__, path);
 
-	down_write(&htable_rw_lock);
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce)) {
-		if (!cache_entry_expired(ce)) {
-			dump_ce(ce);
-			up_write(&htable_rw_lock);
-			return 0;
-		}
-	} else {
-		newent = true;
+	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
+		up_read(&htable_rw_lock);
+		return 0;
 	}
 
+	up_read(&htable_rw_lock);
+
 	/*
 	 * Either the entry was not found, or it is expired.
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
 	if (rc)
-		goto out_unlock;
+		goto out;
 
 	dump_refs(refs, numrefs);
 
-	if (!newent) {
-		rc = update_cache_entry_locked(ce, refs, numrefs);
-		goto out_unlock;
+	down_write(&htable_rw_lock);
+	/* Re-check as another task might have it added or refreshed already */
+	ce = lookup_cache_entry(path);
+	if (!IS_ERR(ce)) {
+		if (cache_entry_expired(ce))
+			rc = update_cache_entry_locked(ce, refs, numrefs);
+	} else {
+		rc = add_cache_entry_locked(refs, numrefs);
 	}
 
-	rc = add_cache_entry_locked(refs, numrefs);
-
-out_unlock:
 	up_write(&htable_rw_lock);
+out:
 	free_dfs_info_array(refs, numrefs);
 	return rc;
 }
-- 
2.39.0



* [PATCH 2/5] cifs: avoid re-lookups in dfs_cache_find()
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Simply downgrade the write lock after cache updates in
cache_refresh_path() and avoid an unnecessary re-lookup in
dfs_cache_find().

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 57 ++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 24 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index a8ddac1c054c..c82721b3277c 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -558,7 +558,8 @@ static void remove_oldest_entry_locked(void)
 }
 
 /* Add a new DFS cache entry */
-static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
+static struct cache_entry *add_cache_entry_locked(struct dfs_info3_param *refs,
+						  int numrefs)
 {
 	int rc;
 	struct cache_entry *ce;
@@ -573,11 +574,11 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	rc = cache_entry_hash(refs[0].path_name, strlen(refs[0].path_name), &hash);
 	if (rc)
-		return rc;
+		return ERR_PTR(rc);
 
 	ce = alloc_cache_entry(refs, numrefs);
 	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+		return ce;
 
 	spin_lock(&cache_ttl_lock);
 	if (!cache_ttl) {
@@ -594,7 +595,7 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	atomic_inc(&cache_count);
 
-	return 0;
+	return ce;
 }
 
 /* Check if two DFS paths are equal.  @s1 and @s2 are expected to be in @cache_cp's charset */
@@ -767,8 +768,12 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  *
  * For interlinks, cifs_mount() and expand_dfs_referral() are supposed to
  * handle them properly.
+ *
+ * On success, return entry with acquired lock for reading, otherwise error ptr.
  */
-static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
+static struct cache_entry *cache_refresh_path(const unsigned int xid,
+					      struct cifs_ses *ses,
+					      const char *path)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -780,10 +785,8 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
-		up_read(&htable_rw_lock);
-		return 0;
-	}
+	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+		return ce;
 
 	up_read(&htable_rw_lock);
 
@@ -792,8 +795,10 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (rc)
+	if (rc) {
+		ce = ERR_PTR(rc);
 		goto out;
+	}
 
 	dump_refs(refs, numrefs);
 
@@ -801,16 +806,24 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce))
+		if (cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
+			if (rc)
+				ce = ERR_PTR(rc);
+		}
 	} else {
-		rc = add_cache_entry_locked(refs, numrefs);
+		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-	up_write(&htable_rw_lock);
+	if (IS_ERR(ce)) {
+		up_write(&htable_rw_lock);
+		goto out;
+	}
+
+	downgrade_write(&htable_rw_lock);
 out:
 	free_dfs_info_array(refs, numrefs);
-	return rc;
+	return ce;
 }
 
 /*
@@ -930,15 +943,8 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
-		goto out_free_path;
-
-	down_read(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
+	ce = cache_refresh_path(xid, ses, npath);
 	if (IS_ERR(ce)) {
-		up_read(&htable_rw_lock);
 		rc = PTR_ERR(ce);
 		goto out_free_path;
 	}
@@ -1034,10 +1040,13 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
+	ce = cache_refresh_path(xid, ses, npath);
+	if (IS_ERR(ce)) {
+		rc = PTR_ERR(ce);
 		goto out_free_path;
+	}
 
+	up_read(&htable_rw_lock);
 	down_write(&htable_rw_lock);
 
 	ce = lookup_cache_entry(npath);
-- 
2.39.0
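
Patch 2/5 makes cache_refresh_path() and add_cache_entry_locked() return
either a cache entry or an error encoded in the pointer itself. A
minimal userspace sketch of that return convention follows. The three
helpers are simplified stand-ins modeled on the kernel's ERR_PTR(),
PTR_ERR() and IS_ERR(); struct entry, table and lookup() are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified userspace stand-ins for the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical cache entry and table, for illustration only. */
struct entry {
	const char *path;
};

static struct entry table[] = { { "/srv/share" } };

/* Return the matching entry, or an error pointer on failure. */
static struct entry *lookup(const char *path)
{
	size_t i;

	if (!path)
		return ERR_PTR(-EINVAL);

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!strcmp(table[i].path, path))
			return &table[i];
	}
	return ERR_PTR(-ENOENT);
}
```

A caller tests the result with IS_ERR() and recovers the errno with
PTR_ERR(), which is the shape dfs_cache_find() takes after this patch.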



* [PATCH 3/5] cifs: don't take exclusive lock for updating target hints
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

Avoid contention while updating dfs target hints.  It should be
perfectly fine to update them under shared locks.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 47 +++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index c82721b3277c..49d1f390a6b8 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -269,7 +269,7 @@ static int dfscache_proc_show(struct seq_file *m, void *v)
 			list_for_each_entry(t, &ce->tlist, list) {
 				seq_printf(m, "  %s%s\n",
 					   t->name,
-					   ce->tgthint == t ? " (target hint)" : "");
+					   READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 			}
 		}
 	}
@@ -321,7 +321,7 @@ static inline void dump_tgts(const struct cache_entry *ce)
 	cifs_dbg(FYI, "target list:\n");
 	list_for_each_entry(t, &ce->tlist, list) {
 		cifs_dbg(FYI, "  %s%s\n", t->name,
-			 ce->tgthint == t ? " (target hint)" : "");
+			 READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 	}
 }
 
@@ -427,7 +427,7 @@ static int cache_entry_hash(const void *data, int size, unsigned int *hash)
 /* Return target hint of a DFS cache entry */
 static inline char *get_tgt_name(const struct cache_entry *ce)
 {
-	struct cache_dfs_tgt *t = ce->tgthint;
+	struct cache_dfs_tgt *t = READ_ONCE(ce->tgthint);
 
 	return t ? t->name : ERR_PTR(-ENOENT);
 }
@@ -470,6 +470,7 @@ static struct cache_dfs_tgt *alloc_target(const char *name, int path_consumed)
 static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 			 struct cache_entry *ce, const char *tgthint)
 {
+	struct cache_dfs_tgt *target;
 	int i;
 
 	ce->ttl = max_t(int, refs[0].ttl, CACHE_MIN_TTL);
@@ -496,8 +497,9 @@ static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 		ce->numtgts++;
 	}
 
-	ce->tgthint = list_first_entry_or_null(&ce->tlist,
-					       struct cache_dfs_tgt, list);
+	target = list_first_entry_or_null(&ce->tlist, struct cache_dfs_tgt,
+					  list);
+	WRITE_ONCE(ce->tgthint, target);
 
 	return 0;
 }
@@ -712,14 +714,15 @@ void dfs_cache_destroy(void)
 static int update_cache_entry_locked(struct cache_entry *ce, const struct dfs_info3_param *refs,
 				     int numrefs)
 {
+	struct cache_dfs_tgt *target;
+	char *th = NULL;
 	int rc;
-	char *s, *th = NULL;
 
 	WARN_ON(!rwsem_is_locked(&htable_rw_lock));
 
-	if (ce->tgthint) {
-		s = ce->tgthint->name;
-		th = kstrdup(s, GFP_ATOMIC);
+	target = READ_ONCE(ce->tgthint);
+	if (target) {
+		th = kstrdup(target->name, GFP_ATOMIC);
 		if (!th)
 			return -ENOMEM;
 	}
@@ -890,7 +893,7 @@ static int get_targets(struct cache_entry *ce, struct dfs_cache_tgt_list *tl)
 		}
 		it->it_path_consumed = t->path_consumed;
 
-		if (ce->tgthint == t)
+		if (READ_ONCE(ce->tgthint) == t)
 			list_add(&it->it_list, head);
 		else
 			list_add_tail(&it->it_list, head);
@@ -1046,23 +1049,14 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 		goto out_free_path;
 	}
 
-	up_read(&htable_rw_lock);
-	down_write(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
-	if (IS_ERR(ce)) {
-		rc = PTR_ERR(ce);
-		goto out_unlock;
-	}
-
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (likely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1070,7 +1064,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 out_free_path:
 	kfree(npath);
 	return rc;
@@ -1100,21 +1094,20 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 
 	cifs_dbg(FYI, "%s: path: %s\n", __func__, path);
 
-	if (!down_write_trylock(&htable_rw_lock))
-		return;
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
 	if (IS_ERR(ce))
 		goto out_unlock;
 
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (unlikely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1122,7 +1115,7 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 }
 
 /**
-- 
2.39.0
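
The idea in patch 3/5 is that the target hint is a single pointer, so
readers and writers can share the lock as long as each access to the
pointer is a single untorn load or store. A rough userspace analog is
sketched below, assuming C11 relaxed atomics as a stand-in for the
kernel's READ_ONCE()/WRITE_ONCE() (the two are not equivalent in
general). struct target, struct entry, get_hint_name() and update_hint()
are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical stand-ins for cache_dfs_tgt and cache_entry. */
struct target {
	const char *name;
};

struct entry {
	struct target *targets;
	size_t ntargets;
	_Atomic(struct target *) hint;	/* may be NULL */
};

/* Read the hint with one atomic load, then use only the local copy. */
static const char *get_hint_name(struct entry *e)
{
	struct target *t = atomic_load_explicit(&e->hint, memory_order_relaxed);

	return t ? t->name : NULL;
}

/* Point the hint at the target called @name, if such a target exists. */
static void update_hint(struct entry *e, const char *name)
{
	struct target *cur = atomic_load_explicit(&e->hint, memory_order_relaxed);
	size_t i;

	if (cur && !strcasecmp(cur->name, name))
		return;		/* hint already points at this target */

	for (i = 0; i < e->ntargets; i++) {
		if (!strcasecmp(e->targets[i].name, name)) {
			atomic_store_explicit(&e->hint, &e->targets[i],
					      memory_order_relaxed);
			break;
		}
	}
}
```

In this sketch the target array itself is assumed not to change while
hints are being updated, which is what the shared lock guarantees in the
patch.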



* [PATCH 4/5] cifs: remove duplicate code in __refresh_tcon()
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
                   ` (2 preceding siblings ...)
  2023-01-17  0:09 ` [PATCH 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17  0:09 ` [PATCH 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

The logic for creating or updating a cache entry in __refresh_tcon()
can be done simply with cache_refresh_path(), so use it instead.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 69 +++++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 37 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 49d1f390a6b8..67890960c763 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -776,7 +776,8 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static struct cache_entry *cache_refresh_path(const unsigned int xid,
 					      struct cifs_ses *ses,
-					      const char *path)
+					      const char *path,
+					      bool force_refresh)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -788,13 +789,14 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
 		return ce;
 
 	up_read(&htable_rw_lock);
 
 	/*
-	 * Either the entry was not found, or it is expired.
+	 * Either the entry was not found, or it is expired, or it is a forced
+	 * refresh.
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
@@ -809,7 +811,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce)) {
+		if (force_refresh || cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
 			if (rc)
 				ce = ERR_PTR(rc);
@@ -946,7 +948,7 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1043,7 +1045,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1321,35 +1323,37 @@ static bool target_share_equal(struct TCP_Server_Info *server, const char *s1, c
  * Mark dfs tcon for reconnecting when the currently connected tcon does not match any of the new
  * target shares in @refs.
  */
-static void mark_for_reconnect_if_needed(struct cifs_tcon *tcon, struct dfs_cache_tgt_list *tl,
-					 const struct dfs_info3_param *refs, int numrefs)
+static void mark_for_reconnect_if_needed(struct TCP_Server_Info *server,
+					 struct dfs_cache_tgt_list *old_tl,
+					 struct dfs_cache_tgt_list *new_tl)
 {
-	struct dfs_cache_tgt_iterator *it;
-	int i;
+	struct dfs_cache_tgt_iterator *oit, *nit;
 
-	for (it = dfs_cache_get_tgt_iterator(tl); it; it = dfs_cache_get_next_tgt(tl, it)) {
-		for (i = 0; i < numrefs; i++) {
-			if (target_share_equal(tcon->ses->server, dfs_cache_get_tgt_name(it),
-					       refs[i].node_name))
+	for (oit = dfs_cache_get_tgt_iterator(old_tl); oit;
+	     oit = dfs_cache_get_next_tgt(old_tl, oit)) {
+		for (nit = dfs_cache_get_tgt_iterator(new_tl); nit;
+		     nit = dfs_cache_get_next_tgt(new_tl, nit)) {
+			if (target_share_equal(server,
+					       dfs_cache_get_tgt_name(oit),
+					       dfs_cache_get_tgt_name(nit)))
 				return;
 		}
 	}
 
 	cifs_dbg(FYI, "%s: no cached or matched targets. mark dfs share for reconnect.\n", __func__);
-	cifs_signal_cifsd_for_reconnect(tcon->ses->server, true);
+	cifs_signal_cifsd_for_reconnect(server, true);
 }
 
 /* Refresh dfs referral of tcon and mark it for reconnect if needed */
 static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_refresh)
 {
-	struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
+	struct dfs_cache_tgt_list old_tl = DFS_CACHE_TGT_LIST_INIT(old_tl);
+	struct dfs_cache_tgt_list new_tl = DFS_CACHE_TGT_LIST_INIT(new_tl);
 	struct cifs_ses *ses = CIFS_DFS_ROOT_SES(tcon->ses);
 	struct cifs_tcon *ipc = ses->tcon_ipc;
-	struct dfs_info3_param *refs = NULL;
 	bool needs_refresh = false;
 	struct cache_entry *ce;
 	unsigned int xid;
-	int numrefs = 0;
 	int rc = 0;
 
 	xid = get_xid();
@@ -1358,9 +1362,8 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	ce = lookup_cache_entry(path);
 	needs_refresh = force_refresh || IS_ERR(ce) || cache_entry_expired(ce);
 	if (!IS_ERR(ce)) {
-		rc = get_targets(ce, &tl);
-		if (rc)
-			cifs_dbg(FYI, "%s: could not get dfs targets: %d\n", __func__, rc);
+		rc = get_targets(ce, &old_tl);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
 	}
 	up_read(&htable_rw_lock);
 
@@ -1377,26 +1380,18 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	}
 	spin_unlock(&ipc->tc_lock);
 
-	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (!rc) {
-		/* Create or update a cache entry with the new referral */
-		dump_refs(refs, numrefs);
-
-		down_write(&htable_rw_lock);
-		ce = lookup_cache_entry(path);
-		if (IS_ERR(ce))
-			add_cache_entry_locked(refs, numrefs);
-		else if (force_refresh || cache_entry_expired(ce))
-			update_cache_entry_locked(ce, refs, numrefs);
-		up_write(&htable_rw_lock);
-
-		mark_for_reconnect_if_needed(tcon, &tl, refs, numrefs);
+	ce = cache_refresh_path(xid, ses, path, true);
+	if (!IS_ERR(ce)) {
+		rc = get_targets(ce, &new_tl);
+		up_read(&htable_rw_lock);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
+		mark_for_reconnect_if_needed(tcon->ses->server, &old_tl, &new_tl);
 	}
 
 out:
 	free_xid(xid);
-	dfs_cache_free_tgts(&tl);
-	free_dfs_info_array(refs, numrefs);
+	dfs_cache_free_tgts(&old_tl);
+	dfs_cache_free_tgts(&new_tl);
 	return rc;
 }
 
-- 
2.39.0
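
After patch 4/5, mark_for_reconnect_if_needed() compares the old and new
target lists and signals a reconnect only when they have no target in
common. A minimal sketch of that comparison is below. It assumes a plain
case-insensitive name match in place of target_share_equal(), and the
function name needs_reconnect() and the string-array lists are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/*
 * Return true when no target in @old_tl matches any target in @new_tl.
 * Names are compared case-insensitively (a simplifying assumption).
 */
static bool needs_reconnect(const char *const *old_tl, size_t nold,
			    const char *const *new_tl, size_t nnew)
{
	size_t i, j;

	for (i = 0; i < nold; i++) {
		for (j = 0; j < nnew; j++) {
			if (!strcasecmp(old_tl[i], new_tl[j]))
				return false;	/* still a valid target */
		}
	}
	/* No overlap, or one of the lists is empty. */
	return true;
}
```

As in the patch, an empty old or new list falls through both loops and
is treated as "no cached or matched targets".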



* [PATCH 5/5] cifs: handle cache lookup errors different than -ENOENT
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
                   ` (3 preceding siblings ...)
  2023-01-17  0:09 ` [PATCH 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
@ 2023-01-17  0:09 ` Paulo Alcantara
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17  0:09 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, Paulo Alcantara

lookup_cache_entry() might return an error other than -ENOENT
(e.g. from ->char2uni), so handle those errors as well in
cache_refresh_path().

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 67890960c763..f426d1473bea 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -644,7 +644,9 @@ static struct cache_entry *__lookup_cache_entry(const char *path, unsigned int h
  *
  * Use whole path components in the match.  Must be called with htable_rw_lock held.
  *
+ * Return cached entry if successful.
  * Return ERR_PTR(-ENOENT) if the entry is not found.
+ * Return error ptr otherwise.
  */
 static struct cache_entry *lookup_cache_entry(const char *path)
 {
@@ -789,8 +791,13 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
+	if (!IS_ERR(ce)) {
+		if (!force_refresh && !cache_entry_expired(ce))
+			return ce;
+	} else if (PTR_ERR(ce) != -ENOENT) {
+		up_read(&htable_rw_lock);
 		return ce;
+	}
 
 	up_read(&htable_rw_lock);
 
@@ -816,7 +823,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 			if (rc)
 				ce = ERR_PTR(rc);
 		}
-	} else {
+	} else if (PTR_ERR(ce) == -ENOENT) {
 		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-- 
2.39.0
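
The decision that patch 5/5 adds to cache_refresh_path() can be
summarized as a small pure function. The sketch below is an
illustration only: enum refresh_action and classify_lookup() are
hypothetical, and the lookup result is passed as 0 (entry found) or a
negative errno rather than as an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum refresh_action {
	USE_CACHED,		/* return the entry that was found */
	REQUEST_REFERRAL,	/* fetch a new referral, then add or update */
	FAIL_LOOKUP,		/* propagate the lookup error to the caller */
};

/*
 * @lookup_err: 0 if the entry was found, otherwise a negative errno.
 * @expired and @force_refresh only matter when the entry was found.
 */
static enum refresh_action classify_lookup(long lookup_err, bool expired,
					   bool force_refresh)
{
	if (!lookup_err) {
		if (!force_refresh && !expired)
			return USE_CACHED;
		return REQUEST_REFERRAL;
	}
	if (lookup_err != -ENOENT)
		return FAIL_LOOKUP;	/* e.g. a charset conversion error */

	return REQUEST_REFERRAL;	/* not cached yet */
}
```

Only a missing entry (-ENOENT) or a stale or force-refreshed entry leads
to a new referral request; any other lookup error is returned as-is.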



* Re: [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17  0:09 ` [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
@ 2023-01-17 17:07   ` Aurélien Aptel
  2023-01-17 18:03     ` Paulo Alcantara
  0 siblings, 1 reply; 15+ messages in thread
From: Aurélien Aptel @ 2023-01-17 17:07 UTC (permalink / raw)
  To: Paulo Alcantara; +Cc: smfrench, linux-cifs

On Tue, Jan 17, 2023 at 1:35 AM Paulo Alcantara <pc@cjr.nz> wrote:
> -       down_write(&htable_rw_lock);
> +       down_read(&htable_rw_lock);
>
>         ce = lookup_cache_entry(path);
> -       if (!IS_ERR(ce)) {
> -               if (!cache_entry_expired(ce)) {
> -                       dump_ce(ce);
> -                       up_write(&htable_rw_lock);
> -                       return 0;
> -               }
> -       } else {
> -               newent = true;
> +       if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
> +               up_read(&htable_rw_lock);
> +               return 0;
>         }
>
> +       up_read(&htable_rw_lock);

Please add a comment before the up_read() to say why you do it here
and where the deadlock is.

Otherwise looks good


* Re: [PATCH 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17 17:07   ` Aurélien Aptel
@ 2023-01-17 18:03     ` Paulo Alcantara
  0 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 18:03 UTC (permalink / raw)
  To: Aurélien Aptel; +Cc: smfrench, linux-cifs

Aurélien Aptel <aurelien.aptel@gmail.com> writes:

> On Tue, Jan 17, 2023 at 1:35 AM Paulo Alcantara <pc@cjr.nz> wrote:
>> -       down_write(&htable_rw_lock);
>> +       down_read(&htable_rw_lock);
>>
>>         ce = lookup_cache_entry(path);
>> -       if (!IS_ERR(ce)) {
>> -               if (!cache_entry_expired(ce)) {
>> -                       dump_ce(ce);
>> -                       up_write(&htable_rw_lock);
>> -                       return 0;
>> -               }
>> -       } else {
>> -               newent = true;
>> +       if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
>> +               up_read(&htable_rw_lock);
>> +               return 0;
>>         }
>>
>> +       up_read(&htable_rw_lock);
>
> Please add a comment before the up_read() to say why you do it here
> and where the deadlock is.

Ok, thanks.  Will send v2 with your suggestions.


* [PATCH v2 0/5] dfs fixes
  2023-01-17  0:09 [PATCH 0/5] dfs fixes Paulo Alcantara
                   ` (4 preceding siblings ...)
  2023-01-17  0:09 ` [PATCH 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
@ 2023-01-17 22:00 ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
                     ` (5 more replies)
  5 siblings, 6 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Hi Steve,

The most important fix is 1/5, which should fix the random hangs that
we've observed while running dfs tests on buildbot.

I have run 50 dfs tests twice against Windows 2022 and samba 4.16 with
these mount options:

	vers=3.1.1,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,hard}
	vers=3.0,echo_interval=10,{,sign}
	vers=3.0,echo_interval=10,{,seal}
	vers=2.1,echo_interval=10,{,hard}
	vers=1.0,echo_interval=10,{,hard}

The only tests which failed (2%) were with SMB1 UNIX extensions
against samba.  readdir(2) was getting STATUS_INVALID_LEVEL from
QUERY_PATH_INFO after failover for some reason -- I'll look into that
when time allows.  Those failures aren't related to this series,
though.

I also did some quick tests with kerberos.

---
v1 -> v2: add comments in patch 1/5 as suggested by Aurelien

Paulo Alcantara (5):
  cifs: fix potential deadlock in cache_refresh_path()
  cifs: avoid re-lookups in dfs_cache_find()
  cifs: don't take exclusive lock for updating target hints
  cifs: remove duplicate code in __refresh_tcon()
  cifs: handle cache lookup errors different than -ENOENT

 fs/cifs/dfs_cache.c | 191 +++++++++++++++++++++++---------------------
 1 file changed, 100 insertions(+), 91 deletions(-)

-- 
2.39.0



* [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path()
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Avoid getting a DFS referral while holding an exclusive lock in
cache_refresh_path(), because the tcon IPC used for getting the
referral could be disconnected, thus causing a deadlock as shown
below:

task A                       task B
======                       ======
cifs_demultiplex_thread()    dfs_cache_find()
 cifs_handle_standard()       cache_refresh_path()
  reconnect_dfs_server()       down_write()
   dfs_cache_noreq_find()       get_dfs_referral()
    down_read() <- deadlock      smb2_get_dfs_refer()
                                  SMB2_ioctl()
				   cifs_send_recv()
				    compound_send_recv()
				     wait_for_response()

where task A cannot wake up task B because it is blocked on
down_read() due to the exclusive lock held in cache_refresh_path() and
is therefore unable to make progress.

Fixes: c9f711039905 ("cifs: keep referral server sessions alive")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 42 +++++++++++++++++++++++-------------------
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index e20f8880363f..a6d7ae5f49a4 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -770,26 +770,27 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
 {
-	int rc;
-	struct cache_entry *ce;
 	struct dfs_info3_param *refs = NULL;
+	struct cache_entry *ce;
 	int numrefs = 0;
-	bool newent = false;
+	int rc;
 
 	cifs_dbg(FYI, "%s: search path: %s\n", __func__, path);
 
-	down_write(&htable_rw_lock);
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce)) {
-		if (!cache_entry_expired(ce)) {
-			dump_ce(ce);
-			up_write(&htable_rw_lock);
-			return 0;
-		}
-	} else {
-		newent = true;
+	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
+		up_read(&htable_rw_lock);
+		return 0;
 	}
+	/*
+	 * Unlock shared access as we don't want to hold any locks while getting
+	 * a new referral.  The @ses used for performing the I/O could be
+	 * reconnecting and it acquires @htable_rw_lock to look up the dfs cache
+	 * in order to failover -- if necessary.
+	 */
+	up_read(&htable_rw_lock);
 
 	/*
 	 * Either the entry was not found, or it is expired.
@@ -797,19 +798,22 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
 	if (rc)
-		goto out_unlock;
+		goto out;
 
 	dump_refs(refs, numrefs);
 
-	if (!newent) {
-		rc = update_cache_entry_locked(ce, refs, numrefs);
-		goto out_unlock;
+	down_write(&htable_rw_lock);
+	/* Re-check as another task might have it added or refreshed already */
+	ce = lookup_cache_entry(path);
+	if (!IS_ERR(ce)) {
+		if (cache_entry_expired(ce))
+			rc = update_cache_entry_locked(ce, refs, numrefs);
+	} else {
+		rc = add_cache_entry_locked(refs, numrefs);
 	}
 
-	rc = add_cache_entry_locked(refs, numrefs);
-
-out_unlock:
 	up_write(&htable_rw_lock);
+out:
 	free_dfs_info_array(refs, numrefs);
 	return rc;
 }
-- 
2.39.0



* [PATCH v2 2/5] cifs: avoid re-lookups in dfs_cache_find()
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Simply downgrade the write lock on cache updates from
cache_refresh_path() and avoid an unnecessary re-lookup in
dfs_cache_find().

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 58 ++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 24 deletions(-)
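
The locking contract described above (return with the lock held for
reading on success, with no lock held on error) can be sketched in
userspace.  This is only an illustration under stated assumptions, not
the kernel code: toy_rwsem, toy_entry and every toy_* function are
hypothetical names, the lock is a minimal mutex/condvar stand-in for a
kernel rwsem, and the two bool parameters stand in for a failed referral
request and a failed cache update.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reader/writer lock (hypothetical), just enough to show a downgrade. */
struct toy_rwsem {
	pthread_mutex_t mu;
	pthread_cond_t cv;
	int readers;	/* number of active readers */
	bool writer;	/* true while a writer holds the lock */
};

static struct toy_rwsem cache_lock = {
	.mu = PTHREAD_MUTEX_INITIALIZER,
	.cv = PTHREAD_COND_INITIALIZER,
};

static void toy_down_read(struct toy_rwsem *s)
{
	pthread_mutex_lock(&s->mu);
	while (s->writer)
		pthread_cond_wait(&s->cv, &s->mu);
	s->readers++;
	pthread_mutex_unlock(&s->mu);
}

static void toy_up_read(struct toy_rwsem *s)
{
	pthread_mutex_lock(&s->mu);
	assert(s->readers > 0);
	s->readers--;
	pthread_cond_broadcast(&s->cv);
	pthread_mutex_unlock(&s->mu);
}

static void toy_down_write(struct toy_rwsem *s)
{
	pthread_mutex_lock(&s->mu);
	while (s->writer || s->readers > 0)
		pthread_cond_wait(&s->cv, &s->mu);
	s->writer = true;
	pthread_mutex_unlock(&s->mu);
}

static void toy_up_write(struct toy_rwsem *s)
{
	pthread_mutex_lock(&s->mu);
	assert(s->writer);
	s->writer = false;
	pthread_cond_broadcast(&s->cv);
	pthread_mutex_unlock(&s->mu);
}

/* Turn the held write lock into a read lock with no unlocked window. */
static void toy_downgrade_write(struct toy_rwsem *s)
{
	pthread_mutex_lock(&s->mu);
	assert(s->writer);
	s->writer = false;
	s->readers = 1;
	pthread_cond_broadcast(&s->cv);
	pthread_mutex_unlock(&s->mu);
}

struct toy_entry { int value; bool valid; };
static struct toy_entry the_entry;

/* Return the entry with cache_lock held for reading, or NULL with no lock held. */
static struct toy_entry *toy_refresh(int new_value, bool fetch_fails, bool update_fails)
{
	toy_down_read(&cache_lock);
	if (the_entry.valid)
		return &the_entry;	/* fast path: keep the read lock */
	toy_up_read(&cache_lock);	/* hold no lock across the slow request */

	if (fetch_fails)		/* stands in for a failed referral request */
		return NULL;

	toy_down_write(&cache_lock);
	if (!the_entry.valid) {		/* re-check: another task may have won */
		if (update_fails) {	/* error path drops the write lock */
			toy_up_write(&cache_lock);
			return NULL;
		}
		the_entry.value = new_value;
		the_entry.valid = true;
	}
	toy_downgrade_write(&cache_lock);
	return &the_entry;
}
```

A caller that gets a non-NULL entry back is expected to finish reading
it and then call toy_up_read(&cache_lock) itself, which is what removes
the second lookup.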

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index a6d7ae5f49a4..755a00c4cba1 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -558,7 +558,8 @@ static void remove_oldest_entry_locked(void)
 }
 
 /* Add a new DFS cache entry */
-static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
+static struct cache_entry *add_cache_entry_locked(struct dfs_info3_param *refs,
+						  int numrefs)
 {
 	int rc;
 	struct cache_entry *ce;
@@ -573,11 +574,11 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	rc = cache_entry_hash(refs[0].path_name, strlen(refs[0].path_name), &hash);
 	if (rc)
-		return rc;
+		return ERR_PTR(rc);
 
 	ce = alloc_cache_entry(refs, numrefs);
 	if (IS_ERR(ce))
-		return PTR_ERR(ce);
+		return ce;
 
 	spin_lock(&cache_ttl_lock);
 	if (!cache_ttl) {
@@ -594,7 +595,7 @@ static int add_cache_entry_locked(struct dfs_info3_param *refs, int numrefs)
 
 	atomic_inc(&cache_count);
 
-	return 0;
+	return ce;
 }
 
 /* Check if two DFS paths are equal.  @s1 and @s2 are expected to be in @cache_cp's charset */
@@ -767,8 +768,12 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  *
  * For interlinks, cifs_mount() and expand_dfs_referral() are supposed to
  * handle them properly.
+ *
+ * On success, return entry with acquired lock for reading, otherwise error ptr.
  */
-static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, const char *path)
+static struct cache_entry *cache_refresh_path(const unsigned int xid,
+					      struct cifs_ses *ses,
+					      const char *path)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -780,10 +785,9 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce)) {
-		up_read(&htable_rw_lock);
-		return 0;
-	}
+	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+		return ce;
+
 	/*
 	 * Unlock shared access as we don't want to hold any locks while getting
 	 * a new referral.  The @ses used for performing the I/O could be
@@ -797,8 +801,10 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (rc)
+	if (rc) {
+		ce = ERR_PTR(rc);
 		goto out;
+	}
 
 	dump_refs(refs, numrefs);
 
@@ -806,16 +812,24 @@ static int cache_refresh_path(const unsigned int xid, struct cifs_ses *ses, cons
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce))
+		if (cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
+			if (rc)
+				ce = ERR_PTR(rc);
+		}
 	} else {
-		rc = add_cache_entry_locked(refs, numrefs);
+		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-	up_write(&htable_rw_lock);
+	if (IS_ERR(ce)) {
+		up_write(&htable_rw_lock);
+		goto out;
+	}
+
+	downgrade_write(&htable_rw_lock);
 out:
 	free_dfs_info_array(refs, numrefs);
-	return rc;
+	return ce;
 }
 
 /*
@@ -935,15 +949,8 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
-		goto out_free_path;
-
-	down_read(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
+	ce = cache_refresh_path(xid, ses, npath);
 	if (IS_ERR(ce)) {
-		up_read(&htable_rw_lock);
 		rc = PTR_ERR(ce);
 		goto out_free_path;
 	}
@@ -1039,10 +1046,13 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	rc = cache_refresh_path(xid, ses, npath);
-	if (rc)
+	ce = cache_refresh_path(xid, ses, npath);
+	if (IS_ERR(ce)) {
+		rc = PTR_ERR(ce);
 		goto out_free_path;
+	}
 
+	up_read(&htable_rw_lock);
 	down_write(&htable_rw_lock);
 
 	ce = lookup_cache_entry(npath);
-- 
2.39.0



* [PATCH v2 3/5] cifs: don't take exclusive lock for updating target hints
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 1/5] cifs: fix potential deadlock in cache_refresh_path() Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 2/5] cifs: avoid re-lookups in dfs_cache_find() Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

Avoid contention while updating dfs target hints.  It should be
perfectly fine to update them under shared locks.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 47 +++++++++++++++++++--------------------------
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 755a00c4cba1..19847f9114ba 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -269,7 +269,7 @@ static int dfscache_proc_show(struct seq_file *m, void *v)
 			list_for_each_entry(t, &ce->tlist, list) {
 				seq_printf(m, "  %s%s\n",
 					   t->name,
-					   ce->tgthint == t ? " (target hint)" : "");
+					   READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 			}
 		}
 	}
@@ -321,7 +321,7 @@ static inline void dump_tgts(const struct cache_entry *ce)
 	cifs_dbg(FYI, "target list:\n");
 	list_for_each_entry(t, &ce->tlist, list) {
 		cifs_dbg(FYI, "  %s%s\n", t->name,
-			 ce->tgthint == t ? " (target hint)" : "");
+			 READ_ONCE(ce->tgthint) == t ? " (target hint)" : "");
 	}
 }
 
@@ -427,7 +427,7 @@ static int cache_entry_hash(const void *data, int size, unsigned int *hash)
 /* Return target hint of a DFS cache entry */
 static inline char *get_tgt_name(const struct cache_entry *ce)
 {
-	struct cache_dfs_tgt *t = ce->tgthint;
+	struct cache_dfs_tgt *t = READ_ONCE(ce->tgthint);
 
 	return t ? t->name : ERR_PTR(-ENOENT);
 }
@@ -470,6 +470,7 @@ static struct cache_dfs_tgt *alloc_target(const char *name, int path_consumed)
 static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 			 struct cache_entry *ce, const char *tgthint)
 {
+	struct cache_dfs_tgt *target;
 	int i;
 
 	ce->ttl = max_t(int, refs[0].ttl, CACHE_MIN_TTL);
@@ -496,8 +497,9 @@ static int copy_ref_data(const struct dfs_info3_param *refs, int numrefs,
 		ce->numtgts++;
 	}
 
-	ce->tgthint = list_first_entry_or_null(&ce->tlist,
-					       struct cache_dfs_tgt, list);
+	target = list_first_entry_or_null(&ce->tlist, struct cache_dfs_tgt,
+					  list);
+	WRITE_ONCE(ce->tgthint, target);
 
 	return 0;
 }
@@ -712,14 +714,15 @@ void dfs_cache_destroy(void)
 static int update_cache_entry_locked(struct cache_entry *ce, const struct dfs_info3_param *refs,
 				     int numrefs)
 {
+	struct cache_dfs_tgt *target;
+	char *th = NULL;
 	int rc;
-	char *s, *th = NULL;
 
 	WARN_ON(!rwsem_is_locked(&htable_rw_lock));
 
-	if (ce->tgthint) {
-		s = ce->tgthint->name;
-		th = kstrdup(s, GFP_ATOMIC);
+	target = READ_ONCE(ce->tgthint);
+	if (target) {
+		th = kstrdup(target->name, GFP_ATOMIC);
 		if (!th)
 			return -ENOMEM;
 	}
@@ -896,7 +899,7 @@ static int get_targets(struct cache_entry *ce, struct dfs_cache_tgt_list *tl)
 		}
 		it->it_path_consumed = t->path_consumed;
 
-		if (ce->tgthint == t)
+		if (READ_ONCE(ce->tgthint) == t)
 			list_add(&it->it_list, head);
 		else
 			list_add_tail(&it->it_list, head);
@@ -1052,23 +1055,14 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 		goto out_free_path;
 	}
 
-	up_read(&htable_rw_lock);
-	down_write(&htable_rw_lock);
-
-	ce = lookup_cache_entry(npath);
-	if (IS_ERR(ce)) {
-		rc = PTR_ERR(ce);
-		goto out_unlock;
-	}
-
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (likely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1076,7 +1070,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 out_free_path:
 	kfree(npath);
 	return rc;
@@ -1106,21 +1100,20 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 
 	cifs_dbg(FYI, "%s: path: %s\n", __func__, path);
 
-	if (!down_write_trylock(&htable_rw_lock))
-		return;
+	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
 	if (IS_ERR(ce))
 		goto out_unlock;
 
-	t = ce->tgthint;
+	t = READ_ONCE(ce->tgthint);
 
 	if (unlikely(!strcasecmp(it->it_name, t->name)))
 		goto out_unlock;
 
 	list_for_each_entry(t, &ce->tlist, list) {
 		if (!strcasecmp(t->name, it->it_name)) {
-			ce->tgthint = t;
+			WRITE_ONCE(ce->tgthint, t);
 			cifs_dbg(FYI, "%s: new target hint: %s\n", __func__,
 				 it->it_name);
 			break;
@@ -1128,7 +1121,7 @@ void dfs_cache_noreq_update_tgthint(const char *path, const struct dfs_cache_tgt
 	}
 
 out_unlock:
-	up_write(&htable_rw_lock);
+	up_read(&htable_rw_lock);
 }
 
 /**
-- 
2.39.0



* [PATCH v2 4/5] cifs: remove duplicate code in __refresh_tcon()
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
                     ` (2 preceding siblings ...)
  2023-01-17 22:00   ` [PATCH v2 3/5] cifs: don't take exclusive lock for updating target hints Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-17 22:00   ` [PATCH v2 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
  2023-01-18  1:33   ` [PATCH v2 0/5] dfs fixes Steve French
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

The logic for creating or updating a cache entry in __refresh_tcon()
can simply be done with cache_refresh_path(), so use it instead.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 69 +++++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 37 deletions(-)

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 19847f9114ba..58d11be9d020 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -776,7 +776,8 @@ static int get_dfs_referral(const unsigned int xid, struct cifs_ses *ses, const
  */
 static struct cache_entry *cache_refresh_path(const unsigned int xid,
 					      struct cifs_ses *ses,
-					      const char *path)
+					      const char *path,
+					      bool force_refresh)
 {
 	struct dfs_info3_param *refs = NULL;
 	struct cache_entry *ce;
@@ -788,7 +789,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !cache_entry_expired(ce))
+	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
 		return ce;
 
 	/*
@@ -800,7 +801,8 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	up_read(&htable_rw_lock);
 
 	/*
-	 * Either the entry was not found, or it is expired.
+	 * Either the entry was not found, or it is expired, or it is a forced
+	 * refresh.
 	 * Request a new DFS referral in order to create or update a cache entry.
 	 */
 	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
@@ -815,7 +817,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	/* Re-check as another task might have it added or refreshed already */
 	ce = lookup_cache_entry(path);
 	if (!IS_ERR(ce)) {
-		if (cache_entry_expired(ce)) {
+		if (force_refresh || cache_entry_expired(ce)) {
 			rc = update_cache_entry_locked(ce, refs, numrefs);
 			if (rc)
 				ce = ERR_PTR(rc);
@@ -952,7 +954,7 @@ int dfs_cache_find(const unsigned int xid, struct cifs_ses *ses, const struct nl
 	if (IS_ERR(npath))
 		return PTR_ERR(npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1049,7 +1051,7 @@ int dfs_cache_update_tgthint(const unsigned int xid, struct cifs_ses *ses,
 
 	cifs_dbg(FYI, "%s: update target hint - path: %s\n", __func__, npath);
 
-	ce = cache_refresh_path(xid, ses, npath);
+	ce = cache_refresh_path(xid, ses, npath, false);
 	if (IS_ERR(ce)) {
 		rc = PTR_ERR(ce);
 		goto out_free_path;
@@ -1327,35 +1329,37 @@ static bool target_share_equal(struct TCP_Server_Info *server, const char *s1, c
  * Mark dfs tcon for reconnecting when the currently connected tcon does not match any of the new
  * target shares in @refs.
  */
-static void mark_for_reconnect_if_needed(struct cifs_tcon *tcon, struct dfs_cache_tgt_list *tl,
-					 const struct dfs_info3_param *refs, int numrefs)
+static void mark_for_reconnect_if_needed(struct TCP_Server_Info *server,
+					 struct dfs_cache_tgt_list *old_tl,
+					 struct dfs_cache_tgt_list *new_tl)
 {
-	struct dfs_cache_tgt_iterator *it;
-	int i;
+	struct dfs_cache_tgt_iterator *oit, *nit;
 
-	for (it = dfs_cache_get_tgt_iterator(tl); it; it = dfs_cache_get_next_tgt(tl, it)) {
-		for (i = 0; i < numrefs; i++) {
-			if (target_share_equal(tcon->ses->server, dfs_cache_get_tgt_name(it),
-					       refs[i].node_name))
+	for (oit = dfs_cache_get_tgt_iterator(old_tl); oit;
+	     oit = dfs_cache_get_next_tgt(old_tl, oit)) {
+		for (nit = dfs_cache_get_tgt_iterator(new_tl); nit;
+		     nit = dfs_cache_get_next_tgt(new_tl, nit)) {
+			if (target_share_equal(server,
+					       dfs_cache_get_tgt_name(oit),
+					       dfs_cache_get_tgt_name(nit)))
 				return;
 		}
 	}
 
 	cifs_dbg(FYI, "%s: no cached or matched targets. mark dfs share for reconnect.\n", __func__);
-	cifs_signal_cifsd_for_reconnect(tcon->ses->server, true);
+	cifs_signal_cifsd_for_reconnect(server, true);
 }
 
 /* Refresh dfs referral of tcon and mark it for reconnect if needed */
 static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_refresh)
 {
-	struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
+	struct dfs_cache_tgt_list old_tl = DFS_CACHE_TGT_LIST_INIT(old_tl);
+	struct dfs_cache_tgt_list new_tl = DFS_CACHE_TGT_LIST_INIT(new_tl);
 	struct cifs_ses *ses = CIFS_DFS_ROOT_SES(tcon->ses);
 	struct cifs_tcon *ipc = ses->tcon_ipc;
-	struct dfs_info3_param *refs = NULL;
 	bool needs_refresh = false;
 	struct cache_entry *ce;
 	unsigned int xid;
-	int numrefs = 0;
 	int rc = 0;
 
 	xid = get_xid();
@@ -1364,9 +1368,8 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	ce = lookup_cache_entry(path);
 	needs_refresh = force_refresh || IS_ERR(ce) || cache_entry_expired(ce);
 	if (!IS_ERR(ce)) {
-		rc = get_targets(ce, &tl);
-		if (rc)
-			cifs_dbg(FYI, "%s: could not get dfs targets: %d\n", __func__, rc);
+		rc = get_targets(ce, &old_tl);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
 	}
 	up_read(&htable_rw_lock);
 
@@ -1383,26 +1386,18 @@ static int __refresh_tcon(const char *path, struct cifs_tcon *tcon, bool force_r
 	}
 	spin_unlock(&ipc->tc_lock);
 
-	rc = get_dfs_referral(xid, ses, path, &refs, &numrefs);
-	if (!rc) {
-		/* Create or update a cache entry with the new referral */
-		dump_refs(refs, numrefs);
-
-		down_write(&htable_rw_lock);
-		ce = lookup_cache_entry(path);
-		if (IS_ERR(ce))
-			add_cache_entry_locked(refs, numrefs);
-		else if (force_refresh || cache_entry_expired(ce))
-			update_cache_entry_locked(ce, refs, numrefs);
-		up_write(&htable_rw_lock);
-
-		mark_for_reconnect_if_needed(tcon, &tl, refs, numrefs);
+	ce = cache_refresh_path(xid, ses, path, true);
+	if (!IS_ERR(ce)) {
+		rc = get_targets(ce, &new_tl);
+		up_read(&htable_rw_lock);
+		cifs_dbg(FYI, "%s: get_targets: %d\n", __func__, rc);
+		mark_for_reconnect_if_needed(tcon->ses->server, &old_tl, &new_tl);
 	}
 
 out:
 	free_xid(xid);
-	dfs_cache_free_tgts(&tl);
-	free_dfs_info_array(refs, numrefs);
+	dfs_cache_free_tgts(&old_tl);
+	dfs_cache_free_tgts(&new_tl);
 	return rc;
 }
 
-- 
2.39.0



* [PATCH v2 5/5] cifs: handle cache lookup errors different than -ENOENT
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
                     ` (3 preceding siblings ...)
  2023-01-17 22:00   ` [PATCH v2 4/5] cifs: remove duplicate code in __refresh_tcon() Paulo Alcantara
@ 2023-01-17 22:00   ` Paulo Alcantara
  2023-01-18  1:33   ` [PATCH v2 0/5] dfs fixes Steve French
  5 siblings, 0 replies; 15+ messages in thread
From: Paulo Alcantara @ 2023-01-17 22:00 UTC (permalink / raw)
  To: smfrench; +Cc: linux-cifs, aurelien.aptel, Paulo Alcantara

lookup_cache_entry() might return an error other than -ENOENT
(e.g. from ->char2uni), so handle those as well in
cache_refresh_path().

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
---
 fs/cifs/dfs_cache.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
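
The three-way decision described above (usable entry, missing entry,
any other lookup error) can be sketched outside the kernel as follows.
The toy_err_ptr()/toy_ptr_err()/toy_is_err() helpers are hypothetical
userspace stand-ins modelled on the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(),
and toy_classify() is an illustrative name; the force_refresh flag is
left out to keep the sketch small.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Decode the errno from an error pointer. */
static inline long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Pointers in the top TOY_MAX_ERRNO values of the address space are errors. */
static inline bool toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

enum toy_action { TOY_USE_ENTRY, TOY_FETCH_REFERRAL, TOY_FAIL };

/* Decide what to do with the result of a cache lookup. */
static enum toy_action toy_classify(const void *ce, bool expired)
{
	if (!toy_is_err(ce))
		return expired ? TOY_FETCH_REFERRAL : TOY_USE_ENTRY;
	if (toy_ptr_err(ce) == -ENOENT)
		return TOY_FETCH_REFERRAL;	/* missing entry: go create it */
	return TOY_FAIL;			/* any other error: propagate */
}
```

With this split, an error such as -EINVAL from the lookup is returned
to the caller instead of being treated as a missing entry.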

diff --git a/fs/cifs/dfs_cache.c b/fs/cifs/dfs_cache.c
index 58d11be9d020..308101d90006 100644
--- a/fs/cifs/dfs_cache.c
+++ b/fs/cifs/dfs_cache.c
@@ -644,7 +644,9 @@ static struct cache_entry *__lookup_cache_entry(const char *path, unsigned int h
  *
  * Use whole path components in the match.  Must be called with htable_rw_lock held.
  *
+ * Return cached entry if successful.
  * Return ERR_PTR(-ENOENT) if the entry is not found.
+ * Return error ptr otherwise.
  */
 static struct cache_entry *lookup_cache_entry(const char *path)
 {
@@ -789,8 +791,13 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 	down_read(&htable_rw_lock);
 
 	ce = lookup_cache_entry(path);
-	if (!IS_ERR(ce) && !force_refresh && !cache_entry_expired(ce))
+	if (!IS_ERR(ce)) {
+		if (!force_refresh && !cache_entry_expired(ce))
+			return ce;
+	} else if (PTR_ERR(ce) != -ENOENT) {
+		up_read(&htable_rw_lock);
 		return ce;
+	}
 
 	/*
 	 * Unlock shared access as we don't want to hold any locks while getting
@@ -822,7 +829,7 @@ static struct cache_entry *cache_refresh_path(const unsigned int xid,
 			if (rc)
 				ce = ERR_PTR(rc);
 		}
-	} else {
+	} else if (PTR_ERR(ce) == -ENOENT) {
 		ce = add_cache_entry_locked(refs, numrefs);
 	}
 
-- 
2.39.0



* Re: [PATCH v2 0/5] dfs fixes
  2023-01-17 22:00 ` [PATCH v2 0/5] dfs fixes Paulo Alcantara
                     ` (4 preceding siblings ...)
  2023-01-17 22:00   ` [PATCH v2 5/5] cifs: handle cache lookup errors different than -ENOENT Paulo Alcantara
@ 2023-01-18  1:33   ` Steve French
  5 siblings, 0 replies; 15+ messages in thread
From: Steve French @ 2023-01-18  1:33 UTC (permalink / raw)
  To: Paulo Alcantara; +Cc: linux-cifs, aurelien.aptel

tentatively merged into cifs-2.6.git for-next pending more review/testing

On Tue, Jan 17, 2023 at 4:00 PM Paulo Alcantara <pc@cjr.nz> wrote:
>
> Hi Steve,
>
> The most important fix is 1/5 that should fix those random hangs that
> we've observed while running dfs tests on buildbot.
>
> I have run twice 50 dfs tests against Windows 2022 and samba 4.16 with
> these mount options
>
>         vers=3.1.1,echo_interval=10,{,hard}
>         vers=3.0,echo_interval=10,{,hard}
>         vers=3.0,echo_interval=10,{,sign}
>         vers=3.0,echo_interval=10,{,seal}
>         vers=2.1,echo_interval=10,{,hard}
>         vers=1.0,echo_interval=10,{,hard}
>
> The only tests which failed (2%) were with SMB1 UNIX extensions
> against samba.  readdir(2) was getting STATUS_INVALID_LEVEL from
> QUERY_PATH_INFO after failover for some reason -- I'll look into that
> when time allows.  Those failures aren't related to this series,
> though.
>
> I also did some quick tests with kerberos.
>
> ---
> v1 -> v2: add comments in patch 1/5 as suggested by Aurelien
>
> Paulo Alcantara (5):
>   cifs: fix potential deadlock in cache_refresh_path()
>   cifs: avoid re-lookups in dfs_cache_find()
>   cifs: don't take exclusive lock for updating target hints
>   cifs: remove duplicate code in __refresh_tcon()
>   cifs: handle cache lookup errors different than -ENOENT
>
>  fs/cifs/dfs_cache.c | 191 +++++++++++++++++++++++---------------------
>  1 file changed, 100 insertions(+), 91 deletions(-)
>
> --
> 2.39.0
>


-- 
Thanks,

Steve


