From: Alexander Aring <aahringo@redhat.com>
To: teigland@redhat.com
Cc: cluster-devel@redhat.com, gfs2@lists.linux.dev,
	christophe.jaillet@wanadoo.fr, stable@vger.kernel.org
Subject: [Cluster-devel] [PATCH RESEND 5/8] dlm: fix remove member after close call
Date: Tue, 10 Oct 2023 18:04:45 -0400
Message-ID: <20231010220448.2978176-5-aahringo@redhat.com>
In-Reply-To: <20231010220448.2978176-1-aahringo@redhat.com>

The idea of commit 63e711b08160 ("fs: dlm: create midcomms nodes when
configure") is to set the midcomms node lifetime when a node joins or
leaves the cluster. Currently we can hit the following warning:

[10844.611495] ------------[ cut here ]------------
[10844.615913] WARNING: CPU: 4 PID: 84304 at fs/dlm/midcomms.c:1263
dlm_midcomms_remove_member+0x13f/0x180 [dlm]

or end up in a state where the midcomms node usage count becomes
negative:

[  260.830782] node 2 users dec count -1

The first warning happens when a specific node does not exist, probably
because it was removed by dlm_midcomms_close(), which is called when a
node leaves the cluster. The second kernel log message probably occurs
when dlm_midcomms_addr() is called as a node joins the cluster, but due
to fencing the node left the cluster without being removed from the
lockspace. If the node joins the cluster after it was removed from the
cluster due to fencing, the first call, triggered by user space, is to
remove the node from the lockspaces. In both cases, if the node wasn't
found or the user count is zero, we should skip any additional midcomms
handling in dlm_midcomms_remove_member().

Cc: stable@vger.kernel.org
Fixes: 63e711b08160 ("fs: dlm: create midcomms nodes when configure")
Signed-off-by: Alexander Aring <aahringo@redhat.com>
---
 fs/dlm/midcomms.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
index 455265c6ba53..4ad71e97cec2 100644
--- a/fs/dlm/midcomms.c
+++ b/fs/dlm/midcomms.c
@@ -1268,12 +1268,23 @@ void dlm_midcomms_remove_member(int nodeid)
 
 	idx = srcu_read_lock(&nodes_srcu);
 	node = nodeid2node(nodeid);
-	if (WARN_ON_ONCE(!node)) {
+	/* in case of dlm_midcomms_close() removes node */
+	if (!node) {
 		srcu_read_unlock(&nodes_srcu, idx);
 		return;
 	}
 
 	spin_lock(&node->state_lock);
+	/* case of dlm_midcomms_addr() created node but
+	 * was not added before because dlm_midcomms_close()
+	 * removed the node
+	 */
+	if (!node->users) {
+		spin_unlock(&node->state_lock);
+		srcu_read_unlock(&nodes_srcu, idx);
+		return;
+	}
+
 	node->users--;
 	pr_debug("node %d users dec count %d\n", nodeid, node->users);
 
-- 
2.39.3
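
For illustration only, the two early returns added above can be modeled
in plain user-space C. This is a minimal sketch, not the kernel code:
struct model_node, model_add_member() and model_remove_member() are
hypothetical names, and the SRCU read section and node->state_lock that
the real function holds are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a midcomms node; only the user count is modeled. */
struct model_node {
	int nodeid;
	int users;
};

/* Hypothetical helper: take one reference when a member is added. */
static void model_add_member(struct model_node *node)
{
	assert(node != NULL);
	node->users++;
}

/*
 * Model of the guarded remove path (locking omitted, see above).
 * Returns true if a reference was dropped, false if the call was ignored.
 */
static bool model_remove_member(struct model_node *node)
{
	/* node not found, e.g. already removed by a close call */
	if (!node)
		return false;

	/* node exists but has no users: do not let the count go negative */
	if (!node->users)
		return false;

	node->users--;
	return true;
}
```

With this model, removing a missing node or a node whose count is
already zero is a no-op, so the count cannot reach -1 as in the log
line quoted in the commit message.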

