linux-lvm.redhat.com archive mirror
 help / color / mirror / Atom feed
* [linux-lvm] can we change cluster_ringid_seq in libdlm from uint32_t to uint64_t?
@ 2020-04-26  7:24 Gang He
  2020-04-27 15:03 ` David Teigland
  0 siblings, 1 reply; 3+ messages in thread
From: Gang He @ 2020-04-26  7:24 UTC (permalink / raw)
  To: LVM general discussion and development

Hello List,

In libdlm code, cluster_ringid_seq variable is defined with uint32_t in dlm_controld/dlm_daemon.h,
the corosync API returns uinit64_t ring_id, in the current code, we use type cast to get the low-32bit ring-id.
But, in some case, the corosync returns a huge ring-id (greater than 32bit), the DLM daemon does not work normally (looks stuck).
Then, I want to know if we can change cluster_ringid_seq in libdlm from uint32_t to uint64_t?

If you'd like to know the details about why corosync ran into such a huge ringid, 
you could check the info from: https://github.com/corosync/corosync/pull/532#issuecomment-617647233

Thanks
Gang

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [linux-lvm] can we change cluster_ringid_seq in libdlm from uint32_t to uint64_t?
  2020-04-26  7:24 [linux-lvm] can we change cluster_ringid_seq in libdlm from uint32_t to uint64_t? Gang He
@ 2020-04-27 15:03 ` David Teigland
  2020-05-15  7:42   ` Gang He
  0 siblings, 1 reply; 3+ messages in thread
From: David Teigland @ 2020-04-27 15:03 UTC (permalink / raw)
  To: Gang He; +Cc: LVM general discussion and development

[-- Attachment #1: Type: text/plain, Size: 570 bytes --]

On Sun, Apr 26, 2020 at 07:24:33AM +0000, Gang He wrote:
> Hello List,
> 
> In libdlm code, cluster_ringid_seq variable is defined with uint32_t in dlm_controld/dlm_daemon.h,
> the corosync API returns uinit64_t ring_id, in the current code, we use type cast to get the low-32bit ring-id.
> But, in some case, the corosync returns a huge ring-id (greater than 32bit), the DLM daemon does not work normally (looks stuck).
> Then, I want to know if we can change cluster_ringid_seq in libdlm from uint32_t to uint64_t?

That looks ok, please try the attached patch.
Dave


[-- Attachment #2: cluster-ringid-seq.patch --]
[-- Type: text/plain, Size: 3488 bytes --]

diff --git a/dlm_controld/cpg.c b/dlm_controld/cpg.c
index e4b27fac3cd5..5b5c52fc09bc 100644
--- a/dlm_controld/cpg.c
+++ b/dlm_controld/cpg.c
@@ -450,21 +450,24 @@ static int check_ringid_done(struct lockspace *ls)
 	   but that's probably not guaranteed.) */
 
 	if (ls->cpg_ringid_wait) {
-		log_group(ls, "check_ringid wait cluster %u cpg %u:%llu",
-			  cluster_ringid_seq, ls->cpg_ringid.nodeid,
+		log_group(ls, "check_ringid wait cluster %u cpg %llu:%llu",
+			  (unsigned long long)cluster_ringid_seq,
+			  ls->cpg_ringid.nodeid,
 			  (unsigned long long)ls->cpg_ringid.seq);
 		return 0;
 	}
 
-	if (cluster_ringid_seq != (uint32_t)ls->cpg_ringid.seq) {
-		log_group(ls, "check_ringid cluster %u cpg %u:%llu",
-			  cluster_ringid_seq, ls->cpg_ringid.nodeid,
+	if (cluster_ringid_seq != ls->cpg_ringid.seq) {
+		log_group(ls, "check_ringid cluster %llu cpg %u:%llu",
+			  (unsigned long long)cluster_ringid_seq,
+			  ls->cpg_ringid.nodeid,
 			  (unsigned long long)ls->cpg_ringid.seq);
 		return 0;
 	}
 
-	log_limit(ls, "check_ringid done cluster %u cpg %u:%llu",
-		  cluster_ringid_seq, ls->cpg_ringid.nodeid,
+	log_limit(ls, "check_ringid done cluster %llu cpg %u:%llu",
+		  (unsigned long long)cluster_ringid_seq,
+		  ls->cpg_ringid.nodeid,
 		  (unsigned long long)ls->cpg_ringid.seq);
 
 	return 1;
diff --git a/dlm_controld/dlm_daemon.h b/dlm_controld/dlm_daemon.h
index 3221e19cb7be..5b9a52dae750 100644
--- a/dlm_controld/dlm_daemon.h
+++ b/dlm_controld/dlm_daemon.h
@@ -179,7 +179,7 @@ EXTERN uint64_t fence_delay_begin;
 EXTERN uint64_t cluster_quorate_monotime;
 EXTERN uint64_t cluster_joined_monotime;
 EXTERN uint64_t cluster_joined_walltime;
-EXTERN uint32_t cluster_ringid_seq;
+EXTERN uint64_t cluster_ringid_seq;
 EXTERN char cluster_name[DLM_LOCKSPACE_LEN+1];
 EXTERN int our_nodeid;
 EXTERN uint32_t control_minor;
diff --git a/dlm_controld/member.c b/dlm_controld/member.c
index 10351ec41d6d..da3a1f5b0e90 100644
--- a/dlm_controld/member.c
+++ b/dlm_controld/member.c
@@ -122,10 +122,10 @@ static void quorum_callback(quorum_handle_t h, uint32_t quorate,
 		cluster_quorate_monotime = now;
 
 	cluster_quorate = quorate;
-	cluster_ringid_seq = (uint32_t)ring_seq;
+	cluster_ringid_seq = ring_seq;
 
-	log_debug("cluster quorum %u seq %u nodes %u",
-		  cluster_quorate, cluster_ringid_seq, node_list_entries);
+	log_debug("cluster quorum %u seq %llu nodes %u",
+		  cluster_quorate, (unsigned long long)cluster_ringid_seq, node_list_entries);
 
 	old_node_count = quorum_node_count;
 	memcpy(&old_nodes, &quorum_nodes, sizeof(old_nodes));
@@ -139,8 +139,8 @@ static void quorum_callback(quorum_handle_t h, uint32_t quorate,
 
 	for (i = 0; i < old_node_count; i++) {
 		if (!is_cluster_member(old_nodes[i])) {
-			log_debug("cluster node %u removed seq %u",
-				  old_nodes[i], cluster_ringid_seq);
+			log_debug("cluster node %u removed seq %llu",
+				  old_nodes[i], (unsigned long long)cluster_ringid_seq);
 			rem_cluster_node(old_nodes[i], now);
 			del_configfs_node(old_nodes[i]);
 		}
@@ -148,8 +148,8 @@ static void quorum_callback(quorum_handle_t h, uint32_t quorate,
 
 	for (i = 0; i < quorum_node_count; i++) {
 		if (!is_old_member(quorum_nodes[i])) {
-			log_debug("cluster node %u added seq %u",
-				  quorum_nodes[i], cluster_ringid_seq);
+			log_debug("cluster node %u added seq %llu",
+				  quorum_nodes[i], (unsigned long long)cluster_ringid_seq);
 			add_cluster_node(quorum_nodes[i], now);
 
 			fence_delay_begin = now;

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [linux-lvm] can we change cluster_ringid_seq in libdlm from uint32_t to uint64_t?
  2020-04-27 15:03 ` David Teigland
@ 2020-05-15  7:42   ` Gang He
  0 siblings, 0 replies; 3+ messages in thread
From: Gang He @ 2020-05-15  7:42 UTC (permalink / raw)
  To: David Teigland; +Cc: LVM general discussion and development

Hi David,

I did some testing with this patch.
It looks OK. Do you plan to commit the patch into the git tree?


Thanks
Gang

> -----Original Message-----
> From: David Teigland [mailto:teigland@redhat.com]
> Sent: 2020��4��27�� 23:04
> To: Gang He <GHe@suse.com>
> Cc: LVM general discussion and development <linux-lvm@redhat.com>
> Subject: Re: [linux-lvm] can we change cluster_ringid_seq in libdlm from
> uint32_t to uint64_t?
> 
> On Sun, Apr 26, 2020 at 07:24:33AM +0000, Gang He wrote:
> > Hello List,
> >
> > In libdlm code, cluster_ringid_seq variable is defined with uint32_t
> > in dlm_controld/dlm_daemon.h, the corosync API returns uinit64_t ring_id,
> in the current code, we use type cast to get the low-32bit ring-id.
> > But, in some case, the corosync returns a huge ring-id (greater than 32bit),
> the DLM daemon does not work normally (looks stuck).
> > Then, I want to know if we can change cluster_ringid_seq in libdlm from
> uint32_t to uint64_t?
> 
> That looks ok, please try the attached patch.
> Dave

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-05-15  7:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-26  7:24 [linux-lvm] can we change cluster_ringid_seq in libdlm from uint32_t to uint64_t? Gang He
2020-04-27 15:03 ` David Teigland
2020-05-15  7:42   ` Gang He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).