* [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation
@ 2013-03-19 13:32 Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion Vladislav Bogdanov
                   ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

When using volumes in clustered volume groups, there is a need to convert
cluster locks between shared and exclusive.
A shared lock is needed to perform live VM migration; an exclusive lock is
needed to take an LV snapshot (and is generally safer).

Also, to use clvm as a locking mechanism for libvirt/qemu, it must be possible
to perform LV lock conversion on a remote node: the last phase of live VM
migration, when the 'source' qemu process is killed and the LV is no longer
open, runs on the migration source node, and there is no event available on
the destination node to trigger a local shared->exclusive conversion there.
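
A rough sketch of the sequence this is meant to support for a live VM
migration, using the --force and --node options added by this series
(vg00/lvol1 and node-b are placeholder names; the exact ordering is driven
by the hypervisor):

  # On the source node, before migration: convert the exclusive lock to
  # shared so the destination node can activate the LV as well.
  lvchange -aly --force vg00/lvol1

  # Take a shared lock on the destination node too (remotely, via --node).
  lvchange -aly --node node-b vg00/lvol1

  # ... live migration runs; the source qemu exits and closes the LV ...

  # Still on the source node (where the last migration phase happens):
  # drop the local lock and convert the destination's shared lock to
  # exclusive remotely.
  lvchange -aln vg00/lvol1
  lvchange -aey --force --node node-b vg00/lvol1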

Vladislav Bogdanov (10):
  lvchange: Allow cluster lock conversion
  clvmd: Fix buffer size
  clvmd: Allow node names to be obtained from corosync's CMAP
  clvmd: fix positive return value is not an error in csid->name
    translation
  clvmd: use correct flags for local command execution
  clvmd: additional debugging - print message bodies
  locking: Allow lock management (activation, deactivation, conversion)
    on a remote nodes
  lvchange: implement remote lock management
  man: document --force option to lvchange, provide examples
  man: document --node option to lvchange

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 15:23   ` David Teigland
  2013-03-19 13:32 ` [linux-lvm] [PATCH 02/10] clvmd: Fix buffer size Vladislav Bogdanov
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Allow clvm locks to be converted shared <-> exclusive with corosync/dlm.

Without this it is impossible to allow both
* VM migration (shared lock is required)
* host-side snapshots of VM disks (exclusive lock is required)

Locks are only converted if the --force parameter is passed to lvchange.

Internally the LKF_CONVERT flag is passed to the dlm, so this is a real lock
conversion.

Also deny release of an exclusive lock on a remote node without the --force
flag to lvchange -an.
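
As a usage sketch (placeholder names; the same examples are added to the
lvchange man page later in this series):

  # lvol1 is currently activated shared on this node; with --force the
  # existing DLM lock is converted to exclusive in place
  lvchange -aey --force vg00/lvol1

  # and back to shared, e.g. before a live VM migration
  lvchange -aly --force vg00/lvol1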

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 lib/locking/cluster_locking.c |    6 +++++-
 lib/locking/locking.c         |   32 ++++++++++++++++++++++++++++++++
 lib/locking/locking.h         |   10 +++++++++-
 tools/lvchange.c              |   30 ++++++++++++++++++++++--------
 4 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/lib/locking/cluster_locking.c b/lib/locking/cluster_locking.c
index f9d6328..c3bd202 100644
--- a/lib/locking/cluster_locking.c
+++ b/lib/locking/cluster_locking.c
@@ -330,6 +330,9 @@ static int _lock_for_cluster(struct cmd_context *cmd, unsigned char clvmd_cmd,
 	if (flags & LCK_REVERT)
 		args[1] |= LCK_REVERT_MODE;
 
+	if (flags & LCK_TRY_CONVERT)
+		args[1] |= LCK_CONVERT;
+
 	if (mirror_in_sync())
 		args[1] |= LCK_MIRROR_NOSYNC_MODE;
 
@@ -491,7 +494,7 @@ int lock_resource(struct cmd_context *cmd, const char *resource, uint32_t flags)
 		return 0;
 	}
 
-	log_very_verbose("Locking %s %s %s (%s%s%s%s%s%s%s%s%s) (0x%x)", lock_scope, lockname,
+	log_very_verbose("Locking %s %s %s (%s%s%s%s%s%s%s%s%s%s) (0x%x)", lock_scope, lockname,
 			 lock_type, lock_scope,
 			 flags & LCK_NONBLOCK ? "|NONBLOCK" : "",
 			 flags & LCK_HOLD ? "|HOLD" : "",
@@ -501,6 +504,7 @@ int lock_resource(struct cmd_context *cmd, const char *resource, uint32_t flags)
 			 flags & LCK_CACHE ? "|CACHE" : "",
 			 flags & LCK_ORIGIN_ONLY ? "|ORIGIN_ONLY" : "",
 			 flags & LCK_REVERT ? "|REVERT" : "",
+			 flags & LCK_TRY_CONVERT ? "|CONVERT" : "",
 			 flags);
 
 	/* Send a message to the cluster manager */
diff --git a/lib/locking/locking.c b/lib/locking/locking.c
index 7aa519b..ff46046 100644
--- a/lib/locking/locking.c
+++ b/lib/locking/locking.c
@@ -540,6 +540,16 @@ int suspend_lvs(struct cmd_context *cmd, struct dm_list *lvs,
 	return 1;
 }
 
+int deactivate_lv(struct cmd_context *cmd, struct logical_volume *lv)
+{
+	if (vg_is_clustered(lv->vg)) {
+		if (lv_is_active_exclusive(lv) && ! lv_is_active_exclusive_locally(lv)) {
+			return_0;
+		}
+	}
+	lock_lv_vol(cmd, lv, LCK_LV_DEACTIVATE);
+}
+
 /*
  * First try to activate exclusively locally.
  * Then if the VG is clustered and the LV is not yet active (e.g. due to 
@@ -567,6 +577,28 @@ int activate_lv_excl(struct cmd_context *cmd, struct logical_volume *lv)
 	return lv_is_active_exclusive(lv);
 }
 
+int activate_lv_excl_force(struct cmd_context *cmd, struct logical_volume *lv) 
+{
+	/* Non-clustered VGs are only activated locally. */
+	if (!vg_is_clustered(lv->vg))
+		return activate_lv_excl_local(cmd, lv);
+
+	if (lv_is_active_exclusive_locally(lv))
+		return 1;
+
+	if (!activate_lv_excl_local_force(cmd, lv))
+		return_0;
+
+	if (lv_is_active_exclusive(lv))
+		return 1;
+
+	/* FIXME Deal with error return codes. */
+	if (activate_lv_excl_remote_force(cmd, lv))
+		stack;
+
+	return lv_is_active_exclusive(lv);
+}
+
 /* Lock a list of LVs */
 int activate_lvs(struct cmd_context *cmd, struct dm_list *lvs, unsigned exclusive)
 {
diff --git a/lib/locking/locking.h b/lib/locking/locking.h
index 23c312d..b441a6c 100644
--- a/lib/locking/locking.h
+++ b/lib/locking/locking.h
@@ -101,6 +101,7 @@ int check_lvm1_vg_inactive(struct cmd_context *cmd, const char *vgname);
 #define LCK_CACHE	0x00000100U	/* Operation on cache only using P_ lock */
 #define LCK_ORIGIN_ONLY	0x00000200U	/* Operation should bypass any snapshots */
 #define LCK_REVERT	0x00000400U	/* Revert any incomplete change */
+#define LCK_TRY_CONVERT			0x00004000U	/* Convert existing lock */
 
 /*
  * Additional lock bits for cluster communication via args[1]
@@ -176,19 +177,26 @@ int check_lvm1_vg_inactive(struct cmd_context *cmd, const char *vgname);
 #define revert_lv(cmd, lv)	lock_lv_vol(cmd, lv, LCK_LV_RESUME | LCK_REVERT)
 #define suspend_lv(cmd, lv)	lock_lv_vol(cmd, lv, LCK_LV_SUSPEND | LCK_HOLD)
 #define suspend_lv_origin(cmd, lv)	lock_lv_vol(cmd, lv, LCK_LV_SUSPEND | LCK_HOLD | LCK_ORIGIN_ONLY)
-#define deactivate_lv(cmd, lv)	lock_lv_vol(cmd, lv, LCK_LV_DEACTIVATE)
 
 #define activate_lv(cmd, lv)	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD)
 #define activate_lv_excl_local(cmd, lv)	\
 				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_LOCAL)
 #define activate_lv_excl_remote(cmd, lv)	\
 				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_REMOTE)
+#define activate_lv_excl_local_force(cmd, lv)	\
+				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_LOCAL | (lv_is_active(lv) && ! lv_is_active_exclusive(lv) ? LCK_TRY_CONVERT : 0))
+#define activate_lv_excl_remote_force(cmd, lv)	\
+				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_REMOTE | (lv_is_active(lv) && ! lv_is_active_exclusive(lv) ? LCK_TRY_CONVERT : 0))
 
 struct logical_volume;
 int activate_lv_excl(struct cmd_context *cmd, struct logical_volume *lv);
+int activate_lv_excl_force(struct cmd_context *cmd, struct logical_volume *lv);
+int deactivate_lv(struct cmd_context *cmd, struct logical_volume *lv);
 
 #define activate_lv_local(cmd, lv)	\
 	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD | LCK_LOCAL)
+#define activate_lv_local_force(cmd, lv)	\
+	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD | LCK_LOCAL | (lv_is_active_exclusive_locally(lv) ? LCK_TRY_CONVERT : 0))
 #define deactivate_lv_local(cmd, lv)	\
 	lock_lv_vol(cmd, lv, LCK_LV_DEACTIVATE | LCK_LOCAL)
 #define drop_cached_metadata(vg)	\
diff --git a/tools/lvchange.c b/tools/lvchange.c
index 04facdd..5740bea 100644
--- a/tools/lvchange.c
+++ b/tools/lvchange.c
@@ -238,15 +238,29 @@ static int _lvchange_activate(struct cmd_context *cmd, struct logical_volume *lv
 		if ((activate == CHANGE_AE) ||
 		    lv_is_origin(lv) ||
 		    lv_is_thin_type(lv)) {
-			log_verbose("Activating logical volume \"%s\" "
-				    "exclusively", lv->name);
-			if (!activate_lv_excl(cmd, lv))
-				return_0;
+			if (arg_count(cmd, force_ARG)) {
+				log_verbose("Activating logical volume \"%s\" "
+					    "exclusively (forced)", lv->name);
+				if (!activate_lv_excl_force(cmd, lv))
+					return_0;
+			} else {
+				log_verbose("Activating logical volume \"%s\" "
+					    "exclusively", lv->name);
+				if (!activate_lv_excl(cmd, lv))
+					return_0;
+			}
 		} else if (activate == CHANGE_ALY) {
-			log_verbose("Activating logical volume \"%s\" locally",
-				    lv->name);
-			if (!activate_lv_local(cmd, lv))
-				return_0;
+			if (arg_count(cmd, force_ARG)) {
+				log_verbose("Activating logical volume \"%s\" locally (forced)",
+					    lv->name);
+				if (!activate_lv_local_force(cmd, lv))
+					return_0;
+			} else {
+				log_verbose("Activating logical volume \"%s\" locally",
+					    lv->name);
+				if (!activate_lv_local(cmd, lv))
+					return_0;
+			}
 		} else {
 			log_verbose("Activating logical volume \"%s\"",
 				    lv->name);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 02/10] clvmd: Fix buffer size
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 03/10] clvmd: Allow node names to be obtained from corosync's CMAP Vladislav Bogdanov
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 daemons/clvmd/clvmd.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
index ac465d9..07a318e 100644
--- a/daemons/clvmd/clvmd.c
+++ b/daemons/clvmd/clvmd.c
@@ -1488,7 +1488,7 @@ static void process_remote_command(struct clvm_header *msg, int msglen, int fd,
 	   clvmd-command.c */
 	if (msg->cmd == CLVMD_CMD_VERSION) {
 		int version_nums[3];
-		char node[256];
+		char node[max_cluster_member_name_len];
 
 		memcpy(version_nums, msg->args, sizeof(version_nums));
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 03/10] clvmd: Allow node names to be obtained from corosync's CMAP
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 02/10] clvmd: Fix buffer size Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 04/10] clvmd: fix positive return value is not an error in csid->name translation Vladislav Bogdanov
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Use a pacemaker-like algorithm to obtain node names from CMAP (corosync 2).
Fix an incorrect parameter passed to dm_hash_lookup_binary() in
_csid_from_name() in clvmd-corosync.c.
Do not close the CMAP connection after the local node ID is obtained; reuse
it to get node names.
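
For reference, the CMAP entries this code walks can be inspected with the
standard corosync-cmapctl tool on a corosync 2.x node (node-a/node-b are
placeholder names and the output format is approximate):

  # corosync-cmapctl | grep ^nodelist
  nodelist.node.0.nodeid (u32) = 1
  nodelist.node.0.name (str) = node-a
  nodelist.node.1.nodeid (u32) = 2
  nodelist.node.1.name (str) = node-b

_csid_from_name() matches the requested name against nodelist.node.X.name
(falling back to the ring0 address key) and then maps it to a csid via
nodelist.node.X.nodeid; _name_from_csid() does the reverse.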

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 daemons/clvmd/clvmd-corosync.c |  147 +++++++++++++++++++++++++++++++++++----
 1 files changed, 132 insertions(+), 15 deletions(-)

diff --git a/daemons/clvmd/clvmd-corosync.c b/daemons/clvmd/clvmd-corosync.c
index d85ec1e..0af02a2 100644
--- a/daemons/clvmd/clvmd-corosync.c
+++ b/daemons/clvmd/clvmd-corosync.c
@@ -71,6 +71,9 @@ static struct local_client *cluster_client;
 /* Corosync handles */
 static cpg_handle_t cpg_handle;
 static quorum_handle_t quorum_handle;
+#if defined HAVE_COROSYNC_CMAP_H
+static cmap_handle_t cmap_handle = 0;
+#endif
 
 /* DLM Handle */
 static dlm_lshandle_t *lockspace;
@@ -208,7 +211,7 @@ static void corosync_cpg_deliver_callback (cpg_handle_t handle,
 	memcpy(&target_nodeid, msg, COROSYNC_CSID_LEN);
 
 	DEBUGLOG("%u got message from nodeid %d for %d. len %zd\n",
-		 our_nodeid, nodeid, target_nodeid, msg_len-4);
+		 our_nodeid, nodeid, target_nodeid, msg_len-COROSYNC_CSID_LEN);
 
 	if (nodeid != our_nodeid)
 		if (target_nodeid == our_nodeid || target_nodeid == 0)
@@ -297,6 +300,17 @@ static int _init_cluster(void)
 		return cs_to_errno(err);
 	}
 
+#if defined HAVE_COROSYNC_CMAP_H
+	err = cmap_initialize(&cmap_handle);
+	if (err != CS_OK) {
+		syslog(LOG_ERR, "Cannot initialise Corosync CMAP service: %d",
+		       err);
+		DEBUGLOG("Cannot initialise Corosync CMAP service: %d", err);
+		cpg_finalize(cpg_handle);
+		return cs_to_errno(err);
+	}
+#endif
+
 #ifdef QUORUM_SET
 	err = quorum_initialize(&quorum_handle,
 				&quorum_callbacks,
@@ -305,6 +319,10 @@ static int _init_cluster(void)
 	if (quorum_type != QUORUM_SET) {
 		syslog(LOG_ERR, "Corosync quorum service is not configured");
 		DEBUGLOG("Corosync quorum service is not configured");
+#if defined HAVE_COROSYNC_CMAP_H
+		cmap_finalize(cmap_handle);
+#endif
+		cpg_finalize(cpg_handle);
 		return EINVAL;
 	}
 #else
@@ -315,6 +333,10 @@ static int _init_cluster(void)
 	if (err != CS_OK) {
 		syslog(LOG_ERR, "Cannot initialise Corosync quorum service: %d",
 		       err);
+#if defined HAVE_COROSYNC_CMAP_H
+		cmap_finalize(cmap_handle);
+#endif
+		cpg_finalize(cpg_handle);
 		DEBUGLOG("Cannot initialise Corosync quorum service: %d", err);
 		return cs_to_errno(err);
 	}
@@ -325,6 +347,10 @@ static int _init_cluster(void)
 		lockspace = dlm_create_lockspace(LOCKSPACE_NAME, 0600);
 		if (!lockspace) {
 			syslog(LOG_ERR, "Unable to create DLM lockspace for CLVM: %m");
+#if defined HAVE_COROSYNC_CMAP_H
+			cmap_finalize(cmap_handle);
+#endif
+			cpg_finalize(cpg_handle);
 			return -1;
 		}
 		DEBUGLOG("Created DLM lockspace for CLVMD.\n");
@@ -339,6 +365,9 @@ static int _init_cluster(void)
 	cpg_group_name.length = strlen((char *)cpg_group_name.value);
 	err = cpg_join(cpg_handle, &cpg_group_name);
 	if (err != CS_OK) {
+#if defined HAVE_COROSYNC_CMAP_H
+		cmap_finalize(cmap_handle);
+#endif
 		cpg_finalize(cpg_handle);
 		quorum_finalize(quorum_handle);
 		dlm_release_lockspace(LOCKSPACE_NAME, lockspace, 1);
@@ -350,6 +379,9 @@ static int _init_cluster(void)
 	err = cpg_local_get(cpg_handle,
 			    &our_nodeid);
 	if (err != CS_OK) {
+#if defined HAVE_COROSYNC_CMAP_H
+		cmap_finalize(cmap_handle);
+#endif
 		cpg_finalize(cpg_handle);
 		quorum_finalize(quorum_handle);
 		dlm_release_lockspace(LOCKSPACE_NAME, lockspace, 1);
@@ -371,6 +403,9 @@ static void _cluster_closedown(void)
 	dlm_release_lockspace(LOCKSPACE_NAME, lockspace, 1);
 	cpg_finalize(cpg_handle);
 	quorum_finalize(quorum_handle);
+#if defined HAVE_COROSYNC_CMAP_H
+	cmap_finalize(cmap_handle);
+#endif
 }
 
 static void _get_our_csid(char *csid)
@@ -379,17 +414,56 @@ static void _get_our_csid(char *csid)
 }
 
 /* Corosync doesn't really have nmode names so we
-   just use the node ID in hex instead */
+   just use the numeric node ID instead.
+   The only case when we have a name is when a nodelist is configured for
+   pacemaker as described in the pacemaker docs (nodelist.node.X.name is set) */
 static int _csid_from_name(char *csid, const char *name)
 {
-	int nodeid;
+	int nodeid = 0;
 	struct node_info *ninfo;
 
-	if (sscanf(name, "%x", &nodeid) == 1) {
-		ninfo = dm_hash_lookup_binary(node_hash, csid, COROSYNC_CSID_LEN);
-		if (ninfo)
+#if defined HAVE_COROSYNC_CMAP_H
+	int idx = 0;
+	char key[128];
+
+	while (1) {
+		cs_error_t rc;
+		char *n_name = NULL;
+
+		snprintf(key, 128, "nodelist.node.%d.name", idx);
+		rc = cmap_get_string(cmap_handle, key, &n_name);
+
+		if (rc != CS_OK) {
+			snprintf(key, 128, "nodelist.node.%d.ring0_ipaddr", idx);
+			rc = cmap_get_string(cmap_handle, key, &n_name);
+
+			if (rc != CS_OK) {
+				break;
+			}
+		}
+		if (strcmp(name, n_name) == 0) {
+			free(n_name);
+			snprintf(key, 128, "nodelist.node.%d.nodeid", idx);
+			rc = cmap_get_uint32(cmap_handle, key, (uint32_t *)&nodeid);
+			if (rc != CS_OK) {
+				break;
+			}
+			
+		} else {
+			free(n_name);
+		}
+		idx++;
+	}
+#endif
+
+	if (nodeid || sscanf(name, "%d", &nodeid) == 1) {
+		ninfo = dm_hash_lookup_binary(node_hash, (char *)&nodeid, COROSYNC_CSID_LEN);
+		if (ninfo) {
+			memcpy(csid, &nodeid, COROSYNC_CSID_LEN);
 			return nodeid;
+		}
 	}
+	*csid = 0;
 	return -1;
 }
 
@@ -397,14 +471,64 @@ static int _name_from_csid(const char *csid, char *name)
 {
 	struct node_info *ninfo;
 
+#if defined HAVE_COROSYNC_CMAP_H
+	int nodeid = 0;
+	int idx = 0;
+	char key[128];
+
+	memcpy(&nodeid, csid, COROSYNC_CSID_LEN);
+
+	*name = '\0';
+
+	while (1) {
+		cs_error_t rc;
+		char *n_name = NULL;
+		uint32_t id;
+
+		snprintf(key, 128, "nodelist.node.%d.nodeid", idx);
+		rc = cmap_get_uint32(cmap_handle, key, &id);
+		if (rc != CS_OK) {
+			break;
+		}
+		if (id == nodeid) {
+			snprintf(key, 128, "nodelist.node.%d.name", idx);
+			rc = cmap_get_string(cmap_handle, key, &n_name);
+			if (rc != CS_OK || !n_name) {
+				int octet;
+				snprintf(key, 128, "nodelist.node.%d.ring0_ipaddr", idx);
+				rc = cmap_get_string(cmap_handle, key, &n_name);
+
+				if (rc != CS_OK || !n_name) {
+					break;
+				}
+				if (sscanf(n_name, "%d.%d.%d.%d", &octet, &octet, &octet, &octet) == 4 || strstr(n_name, ":") != NULL) {
+					/* ring0_ipaddr contains IP address */
+					free(n_name);
+					n_name = NULL;
+					break;
+				}
+			}
+			snprintf(name, COROSYNC_MAX_CLUSTER_MEMBER_NAME_LEN, "%s", n_name);
+			free(n_name);
+			break;
+		}
+
+		idx++;
+	}
+#endif
+
 	ninfo = dm_hash_lookup_binary(node_hash, csid, COROSYNC_CSID_LEN);
 	if (!ninfo)
 	{
-		sprintf(name, "UNKNOWN %s", print_corosync_csid(csid));
+		if (! *name)
+			snprintf(name, COROSYNC_MAX_CLUSTER_MEMBER_NAME_LEN, "UNKNOWN %s", print_corosync_csid(csid));
+
 		return -1;
 	}
 
-	sprintf(name, "%x", ninfo->nodeid);
+	if (! *name)
+		snprintf(name, COROSYNC_MAX_CLUSTER_MEMBER_NAME_LEN, "%d", ninfo->nodeid);
+
 	return 0;
 }
 
@@ -627,18 +751,12 @@ out:
 
 static int _get_cluster_name(char *buf, int buflen)
 {
-	cmap_handle_t cmap_handle = 0;
 	int result;
 	char *name = NULL;
 
 	/* This is a default in case everything else fails */
 	strncpy(buf, "Corosync", buflen);
 
-	/* Look for a cluster name in cmap */
-	result = cmap_initialize(&cmap_handle);
-	if (result != CS_OK)
-		return 0;
-
 	result = cmap_get_string(cmap_handle, "totem.cluster_name", &name);
 	if (result != CS_OK)
 		goto out;
@@ -649,7 +767,6 @@ static int _get_cluster_name(char *buf, int buflen)
 out:
 	if (name)
 		free(name);
-	cmap_finalize(cmap_handle);
 	return 0;
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 04/10] clvmd: fix positive return value is not an error in csid->name translation
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (2 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 03/10] clvmd: Allow node names to be obtained from corosync's CMAP Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 05/10] clvmd: use correct flags for local command execution Vladislav Bogdanov
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

On success, csid_from_name() returns
* zero (cman) - should this be fixed?
* a positive value - the nodeid (openais, corosync)
but clvmd assumes that only zero is a valid result.

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 daemons/clvmd/clvmd.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
index 07a318e..4f2f458 100644
--- a/daemons/clvmd/clvmd.c
+++ b/daemons/clvmd/clvmd.c
@@ -1284,7 +1284,7 @@ static int read_from_local_sock(struct local_client *thisfd)
 		}
 
 		/* Check the node name for validity */
-		if (inheader->node[0] && clops->csid_from_name(csid, inheader->node)) {
+		if (inheader->node[0] && clops->csid_from_name(csid, inheader->node) <= 0) {
 			/* Error, node is not in the cluster */
 			struct clvm_header reply = {
 				.cmd = CLVMD_CMD_REPLY,
@@ -1427,7 +1427,7 @@ static int distribute_command(struct local_client *thisfd)
                         /* Do it on a single node */
 			char csid[MAX_CSID_LEN];
 
-			if (clops->csid_from_name(csid, inheader->node)) {
+			if (clops->csid_from_name(csid, inheader->node) <= 0) {
 				/* This has already been checked so should not happen */
 				return 0;
 			} else {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 05/10] clvmd: use correct flags for local command execution
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (3 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 04/10] clvmd: fix positive return value is not an error in csid->name translation Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 06/10] clvmd: additional debugging - print message bodies Vladislav Bogdanov
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 daemons/clvmd/clvmd.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
index 4f2f458..fa09464 100644
--- a/daemons/clvmd/clvmd.c
+++ b/daemons/clvmd/clvmd.c
@@ -1417,7 +1417,9 @@ static int distribute_command(struct local_client *thisfd)
 			 * is set so we still get a reply if this is the
 			 * only node.
 			 */
+			inheader->flags |= CLVMD_FLAG_LOCAL;
 			add_to_lvmqueue(thisfd, inheader, len, NULL);
+			inheader->flags &= ~CLVMD_FLAG_LOCAL;
 
 			DEBUGLOG("Sending message to all cluster nodes\n");
 			inheader->xid = thisfd->xid;
@@ -1439,6 +1441,8 @@ static int distribute_command(struct local_client *thisfd)
 				/* Are we the requested node ?? */
 				if (memcmp(csid, our_csid, max_csid_len) == 0) {
 					DEBUGLOG("Doing command on local node only\n");
+					inheader->flags &= ~CLVMD_FLAG_REMOTE;
+					inheader->flags |= CLVMD_FLAG_LOCAL;
 					add_to_lvmqueue(thisfd, inheader, len, NULL);
 				} else {
 					DEBUGLOG("Sending message to single node: %s\n",
@@ -1742,8 +1746,8 @@ static int process_local_command(struct clvm_header *msg, int msglen,
 	if (replybuf == NULL)
 		return -1;
 
-	/* If remote flag is set, just set a successful status code. */
-	if (msg->flags & CLVMD_FLAG_REMOTE)
+	/* If local flag is not set, just set a successful status code. */
+	if (! (msg->flags & CLVMD_FLAG_LOCAL)) {
 		status = 0;
 	else
 		/* FIXME: usage of init_test() is unprotected */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 06/10] clvmd: additional debugging - print message bodies
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (4 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 05/10] clvmd: use correct flags for local command execution Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 07/10] locking: Allow lock management (activation, deactivation, conversion) on a remote nodes Vladislav Bogdanov
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 daemons/clvmd/clvmd.c |   52 ++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/daemons/clvmd/clvmd.c b/daemons/clvmd/clvmd.c
index fa09464..c4d6e03 100644
--- a/daemons/clvmd/clvmd.c
+++ b/daemons/clvmd/clvmd.c
@@ -1403,6 +1403,8 @@ static int distribute_command(struct local_client *thisfd)
 
 	/* Forward it to other nodes in the cluster if needed */
 	if (!(inheader->flags & CLVMD_FLAG_LOCAL)) {
+		char *args_printable = strdup_printable(inheader->args + strlen(inheader->node), inheader->arglen);
+
 		/* if node is empty then do it on the whole cluster */
 		if (inheader->node[0] == '\0') {
 			thisfd->bits.localsock.expected_replies =
@@ -1423,6 +1425,10 @@ static int distribute_command(struct local_client *thisfd)
 
 			DEBUGLOG("Sending message to all cluster nodes\n");
 			inheader->xid = thisfd->xid;
+			DEBUGLOG("Sending message of size %u: xid=%u, cmd=%s, status=%u, flags=%u, clientid=%u, arglen=%u, node='%s', args='%s'\n",
+				     len,
+				     inheader->xid, decode_cmd(inheader->cmd), inheader->status, inheader->flags, inheader->clientid, inheader->arglen,
+				     inheader->node, inheader->arglen > 0 ? args_printable : "");
 			send_message(inheader, len, NULL, -1,
 				     "Error forwarding message to cluster");
 		} else {
@@ -1440,25 +1446,35 @@ static int distribute_command(struct local_client *thisfd)
 
 				/* Are we the requested node ?? */
 				if (memcmp(csid, our_csid, max_csid_len) == 0) {
-					DEBUGLOG("Doing command on local node only\n");
+					DEBUGLOG("Doing command on local node only (by destination)\n");
 					inheader->flags &= ~CLVMD_FLAG_REMOTE;
 					inheader->flags |= CLVMD_FLAG_LOCAL;
 					add_to_lvmqueue(thisfd, inheader, len, NULL);
 				} else {
-					DEBUGLOG("Sending message to single node: %s\n",
-						 inheader->node);
+					unsigned int a, b;
+					memcpy(&a, csid, max_csid_len);
+					memcpy(&b, our_csid, max_csid_len);
+					
+					DEBUGLOG("Sending message to single node: %s, csid=0x%x, our_csid=0x%x\n",
+						 inheader->node, a, b);
 					inheader->xid = thisfd->xid;
+					DEBUGLOG("Sending message of size %u: xid=%u, cmd=%s, status=%u, flags=%u, clientid=%u, arglen=%u, node='%s', args='%s'\n",
+						     len,
+						     inheader->xid, decode_cmd(inheader->cmd), inheader->status, inheader->flags, inheader->clientid, inheader->arglen,
+						     inheader->node, inheader->arglen > 0 ? args_printable : "");
 					send_message(inheader, len,
 						     csid, -1,
 						     "Error forwarding message to cluster node");
 				}
 			}
 		}
+		free(args_printable);
 	} else {
 		/* Local explicitly requested, ignore nodes */
 		thisfd->bits.localsock.in_progress = TRUE;
 		thisfd->bits.localsock.expected_replies = 1;
 		thisfd->bits.localsock.num_replies = 0;
+		DEBUGLOG("Doing command on local node only (by flag)\n");
 		add_to_lvmqueue(thisfd, inheader, len, NULL);
 	}
 	return 0;
@@ -1532,8 +1548,16 @@ static void process_remote_command(struct clvm_header *msg, int msglen, int fd,
 	if (replyargs != NULL) {
 		/* Run the command */
 		/* FIXME: usage of init_test() is unprotected */
+		char *args_printable = strdup_printable(msg->args, msg->arglen);
+
+		DEBUGLOG("Processing request of size %u: xid=%u, cmd=%s, status=%u, flags=%u, clientid=%u, arglen=%u, node='%s', args='%s'\n",
+			     msglen,
+			     msg->xid, decode_cmd(msg->cmd), msg->status, msg->flags, msg->clientid, msg->arglen,
+			     msg->node, msg->arglen > 0 ? args_printable : "");
+
 		status = do_command(NULL, msg, msglen, &replyargs,
 				    buflen, &replylen);
+		free(args_printable);
 	} else {
 		status = ENOMEM;
 	}
@@ -1547,6 +1571,7 @@ static void process_remote_command(struct clvm_header *msg, int msglen, int fd,
 		if (aggreply) {
 			struct clvm_header *agghead =
 			    (struct clvm_header *) aggreply;
+			char *args_printable = strdup_printable(replyargs, replylen);
 
 			replyargs = aggreply;
 			/* Move it up so there's room for a header in front of the data */
@@ -1560,6 +1585,11 @@ static void process_remote_command(struct clvm_header *msg, int msglen, int fd,
 			agghead->clientid = msg->clientid;
 			agghead->arglen = replylen;
 			agghead->node[0] = '\0';
+			DEBUGLOG("Sending reply of size %u: xid=%u, cmd=%s, status=%u, flags=%u, clientid=%u, arglen=%u, node='%s', args='%s'\n",
+				     (unsigned int) sizeof(struct clvm_header) + replylen,
+				     agghead->xid, decode_cmd(agghead->cmd), agghead->status, agghead->flags, agghead->clientid, agghead->arglen,
+				     agghead->node, agghead->arglen > 0 ? args_printable : "");
+			free(args_printable);
 			send_message(aggreply,
 				     sizeof(struct clvm_header) +
 				     replylen, csid, fd,
@@ -1748,16 +1778,28 @@ static int process_local_command(struct clvm_header *msg, int msglen,
 
 	/* If local flag is not set, just set a successful status code. */
 	if (! (msg->flags & CLVMD_FLAG_LOCAL)) {
+		DEBUGLOG("Setting success for remote msg in process_local_command\n");
 		status = 0;
-	else
+	}
+	else {
+		char *args_printable = strdup_printable(msg->args, msg->arglen);
+		DEBUGLOG("Processing request of size %u: xid=%u, cmd=%s, status=%u, flags=%u, clientid=%u, arglen=%u, node='%s', args='%s'\n",
+			     msglen,
+			     msg->xid, decode_cmd(msg->cmd), msg->status, msg->flags, msg->clientid, msg->arglen,
+			     msg->node, msg->arglen > 0 ? args_printable : "");
+
 		/* FIXME: usage of init_test() is unprotected */
 		status = do_command(client, msg, msglen, &replybuf, buflen, &replylen);
-
+		free(args_printable);
+	}
 	if (status)
 		client->bits.localsock.all_success = 0;
 
 	/* If we took too long then discard the reply */
 	if (xid == client->xid) {
+		char *args_printable = strdup_printable(replybuf, replylen);
+		DEBUGLOG("Adding local reply of size %u: '%s'\n", replylen, replylen > 0 ? args_printable : "");
+		free(args_printable);
 		add_reply_to_list(client, status, our_csid, replybuf, replylen);
 	} else {
 		DEBUGLOG
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 07/10] locking: Allow lock management (activation, deactivation, conversion) on a remote nodes
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (5 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 06/10] clvmd: additional debugging - print message bodies Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 08/10] lvchange: implement remote lock management Vladislav Bogdanov
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

* add node argument to _query_resource()
* add node argument to remote_lock_held()
* add node argument to [_]lv_is_active[*] where needed
* rework _lv_is_active() logic a bit
* add activation calls for a "remote node" mode

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 lib/activate/activate.c        |   55 ++++++++++++++++++++------------
 lib/activate/activate.h        |    3 +-
 lib/locking/cluster_locking.c  |   20 ++++++++---
 lib/locking/external_locking.c |    2 +-
 lib/locking/locking.c          |   68 ++++++++++++++++++++++++----------------
 lib/locking/locking.h          |   10 +++++-
 lib/locking/locking_types.h    |    2 +-
 7 files changed, 101 insertions(+), 59 deletions(-)

diff --git a/lib/activate/activate.c b/lib/activate/activate.c
index 185ba5f..1c8fd88 100644
--- a/lib/activate/activate.c
+++ b/lib/activate/activate.c
@@ -270,7 +270,7 @@ int lv_is_active_exclusive_locally(const struct logical_volume *lv)
 {
 	return 0;
 }
-int lv_is_active_exclusive_remotely(const struct logical_volume *lv)
+int lv_is_active_exclusive_remotely(const struct logical_volume *lv, const char *node)
 {
 	return 0;
 }
@@ -1014,27 +1014,33 @@ int lvs_in_vg_opened(const struct volume_group *vg)
  * Returns: 0 or 1
  */
 static int _lv_is_active(const struct logical_volume *lv,
-			 int *locally, int *exclusive)
+			 int *locally, int *exclusive, const char *node)
 {
 	int r, l, e; /* remote, local, and exclusive */
 
 	r = l = e = 0;
 
-	if (_lv_active(lv->vg->cmd, lv))
-		l = 1;
+	if (!node) {
+		if (_lv_active(lv->vg->cmd, lv))
+			l = 1;
 
-	if (!vg_is_clustered(lv->vg)) {
-		if (l)
-			e = 1;  /* exclusive by definition */
-		goto out;
-	}
+		if (!vg_is_clustered(lv->vg)) {
+			if (l)
+				e = 1;  /* exclusive by definition */
+			goto out;
+		}
 
-	/* Active locally, and the caller doesn't care about exclusive */
-	if (l && !exclusive)
-		goto out;
+		/* Active locally, and the caller doesn't care about exclusive */
+		if (l && !exclusive)
+			goto out;
+	}
 
-	if ((r = remote_lock_held(lv->lvid.s, &e)) >= 0)
+	if ((r = remote_lock_held(lv->lvid.s, &e, node)) >= 0) {
+		if (node && r > 0) {
+			l = 1;
+		}
 		goto out;
+	}
 
 	/*
 	 * If lock query is not supported (due to interfacing with old
@@ -1060,45 +1066,52 @@ out:
 	if (exclusive)
 		*exclusive = e;
 
-	log_very_verbose("%s/%s is %sactive%s%s",
+	log_very_verbose("%s/%s is %sactive%s%s%s%s",
 			 lv->vg->name, lv->name,
 			 (r || l) ? "" : "not ",
 			 (exclusive && e) ? " exclusive" : "",
-			 e ? (l ? " locally" : " remotely") : "");
+			 e ? (l ? " locally" : " remotely") : "",
+			 node ? " on node " : "", node ? node : "" );
 
 	return r || l;
 }
 
 int lv_is_active(const struct logical_volume *lv)
 {
-	return _lv_is_active(lv, NULL, NULL);
+	return _lv_is_active(lv, NULL, NULL, NULL);
+}
+
+int lv_is_active_remotely(const struct logical_volume *lv, const char *node)
+{
+	int l;
+	return _lv_is_active(lv, &l, NULL, node);
 }
 
 int lv_is_active_but_not_locally(const struct logical_volume *lv)
 {
 	int l;
-	return _lv_is_active(lv, &l, NULL) && !l;
+	return _lv_is_active(lv, &l, NULL, NULL) && !l;
 }
 
 int lv_is_active_exclusive(const struct logical_volume *lv)
 {
 	int e;
 
-	return _lv_is_active(lv, NULL, &e) && e;
+	return _lv_is_active(lv, NULL, &e, NULL) && e;
 }
 
 int lv_is_active_exclusive_locally(const struct logical_volume *lv)
 {
 	int l, e;
 
-	return _lv_is_active(lv, &l, &e) && l && e;
+	return _lv_is_active(lv, &l, &e, NULL) && l && e;
 }
 
-int lv_is_active_exclusive_remotely(const struct logical_volume *lv)
+int lv_is_active_exclusive_remotely(const struct logical_volume *lv, const char *node)
 {
 	int l, e;
 
-	return _lv_is_active(lv, &l, &e) && !l && e;
+	return _lv_is_active(lv, &l, &e, node) && e;
 }
 
 #ifdef DMEVENTD
diff --git a/lib/activate/activate.h b/lib/activate/activate.h
index ba24d2a..3a1931a 100644
--- a/lib/activate/activate.h
+++ b/lib/activate/activate.h
@@ -130,10 +130,11 @@ int lvs_in_vg_activated(const struct volume_group *vg);
 int lvs_in_vg_opened(const struct volume_group *vg);
 
 int lv_is_active(const struct logical_volume *lv);
+int lv_is_active_remotely(const struct logical_volume *lv, const char *node);
 int lv_is_active_but_not_locally(const struct logical_volume *lv);
 int lv_is_active_exclusive(const struct logical_volume *lv);
 int lv_is_active_exclusive_locally(const struct logical_volume *lv);
-int lv_is_active_exclusive_remotely(const struct logical_volume *lv);
+int lv_is_active_exclusive_remotely(const struct logical_volume *lv, const char *node);
 
 int lv_has_target_type(struct dm_pool *mem, struct logical_volume *lv,
 		       const char *layer, const char *target_type);
diff --git a/lib/locking/cluster_locking.c b/lib/locking/cluster_locking.c
index c3bd202..f4f396e 100644
--- a/lib/locking/cluster_locking.c
+++ b/lib/locking/cluster_locking.c
@@ -190,8 +190,10 @@ static void _build_header(struct clvm_header *head, int clvmd_cmd, const char *n
 	} else if (!strcmp(node, NODE_REMOTE)) {
 		head->node[0] = '\0';
 		head->flags = CLVMD_FLAG_REMOTE;
-	} else
-		strcpy(head->node, node);
+	} else {
+		memcpy(head->node, node, strlen(node) + 1);
+		head->flags = CLVMD_FLAG_REMOTE;
+	}
 }
 
 /*
@@ -236,6 +238,7 @@ static int _cluster_request(char clvmd_cmd, const char *node, void *data, int le
 		inptr += strlen(inptr) + 1;
 	}
 
+	log_debug("Got %u responses", num_responses);
 	/*
 	 * Allocate response array.
 	 * With an extra pair of INTs on the front to sanity
@@ -315,6 +318,9 @@ static int _lock_for_cluster(struct cmd_context *cmd, unsigned char clvmd_cmd,
 
 	assert(name);
 
+	if (node == NULL) {
+		node = "";
+	}
 	len = strlen(name) + 3;
 	args = alloca(len);
 	strcpy(args + 2, name);
@@ -374,7 +380,7 @@ static int _lock_for_cluster(struct cmd_context *cmd, unsigned char clvmd_cmd,
 		     !(flags & LCK_CLUSTER_VG)))
 			node = NODE_LOCAL;
 		else if (flags & LCK_REMOTE)
-			node = NODE_REMOTE;
+			node = cmd->node ? cmd->node : NODE_REMOTE;
 	}
 
 	status = _cluster_request(clvmd_cmd, node, args, len,
@@ -527,16 +533,18 @@ static int decode_lock_type(const char *response)
 }
 
 #ifdef CLUSTER_LOCKING_INTERNAL
-static int _query_resource(const char *resource, int *mode)
+static int _query_resource(const char *resource, int *mode, const char *node)
 #else
-int query_resource(const char *resource, int *mode)
+int query_resource(const char *resource, int *mode, const char *node)
 #endif
 {
 	int i, status, len, num_responses, saved_errno;
-	const char *node = "";
 	char *args;
 	lvm_response_t *response = NULL;
 
+	if (node == NULL) {
+		node = "";
+	}
 	saved_errno = errno;
 	len = strlen(resource) + 3;
 	args = alloca(len);
diff --git a/lib/locking/external_locking.c b/lib/locking/external_locking.c
index 4dacbe4..4c038f8 100644
--- a/lib/locking/external_locking.c
+++ b/lib/locking/external_locking.c
@@ -28,7 +28,7 @@ static int (*_lock_fn) (struct cmd_context * cmd, const char *resource,
 			uint32_t flags) = NULL;
 static int (*_init_fn) (int type, struct dm_config_tree * cft,
 			uint32_t *flags) = NULL;
-static int (*_lock_query_fn) (const char *resource, int *mode) = NULL;
+static int (*_lock_query_fn) (const char *resource, int *mode, const char *node) = NULL;
 
 static int _lock_resource(struct cmd_context *cmd, const char *resource,
 			  uint32_t flags)
diff --git a/lib/locking/locking.c b/lib/locking/locking.c
index ff46046..712e293 100644
--- a/lib/locking/locking.c
+++ b/lib/locking/locking.c
@@ -542,12 +542,8 @@ int suspend_lvs(struct cmd_context *cmd, struct dm_list *lvs,
 
 int deactivate_lv(struct cmd_context *cmd, struct logical_volume *lv)
 {
-	if (vg_is_clustered(lv->vg)) {
-		if (lv_is_active_exclusive(lv) && ! lv_is_active_exclusive_locally(lv)) {
-			return_0;
-		}
-	}
 	lock_lv_vol(cmd, lv, LCK_LV_DEACTIVATE);
+	return 1;
 }
 
 /*
@@ -561,20 +557,29 @@ int activate_lv_excl(struct cmd_context *cmd, struct logical_volume *lv)
 	if (!vg_is_clustered(lv->vg))
 		return activate_lv_excl_local(cmd, lv);
 
-	if (lv_is_active_exclusive_locally(lv))
-		return 1;
+	if (!cmd->node) {
+		if (lv_is_active_exclusive_locally(lv))
+			return 1;
 
-	if (!activate_lv_excl_local(cmd, lv))
-		return_0;
+		if (!activate_lv_excl_local(cmd, lv))
+			return_0;
 
-	if (lv_is_active_exclusive(lv))
-		return 1;
 
-	/* FIXME Deal with error return codes. */
-	if (activate_lv_excl_remote(cmd, lv))
-		stack;
+		if (lv_is_active_exclusive(lv))
+			return 1;
+	} else {
+		if (lv_is_active_exclusive_remotely(lv, cmd->node))
+			return 1;
+
+		if (!activate_lv_excl_remote(cmd, lv))
+			return_0;
+
+
+		if (lv_is_active_exclusive(lv))
+			return 1;
+	}
 
-	return lv_is_active_exclusive(lv);
+	return_0;
 }
 
 int activate_lv_excl_force(struct cmd_context *cmd, struct logical_volume *lv) 
@@ -583,20 +588,29 @@ int activate_lv_excl_force(struct cmd_context *cmd, struct logical_volume *lv)
 	if (!vg_is_clustered(lv->vg))
 		return activate_lv_excl_local(cmd, lv);
 
-	if (lv_is_active_exclusive_locally(lv))
-		return 1;
+	if (!cmd->node) {
+		if (lv_is_active_exclusive_locally(lv))
+			return 1;
 
-	if (!activate_lv_excl_local_force(cmd, lv))
-		return_0;
+		if (!activate_lv_excl_local_force(cmd, lv))
+			return_0;
 
-	if (lv_is_active_exclusive(lv))
-		return 1;
 
-	/* FIXME Deal with error return codes. */
-	if (activate_lv_excl_remote_force(cmd, lv))
-		stack;
+		if (lv_is_active_exclusive(lv))
+			return 1;
+	} else {
+		if (lv_is_active_exclusive_remotely(lv, cmd->node))
+			return 1;
+
+		if (!activate_lv_excl_remote_force(cmd, lv))
+			return_0;
+
+
+		if (lv_is_active_exclusive(lv))
+			return 1;
+	}
 
-	return lv_is_active_exclusive(lv);
+	return_0;
 }
 
 /* Lock a list of LVs */
@@ -635,7 +649,7 @@ int locking_is_clustered(void)
 	return (_locking.flags & LCK_CLUSTERED) ? 1 : 0;
 }
 
-int remote_lock_held(const char *vol, int *exclusive)
+int remote_lock_held(const char *vol, int *exclusive, const char *node)
 {
 	int mode = LCK_NULL;
 
@@ -648,7 +662,7 @@ int remote_lock_held(const char *vol, int *exclusive)
 	/*
 	 * If an error occured, expect that volume is active
 	 */
-	if (!_locking.query_resource(vol, &mode)) {
+	if (!_locking.query_resource(vol, &mode, node)) {
 		stack;
 		return 1;
 	}
diff --git a/lib/locking/locking.h b/lib/locking/locking.h
index b441a6c..ce20ff5 100644
--- a/lib/locking/locking.h
+++ b/lib/locking/locking.h
@@ -25,7 +25,7 @@ void reset_locking(void);
 int vg_write_lock_held(void);
 int locking_is_clustered(void);
 
-int remote_lock_held(const char *vol, int *exclusive);
+int remote_lock_held(const char *vol, int *exclusive, const char *node);
 
 /*
  * LCK_VG:
@@ -186,7 +186,7 @@ int check_lvm1_vg_inactive(struct cmd_context *cmd, const char *vgname);
 #define activate_lv_excl_local_force(cmd, lv)	\
 				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_LOCAL | (lv_is_active(lv) && ! lv_is_active_exclusive(lv) ? LCK_TRY_CONVERT : 0))
 #define activate_lv_excl_remote_force(cmd, lv)	\
-				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_REMOTE | (lv_is_active(lv) && ! lv_is_active_exclusive(lv) ? LCK_TRY_CONVERT : 0))
+				lock_lv_vol(cmd, lv, LCK_LV_EXCLUSIVE | LCK_HOLD | LCK_REMOTE | (lv_is_active_remotely(lv, cmd->node) && ! lv_is_active_exclusive(lv) ? LCK_TRY_CONVERT : 0))
 
 struct logical_volume;
 int activate_lv_excl(struct cmd_context *cmd, struct logical_volume *lv);
@@ -197,8 +197,14 @@ int deactivate_lv(struct cmd_context *cmd, struct logical_volume *lv);
 	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD | LCK_LOCAL)
 #define activate_lv_local_force(cmd, lv)	\
 	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD | LCK_LOCAL | (lv_is_active_exclusive_locally(lv) ? LCK_TRY_CONVERT : 0))
+#define activate_lv_remote(cmd, lv)	\
+	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD | LCK_REMOTE)
+#define activate_lv_remote_force(cmd, lv)	\
+	lock_lv_vol(cmd, lv, LCK_LV_ACTIVATE | LCK_HOLD | LCK_REMOTE | (lv_is_active_exclusive_remotely(lv, cmd->node) ? LCK_TRY_CONVERT : 0))
 #define deactivate_lv_local(cmd, lv)	\
 	lock_lv_vol(cmd, lv, LCK_LV_DEACTIVATE | LCK_LOCAL)
+#define deactivate_lv_remote(cmd, lv)	\
+	lock_lv_vol(cmd, lv, LCK_LV_DEACTIVATE | LCK_REMOTE)
 #define drop_cached_metadata(vg)	\
 	lock_vol((vg)->cmd, (vg)->name, LCK_VG_DROP_CACHE)
 #define remote_commit_cached_metadata(vg)	\
diff --git a/lib/locking/locking_types.h b/lib/locking/locking_types.h
index 53c7016..787a922 100644
--- a/lib/locking/locking_types.h
+++ b/lib/locking/locking_types.h
@@ -18,7 +18,7 @@
 
 typedef int (*lock_resource_fn) (struct cmd_context * cmd, const char *resource,
 				 uint32_t flags);
-typedef int (*query_resource_fn) (const char *resource, int *mode);
+typedef int (*query_resource_fn) (const char *resource, int *mode, const char *node);
 
 typedef void (*fin_lock_fn) (void);
 typedef void (*reset_lock_fn) (void);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 08/10] lvchange: implement remote lock management
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (6 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 07/10] locking: Allow lock management (activation, deactivation, conversion) on a remote nodes Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 09/10] man: document --force option to lvchange, provide examples Vladislav Bogdanov
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 lib/commands/toolcontext.h |    1 +
 tools/args.h               |    1 +
 tools/commands.h           |    3 +-
 tools/lvchange.c           |   98 ++++++++++++++++++++++++++++++++++++--------
 tools/pvmove.c             |    2 +-
 tools/vgchange.c           |    2 +-
 6 files changed, 87 insertions(+), 20 deletions(-)

diff --git a/lib/commands/toolcontext.h b/lib/commands/toolcontext.h
index 6e5803f..136ef53 100644
--- a/lib/commands/toolcontext.h
+++ b/lib/commands/toolcontext.h
@@ -112,6 +112,7 @@ struct cmd_context {
 	char dev_dir[PATH_MAX];
 	char proc_dir[PATH_MAX];
 	char sysfs_dir[PATH_MAX]; /* FIXME Use global value instead. */
+	const char *node;
 };
 
 /*
diff --git a/tools/args.h b/tools/args.h
index 0d9605a..7c75cff 100644
--- a/tools/args.h
+++ b/tools/args.h
@@ -76,6 +76,7 @@ arg(discards_ARG, '\0', "discards", discards_arg, 0)
 arg(stripes_long_ARG, '\0', "stripes", int_arg, 0)
 arg(sysinit_ARG, '\0', "sysinit", NULL, 0)
 arg(thinpool_ARG, '\0', "thinpool", string_arg, 0)
+arg(node_ARG, '\0', "node", string_arg, 0)
 
 /* Allow some variations */
 arg(resizable_ARG, '\0', "resizable", yes_no_arg, 0)
diff --git a/tools/commands.h b/tools/commands.h
index 6415d34..555ff87 100644
--- a/tools/commands.h
+++ b/tools/commands.h
@@ -87,6 +87,7 @@ xx(lvchange,
    "\t[-y|--yes]\n"
    "\t[--version]\n"
    "\t[-Z|--zero {y|n}]\n"
+   "\t[--node Node ]\n"
    "\tLogicalVolume[Path] [LogicalVolume[Path]...]\n",
 
    alloc_ARG, autobackup_ARG, activate_ARG, available_ARG, contiguous_ARG,
@@ -94,7 +95,7 @@ xx(lvchange,
    major_ARG, minor_ARG, monitor_ARG, noudevsync_ARG, partial_ARG,
    permission_ARG, persistent_ARG, poll_ARG, readahead_ARG, resync_ARG,
    refresh_ARG, addtag_ARG, deltag_ARG, sysinit_ARG, test_ARG, yes_ARG,
-   zero_ARG)
+   zero_ARG, node_ARG)
 
 xx(lvconvert,
    "Change logical volume layout",
diff --git a/tools/lvchange.c b/tools/lvchange.c
index 5740bea..e136eab 100644
--- a/tools/lvchange.c
+++ b/tools/lvchange.c
@@ -226,11 +226,30 @@ static int _lvchange_activate(struct cmd_context *cmd, struct logical_volume *lv
 	}
 
 	if (activate == CHANGE_ALN) {
-		log_verbose("Deactivating logical volume \"%s\" locally",
-			    lv->name);
-		if (!deactivate_lv_local(cmd, lv))
-			return_0;
+		if (cmd->node) {
+			log_verbose("Deactivating logical volume \"%s\" locally on node %s",
+				    lv->name, cmd->node);
+			if (!deactivate_lv_remote(cmd, lv))
+				return_0;
+		} else {
+			log_verbose("Deactivating logical volume \"%s\" locally",
+				    lv->name);
+			if (!deactivate_lv_local(cmd, lv))
+				return_0;
+		}
 	} else if (activate == CHANGE_AN) {
+		if (cmd->node) {
+			log_error("Use -aln to deactivate volume on a remote node");
+			return_0;
+		}
+		if (vg_is_clustered(lv->vg)) {
+			if (!arg_count(cmd, force_ARG)) {
+				if (lv_is_active_exclusive(lv) && ! lv_is_active_exclusive_locally(lv)) {
+					log_error("Failed to deactivate %s: exclusively activated on a remote node. Use -aln --node Node.", lv->name);
+					return_0;
+				}
+			}
+		}
 		log_verbose("Deactivating logical volume \"%s\"", lv->name);
 		if (!deactivate_lv(cmd, lv))
 			return_0;
@@ -239,29 +258,55 @@ static int _lvchange_activate(struct cmd_context *cmd, struct logical_volume *lv
 		    lv_is_origin(lv) ||
 		    lv_is_thin_type(lv)) {
 			if (arg_count(cmd, force_ARG)) {
-				log_verbose("Activating logical volume \"%s\" "
-					    "exclusively (forced)", lv->name);
+				if (cmd->node)
+					log_verbose("Activating logical volume \"%s\" "
+						    "exclusively on node %s (forced)", lv->name, cmd->node);
+				else
+					log_verbose("Activating logical volume \"%s\" "
+						    "exclusively (forced)", lv->name);
 				if (!activate_lv_excl_force(cmd, lv))
 					return_0;
 			} else {
-				log_verbose("Activating logical volume \"%s\" "
-					    "exclusively", lv->name);
+				if (cmd->node)
+					log_verbose("Activating logical volume \"%s\" "
+						    "exclusively on node %s", lv->name, cmd->node);
+				else
+					log_verbose("Activating logical volume \"%s\" "
+						    "exclusively", lv->name);
 				if (!activate_lv_excl(cmd, lv))
 					return_0;
 			}
 		} else if (activate == CHANGE_ALY) {
-			if (arg_count(cmd, force_ARG)) {
-				log_verbose("Activating logical volume \"%s\" locally (forced)",
-					    lv->name);
-				if (!activate_lv_local_force(cmd, lv))
-					return_0;
+			if (cmd->node) {
+				if (arg_count(cmd, force_ARG)) {
+					log_verbose("Activating logical volume \"%s\" locally on node %s (forced)",
+						    lv->name, cmd->node);
+					if (!activate_lv_remote_force(cmd, lv))
+						return_0;
+				} else {
+					log_verbose("Activating logical volume \"%s\" locally on node %s",
+						    lv->name, cmd->node);
+					if (!activate_lv_remote(cmd, lv))
+						return_0;
+				}
 			} else {
-				log_verbose("Activating logical volume \"%s\" locally",
-					    lv->name);
-				if (!activate_lv_local(cmd, lv))
-					return_0;
+				if (arg_count(cmd, force_ARG)) {
+					log_verbose("Activating logical volume \"%s\" locally (forced)",
+						    lv->name);
+					if (!activate_lv_local_force(cmd, lv))
+						return_0;
+				} else {
+					log_verbose("Activating logical volume \"%s\" locally",
+						    lv->name);
+					if (!activate_lv_local(cmd, lv))
+						return_0;
+				}
 			}
 		} else {
+			if (cmd->node) {
+				log_error("Use -aly or -aey to activate volume on a remote node");
+				return_0;
+			}
 			log_verbose("Activating logical volume \"%s\"",
 				    lv->name);
 			if (!activate_lv(cmd, lv))
@@ -808,6 +853,25 @@ static int lvchange_single(struct cmd_context *cmd, struct logical_volume *lv,
 		return ECMD_FAILED;
 	}
 
+	switch (arg_count(cmd, node_ARG)) {
+		case 1:
+			if (!arg_count(cmd, activate_ARG)) {
+				log_error("--node argument may be used only with -a");
+				return ECMD_FAILED;
+			}
+			if (!vg_is_clustered(lv->vg)) {
+				log_error("--node argument may be used only with clustered volume groups");
+				return ECMD_FAILED;
+			}
+			cmd->node = arg_str_value(cmd, node_ARG, NULL);
+			break;
+		case 0:
+			break;
+		default:
+			log_error("Only one --node argument is supported");
+			return ECMD_FAILED;
+			break;
+	}
 	/*
 	 * FIXME: DEFAULT_BACKGROUND_POLLING should be "unspecified".
 	 * If --poll is explicitly provided use it; otherwise polling
diff --git a/tools/pvmove.c b/tools/pvmove.c
index 9649f11..65718b1 100644
--- a/tools/pvmove.c
+++ b/tools/pvmove.c
@@ -240,7 +240,7 @@ static struct logical_volume *_set_up_pvmove_lv(struct cmd_context *cmd,
 		}
 
 		if (vg_is_clustered(vg) &&
-		    lv_is_active_exclusive_remotely(lv)) {
+		    lv_is_active_exclusive_remotely(lv, NULL)) {
 			lv_skipped = 1;
 			log_print_unless_silent("Skipping LV %s which is activated "
 						"exclusively on remote node.", lv->name);
diff --git a/tools/vgchange.c b/tools/vgchange.c
index a897a85..2d2b37f 100644
--- a/tools/vgchange.c
+++ b/tools/vgchange.c
@@ -125,7 +125,7 @@ static int _activate_lvs_in_vg(struct cmd_context *cmd, struct volume_group *vg,
 		 * If the LV is active exclusive remotely,
 		 * then ignore it here
 		 */
-		if (lv_is_active_exclusive_remotely(lv)) {
+		if (lv_is_active_exclusive_remotely(lv, NULL)) {
 			log_verbose("%s/%s is exclusively active on"
 				    " a remote node", vg->name, lv->name);
 			continue;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 09/10] man: document --force option to lvchange, provide examples
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (7 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 08/10] lvchange: implement remote lock management Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 13:32 ` [linux-lvm] [PATCH 10/10] man: document --node option to lvchange Vladislav Bogdanov
  2013-03-19 16:42 ` [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Alasdair G Kergon
  10 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 man/lvchange.8.in |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/man/lvchange.8.in b/man/lvchange.8.in
index 32a0580..c26a65c 100644
--- a/man/lvchange.8.in
+++ b/man/lvchange.8.in
@@ -42,6 +42,7 @@ lvchange \- change attributes of a logical volume
 .RB [ \-v | \-\-verbose ]
 .RB [ \-Z | \-\-zero
 .RI { y | n }]
+.RB [ \-f | \-\-force ]
 .I LogicalVolumePath
 .RI [ LogicalVolumePath ...]
 .SH DESCRIPTION
@@ -159,10 +160,26 @@ manually without a clustered lock manager.
 Set zeroing mode for thin pool. Note: already provisioned blocks from pool
 in non-zero mode are not cleared in unwritten parts when setting zero to
 \fIy\fP.
+.TP
+.B \-\-force
+Allows lock conversion (shared to exclusive and vice versa) on a
+logical volume in a clustered volume group (-aly and -aey commands).
+Also allows deactivating a logical volume locked exclusively on a remote node
+(but not open) when using the -an command.
 .SH Examples
 Changes the permission on volume lvol1 in volume group vg00 to be read-only:
 .sp
 .B lvchange -pr vg00/lvol1
+.sp
+Convert the lock on a shared-locked volume lvol1 in clustered volume group vg00 to
+exclusive (assuming no other node in the cluster holds a shared lock):
+.sp
+.B lvchange -aey --force vg00/lvol1
+.sp
+Convert the lock on an exclusively-locked volume lvol1 in volume group vg00 to shared
+(e.g. in order to migrate a virtual machine which uses that LV as a backing disk device):
+.sp
+.B lvchange -aly --force vg00/lvol1
 .SH SEE ALSO
 .BR lvm (8),
 .BR lvcreate (8),
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (8 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 09/10] man: document --force option to lvchange, provide examples Vladislav Bogdanov
@ 2013-03-19 13:32 ` Vladislav Bogdanov
  2013-03-19 15:32   ` David Teigland
  2013-03-19 16:42 ` [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Alasdair G Kergon
  10 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 13:32 UTC (permalink / raw)
  To: linux-lvm

Signed-off-by: Vladislav Bogdanov <bubble@hoster-ok.com>
---
 man/lvchange.8.in |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/man/lvchange.8.in b/man/lvchange.8.in
index c26a65c..e7ecf07 100644
--- a/man/lvchange.8.in
+++ b/man/lvchange.8.in
@@ -43,6 +43,9 @@ lvchange \- change attributes of a logical volume
 .RB [ \-Z | \-\-zero
 .RI { y | n }]
 .RB [ \-f | \-\-force ]
+.RB [ \-\-node
+.IR NodeID
+]
 .I LogicalVolumePath
 .RI [ LogicalVolumePath ...]
 .SH DESCRIPTION
@@ -166,6 +169,15 @@ Allows lock conversion (shared to exclusive and vice versa) on a
 logical volume in a clustered volume group (-aly and -aey commands).
 Also allows deactivating a logical volume locked exclusively on a remote node
 (but not open) when using the -an command.
+.TP
+.B \-\-node \fINodeID
+Perform the specified activation command on a remote cluster node (the same as running
+the corresponding command on that node locally).
+Currently tested only with corosync clusters (the \fB-I\fP \fIcorosync\fP option to clvmd)
+and corosync versions from 2.0 onwards.
+If the corosync configuration has node names in a nodelist (nodelist.node.X.name = name
+in CMAP items) or node names are used for ring0_addr (nodelist.node.X.ring0_addr = name),
+then those names may be used as NodeID. Otherwise numeric node IDs should be used.
 .SH Examples
 Changes the permission on volume lvol1 in volume group vg00 to be read-only:
 .sp
@@ -180,6 +192,20 @@ Convert lock on an exclusively-locked volume lvol1 in volume group vg00 to shared
 (for example, in order to migrate a virtual machine which uses that LV as a backing disk device):
 .sp
 .B lvchange -aly --force vg00/lvol1
+.sp
+Activate volume lvol1 in clustered volume group vg00 locally (shared lock) on the
+cluster node named node1:
+.sp
+.B lvchange -aly --node node1 vg00/lvol1
+.sp
+Convert lock on a shared-locked volume lvol1 in clustered volume group vg00 to
+exclusive (assuming no other nodes in the cluster hold a shared lock) on node node1:
+.sp
+.B lvchange -aey --force --node node1 vg00/lvol1
+.sp
+The same without node names configured (assuming the node has ID 4):
+.sp
+.B lvchange -aey --force --node 4 vg00/lvol1
 .SH SEE ALSO
 .BR lvm (8),
 .BR lvcreate (8),
-- 
1.7.1
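
For reference, the nodelist entries mentioned above live in corosync's CMAP; on a
corosync 2.x cluster they can be listed with corosync-cmapctl. A minimal sketch
(node names and IDs are only examples, and the exact key/value formatting may differ):

  corosync-cmapctl | grep ^nodelist
  # nodelist.node.0.nodeid (u32) = 1
  # nodelist.node.0.ring0_addr (str) = node1
  # nodelist.node.1.nodeid (u32) = 4
  # nodelist.node.1.ring0_addr (str) = node4

With such a configuration, either the name (node1) or the numeric ID should be
accepted as the NodeID argument described above.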

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion
  2013-03-19 13:32 ` [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion Vladislav Bogdanov
@ 2013-03-19 15:23   ` David Teigland
  2013-03-19 15:33     ` Vladislav Bogdanov
  0 siblings, 1 reply; 36+ messages in thread
From: David Teigland @ 2013-03-19 15:23 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: linux-lvm

On Tue, Mar 19, 2013 at 01:32:41PM +0000, Vladislav Bogdanov wrote:
> Allow clvm locks to be converted shared <-> exclusive with corosync/dlm.
> 
> Without this it is impossible to alow both
> * VM migration (shared lock is required)
> * host-side snapshots of VM disks (exlusive lock is required)
> 
> Locks are only converted if --force parameter passed to lvchange.
> 
> Internally LKF_CONVERT flag is passed to dlm, so that is a real lock
> conversion.
> 
> Also deny release of an exclusive lock on a remote note without --force
> flag to lvchange -an.

I'm trying to figure out why the code doesn't already use CONVERT when
possible; I would have expected it to.  (It has to be combined with
NOQUEUE to avoid deadlocks.)  What happens now without this force/CONVERT
patch? a new lock created/requested?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 13:32 ` [linux-lvm] [PATCH 10/10] man: document --node option to lvchange Vladislav Bogdanov
@ 2013-03-19 15:32   ` David Teigland
  2013-03-19 15:42     ` Vladislav Bogdanov
  2013-03-21 18:23     ` Vladislav Bogdanov
  0 siblings, 2 replies; 36+ messages in thread
From: David Teigland @ 2013-03-19 15:32 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: linux-lvm

On Tue, Mar 19, 2013 at 01:32:50PM +0000, Vladislav Bogdanov wrote:
> +.B \-\-node \fINodeID
> +Perform specified activation command on a remote cluster node (the same to running
> +corresponding command on that node locally).
> +Currently tested only for corosync clusters (\fB-I\fP \fIcorosync\fP option to clvmd)
> +for corosync versions from 2.0.
> +If corosync configuration has node names in a nodelist (nodelist.node.X.name = name
> +in CMAP tems) or node names are used for ring0_addr (nodelist.node.X.ring0_addr = name),
> +then that names may be used as NodeID. Otherwise numeric node IDs should be used.

lvm tools should move away from doing remote command execution.
As you mentioned before, this is the equivalent of "ssh node lvchange".
ssh or some other tool outside lvm is the right way to run the remote
commands.

Also, lvm should not assume that it's using dlm/corosync, or that this
kind of remote option will be possible to support.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion
  2013-03-19 15:23   ` David Teigland
@ 2013-03-19 15:33     ` Vladislav Bogdanov
  2013-03-19 15:44       ` Vladislav Bogdanov
  0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 15:33 UTC (permalink / raw)
  To: David Teigland; +Cc: linux-lvm

19.03.2013 18:23, David Teigland wrote:
> On Tue, Mar 19, 2013 at 01:32:41PM +0000, Vladislav Bogdanov wrote:
>> Allow clvm locks to be converted shared <-> exclusive with corosync/dlm.
>>
>> Without this it is impossible to alow both
>> * VM migration (shared lock is required)
>> * host-side snapshots of VM disks (exlusive lock is required)
>>
>> Locks are only converted if --force parameter passed to lvchange.
>>
>> Internally LKF_CONVERT flag is passed to dlm, so that is a real lock
>> conversion.
>>
>> Also deny release of an exclusive lock on a remote note without --force
>> flag to lvchange -an.
> 
> I'm trying to figure out why the code doesn't already use CONVERT when
> possible; I would have expected it to. 

Like me - that's why this patch.

> (It has to be combined with
> NOQUEUE to avoid deadlocks.)

It is from what I see.

> What happens now without this force/CONVERT
> patch? a new lock created/requested?
> 

No, just returns error.

...
#lvchange.c:153     Activating logical volume "lustre03-right.vds-ok.com_disk0" exclusively
#activate/dev_manager.c:285         Getting device info for VG_VDS_OK_POOL_1-lustre03--right.vds--ok.com_disk0 [LVM-2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O]
#ioctl/libdm-iface.c:1687         dm info  LVM-2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O NF   [16384] (*1)
#locking/cluster_locking.c:563         Lock held for 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O, node 3804050a : CR
#locking/cluster_locking.c:563         Lock held for 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O, node 5c04050a : CR
#locking/cluster_locking.c:563         Lock held for 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O, node 3904050a : CR
#activate/activate.c:1050       VG_VDS_OK_POOL_1/lustre03-right.vds-ok.com_disk0 is active
#locking/cluster_locking.c:503       Locking LV 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O EX (LV|NONBLOCK|CLUSTER|LOCAL) (0xdd)
#locking/cluster_locking.c:391   Error locking on node 3904050a: Device or resource busy
...

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 15:32   ` David Teigland
@ 2013-03-19 15:42     ` Vladislav Bogdanov
  2013-03-19 15:54       ` David Teigland
  2013-03-21 18:23     ` Vladislav Bogdanov
  1 sibling, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 15:42 UTC (permalink / raw)
  To: David Teigland; +Cc: linux-lvm

19.03.2013 18:32, David Teigland wrote:
> On Tue, Mar 19, 2013 at 01:32:50PM +0000, Vladislav Bogdanov wrote:
>> +.B \-\-node \fINodeID
>> +Perform specified activation command on a remote cluster node (the same to running
>> +corresponding command on that node locally).
>> +Currently tested only for corosync clusters (\fB-I\fP \fIcorosync\fP option to clvmd)
>> +for corosync versions from 2.0.
>> +If corosync configuration has node names in a nodelist (nodelist.node.X.name = name
>> +in CMAP tems) or node names are used for ring0_addr (nodelist.node.X.ring0_addr = name),
>> +then that names may be used as NodeID. Otherwise numeric node IDs should be used.
> 
> lvm tools should move away from doing remote command execution.

For me that is just convenient - why not?

> As you mentioned before, this is the equivalent of "ssh node lvchange".
> ssh or some other tool outside lvm is the right way to run the remote
> commands.

I am thinking about porting this to the LVM API as well, and using it in libvirt.

> Also, lvm should not assume that it's using dlm/corosync, or that this
> kind of remote option will be possible to support.

That all was already "almost" supported by clvmd. I only pass the node ID
from the command line in an existing clvm header field (and fix some related
functions).

All cluster/dlm stuff is in the corosync clvmd module - so both clvmd and the
tools are cluster-stack-agnostic. Maybe the same is possible for both the
cman and openais modules - I do not know because I do not use them and
do not plan to. You know they both are almost dead, and only
corosync/dlm actually remains.

And... every cluster has nodes. A clustered LVM VG assumes you have a
cluster. Why not be cluster-aware in the tools then?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion
  2013-03-19 15:33     ` Vladislav Bogdanov
@ 2013-03-19 15:44       ` Vladislav Bogdanov
  2013-03-19 16:03         ` David Teigland
  0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 15:44 UTC (permalink / raw)
  To: linux-lvm

19.03.2013 18:33, Vladislav Bogdanov wrote:
> 19.03.2013 18:23, David Teigland wrote:
>> On Tue, Mar 19, 2013 at 01:32:41PM +0000, Vladislav Bogdanov wrote:
>>> Allow clvm locks to be converted shared <-> exclusive with corosync/dlm.
>>>
>>> Without this it is impossible to alow both
>>> * VM migration (shared lock is required)
>>> * host-side snapshots of VM disks (exlusive lock is required)
>>>
>>> Locks are only converted if --force parameter passed to lvchange.
>>>
>>> Internally LKF_CONVERT flag is passed to dlm, so that is a real lock
>>> conversion.
>>>
>>> Also deny release of an exclusive lock on a remote note without --force
>>> flag to lvchange -an.
>>
>> I'm trying to figure out why the code doesn't already use CONVERT when
>> possible; I would have expected it to. 
> 
> Like me - that's why this patch.
> 
>> (It has to be combined with
>> NOQUEUE to avoid deadlocks.)
> 
> It is from what I see.
> 
>> What happens now without this force/CONVERT
>> patch? a new lock created/requested?
>>
> 
> No, just returns error.

After a new lock creation is requested ;) It's late evening here, sorry
for not being precise...

> 
> ...
> #lvchange.c:153     Activating logical volume "lustre03-right.vds-ok.com_disk0" exclusively
> #activate/dev_manager.c:285         Getting device info for VG_VDS_OK_POOL_1-lustre03--right.vds--ok.com_disk0 [LVM-2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O]
> #ioctl/libdm-iface.c:1687         dm info  LVM-2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O NF   [16384] (*1)
> #locking/cluster_locking.c:563         Lock held for 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O, node 3804050a : CR
> #locking/cluster_locking.c:563         Lock held for 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O, node 5c04050a : CR
> #locking/cluster_locking.c:563         Lock held for 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O, node 3904050a : CR
> #activate/activate.c:1050       VG_VDS_OK_POOL_1/lustre03-right.vds-ok.com_disk0 is active
> #locking/cluster_locking.c:503       Locking LV 2nCt35tXpJcEJUErLzSM3nTjJZwr0DvX3FDOuTp381ZIEVQvUrGCqoSYs4lGhW2O EX (LV|NONBLOCK|CLUSTER|LOCAL) (0xdd)
> #locking/cluster_locking.c:391   Error locking on node 3904050a: Device or resource busy
> ...

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 15:42     ` Vladislav Bogdanov
@ 2013-03-19 15:54       ` David Teigland
  2013-03-19 16:52         ` Vladislav Bogdanov
  0 siblings, 1 reply; 36+ messages in thread
From: David Teigland @ 2013-03-19 15:54 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: linux-lvm

On Tue, Mar 19, 2013 at 06:42:02PM +0300, Vladislav Bogdanov wrote:
> And... Every cluster has nodes. Clustered LVM VG assumes you have
> cluster. Why not be cluster-aware in tools then?

The changes I'm making move away from remote command execution within lvm
itself because it's a poor design and complicates things tremendously for
a dubious feature.  Also, it would not fit well at all with sanlock which
I'm adding support for.  There are much better ways of doing remote
commands.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion
  2013-03-19 15:44       ` Vladislav Bogdanov
@ 2013-03-19 16:03         ` David Teigland
  2013-03-19 16:36           ` Vladislav Bogdanov
  0 siblings, 1 reply; 36+ messages in thread
From: David Teigland @ 2013-03-19 16:03 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: linux-lvm

On Tue, Mar 19, 2013 at 06:44:58PM +0300, Vladislav Bogdanov wrote:
> >> I'm trying to figure out why the code doesn't already use CONVERT when
> >> possible; I would have expected it to. 
> > 
> >> What happens now without this force/CONVERT
> >> patch? a new lock created/requested?
> >>

> After new lock creation is requested ;)

Yeah, it doesn't make much sense for clvmd-corosync.c to do that.
I'm hoping that we can make clvmd-corosync.c aware that it already
holds that lock and attempt to convert it.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 01/10] lvchange: Allow cluster lock conversion
  2013-03-19 16:03         ` David Teigland
@ 2013-03-19 16:36           ` Vladislav Bogdanov
  0 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 16:36 UTC (permalink / raw)
  To: David Teigland; +Cc: linux-lvm

19.03.2013 19:03, David Teigland wrote:
> On Tue, Mar 19, 2013 at 06:44:58PM +0300, Vladislav Bogdanov wrote:
>>>> I'm trying to figure out why the code doesn't already use CONVERT when
>>>> possible; I would have expected it to. 
>>>
>>>> What happens now without this force/CONVERT
>>>> patch? a new lock created/requested?
>>>>
> 
>> After new lock creation is requested ;)
> 
> Yeah, it doesn't make much sense for clvmd-corosync.c to do that.
> I'm hoping that we can make clvmd-corosync.c aware that it already
> holds that lock and attempt to convert it.

But that could be not always expected...

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation
  2013-03-19 13:32 [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Vladislav Bogdanov
                   ` (9 preceding siblings ...)
  2013-03-19 13:32 ` [linux-lvm] [PATCH 10/10] man: document --node option to lvchange Vladislav Bogdanov
@ 2013-03-19 16:42 ` Alasdair G Kergon
  2013-03-19 17:42   ` Vladislav Bogdanov
  10 siblings, 1 reply; 36+ messages in thread
From: Alasdair G Kergon @ 2013-03-19 16:42 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: linux-lvm

So what I need is:

Separate out the fixes and minor changes that have no effect on support and do not
introduce new constraints on future changes.  Get those out of the way.

Then with what's left, sort out the impact in terms of additional support,
testing, constraints etc.  Then work out what the options are.
E.g. we support multiple lock managers.  How do your changes affect
each of these, including sanlock?

New features should only be available where they are tested and supported:
if some combination doesn't make sense or won't be tested properly, don't allow
that combination.  In extreme cases, use configure settings as the method of
control (e.g. if a distro decided it didn't want --node that would need to be a
configure option).  Man pages need to be correct - if an option doesn't work
with some lock manager, it should say that (or not mention it if under
configure's control).

Alasdair

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 15:54       ` David Teigland
@ 2013-03-19 16:52         ` Vladislav Bogdanov
  2013-03-19 17:16           ` David Teigland
  0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 16:52 UTC (permalink / raw)
  To: David Teigland; +Cc: linux-lvm

19.03.2013 18:54, David Teigland wrote:
> On Tue, Mar 19, 2013 at 06:42:02PM +0300, Vladislav Bogdanov wrote:
>> And... Every cluster has nodes. Clustered LVM VG assumes you have
>> cluster. Why not be cluster-aware in tools then?
> 
> The changes I'm making move away from remote command execution within lvm
> itself because it's poor a design and complicates things tremendously for
> a dubious feature.  Also, it would not fit well at all with sanlock which
> I'm adding support for.  There are much better ways of doing remote
> commands.
> 

David, can you please share where you are going with this, because google shows
only a recent thread from this list when I ask it about lvmlockd.

And, do you have any estimate of how long it may take to have your ideas
ready for production use?

Also, as you're not satisfied with this implementation, what alternative
way do you see? (calling ssh from libvirt or the LVM API is not a good idea
at all, I think)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 16:52         ` Vladislav Bogdanov
@ 2013-03-19 17:16           ` David Teigland
  2013-03-19 17:36             ` Vladislav Bogdanov
  0 siblings, 1 reply; 36+ messages in thread
From: David Teigland @ 2013-03-19 17:16 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: linux-lvm

On Tue, Mar 19, 2013 at 07:52:14PM +0300, Vladislav Bogdanov wrote:
> And, do you have any estimations, how long may it take to have you ideas
> ready for production use?

It'll be quite a while (and the new locking scheme I'm working on will not
include remote command execution.)

> Also, as you're not satisfied with this implementation, what alternative
> way do you see? (calling ssh from libvirt or LVM API is not a good idea
> at all I think)

Apart from using ovirt/rhev, I'd try one of the following behind the
libvirt locking api: sanlock, dlm, file locks on nfs, file locks on gfs2.
There may also be ways to run remote commands other than ssh that would
work better from libvirt, although I don't have any specific suggestion.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 17:16           ` David Teigland
@ 2013-03-19 17:36             ` Vladislav Bogdanov
  2013-03-20  8:45               ` Zdenek Kabelac
  0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 17:36 UTC (permalink / raw)
  To: David Teigland; +Cc: linux-lvm

19.03.2013 20:16, David Teigland wrote:
> On Tue, Mar 19, 2013 at 07:52:14PM +0300, Vladislav Bogdanov wrote:
>> And, do you have any estimations, how long may it take to have you ideas
>> ready for production use?
> 
> It'll be quite a while (and the new locking scheme I'm working on will not
> include remote command execution.)
> 
>> Also, as you're not satisfied with this implementation, what alternative
>> way do you see? (calling ssh from libvirt or LVM API is not a good idea
>> at all I think)
> 
> Apart from using ovirt/rhev, I'd try one of the following behind the
> libvirt locking api: sanlock, dlm, file locks on nfs, file locks on gfs2.

Unfortunately none of these solve the main thing I need: Allow LVM
snapshots without breaking live VM migration :(

Cluster-wide snapshots (with shared lock) would solve this, but I do not
expect to see this implemented soon.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation
  2013-03-19 16:42 ` [linux-lvm] [PATCH 00/10] Enhancements to a clustered logical volume activation Alasdair G Kergon
@ 2013-03-19 17:42   ` Vladislav Bogdanov
  2013-06-05 13:23     ` [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace Andreas Pflug
  0 siblings, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-19 17:42 UTC (permalink / raw)
  To: linux-lvm

19.03.2013 19:42, Alasdair G Kergon wrote:
> So what I need is:
> 
> Separate out the fixes and minor changes that have no effect on support or
> introduce new constraints on future changes.  Get those out of the way.
> 
> Then with what's left, sort out the impact in terms of additional support,
> testing, constraints etc.  Then work out what the options are.
> E.g. we support multiple lock managers.  How do your changes affect
> each of these, including sanlock?
> 
> New features should only be available where they are tested and supported:
> if some combination doesn't make sense or won't be tested properly, don't allow
> that combination.  In extreme cases, use configure settings as the method of
> control (e.g. if a distro decided it didn't want --node that would need to be a
> configure option).  Man pages need to be correct - if an option doesn't work
> with some lock manager, it should say that (or not mention it if under
> configure's control).

Ok.

Do you have any opinion on the features themselves or the way they are implemented
(as if the subject had [RFC] in it)?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 17:36             ` Vladislav Bogdanov
@ 2013-03-20  8:45               ` Zdenek Kabelac
  2013-03-20 12:12                 ` Vladislav Bogdanov
  2013-03-21 18:31                 ` Vladislav Bogdanov
  0 siblings, 2 replies; 36+ messages in thread
From: Zdenek Kabelac @ 2013-03-20  8:45 UTC (permalink / raw)
  To: linux-lvm, Vladislav Bogdanov, David Teigland

Dne 19.3.2013 18:36, Vladislav Bogdanov napsal(a):
> 19.03.2013 20:16, David Teigland wrote:
>> On Tue, Mar 19, 2013 at 07:52:14PM +0300, Vladislav Bogdanov wrote:
>>> And, do you have any estimations, how long may it take to have you ideas
>>> ready for production use?
>>
>> It'll be quite a while (and the new locking scheme I'm working on will not
>> include remote command execution.)
>>
>>> Also, as you're not satisfied with this implementation, what alternative
>>> way do you see? (calling ssh from libvirt or LVM API is not a good idea
>>> at all I think)
>>
>> Apart from using ovirt/rhev, I'd try one of the following behind the
>> libvirt locking api: sanlock, dlm, file locks on nfs, file locks on gfs2.
>
> Unfortunately none of these solve the main thing I need: Allow LVM
> snapshots without breaking live VM migration :(
>
> Cluster-wide snapshots (with shared lock) would solve this, but I do not
> expect to see this implemented soon.
>

Before I go any deeper into reviewing the patches myself - I'd like to
make things clear about this 'snapshot' issue.

(BTW there is already one thing which will surely not pass - it's the 'node'
option for the lvm command - this would have to be done differently).

But back to snapshots -

What would be the point of having (old, non-thinp) snapshots active at the
same time on more than 1 node?

That would simply not work - since you would have to ensure that no one will
write to the snapshot & origin on either of those nodes?

Is your code doing some transition which needs the device active on both nodes,
treating them in a read-only way?

Since the metadata for a snapshot are only parsed during the first activation of
the snapshot, there is no way the second node could resync if you had
written to the snapshot/origin on the first node.

So could you please describe in more detail how it's supposed to work?


Zdenek

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-20  8:45               ` Zdenek Kabelac
@ 2013-03-20 12:12                 ` Vladislav Bogdanov
  2013-03-21 18:31                 ` Vladislav Bogdanov
  1 sibling, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-20 12:12 UTC (permalink / raw)
  To: Zdenek Kabelac; +Cc: David Teigland, linux-lvm

20.03.2013 11:45, Zdenek Kabelac wrote:
> Dne 19.3.2013 18:36, Vladislav Bogdanov napsal(a):
>> 19.03.2013 20:16, David Teigland wrote:
>>> On Tue, Mar 19, 2013 at 07:52:14PM +0300, Vladislav Bogdanov wrote:
>>>> And, do you have any estimations, how long may it take to have you
>>>> ideas
>>>> ready for production use?
>>>
>>> It'll be quite a while (and the new locking scheme I'm working on
>>> will not
>>> include remote command execution.)
>>>
>>>> Also, as you're not satisfied with this implementation, what
>>>> alternative
>>>> way do you see? (calling ssh from libvirt or LVM API is not a good idea
>>>> at all I think)
>>>
>>> Apart from using ovirt/rhev, I'd try one of the following behind the
>>> libvirt locking api: sanlock, dlm, file locks on nfs, file locks on
>>> gfs2.
>>
>> Unfortunately none of these solve the main thing I need: Allow LVM
>> snapshots without breaking live VM migration :(
>>
>> Cluster-wide snapshots (with shared lock) would solve this, but I do not
>> expect to see this implemented soon.
>>
> 
> Before I'll go any deeper with reviewing patches myself - I'd like to
> make myself clean about this 'snapshot' issue.
> 
> (BTW there is already one thing which will surely not pass - it's the
> 'node' option for lvm command - this would have to be made diferently).
> 
> But back to snapshots -
> 
> What would be the point of having (old, non thinp) snapshots active at
> the same time on more then 1 node ?

There is no need for this.

I need the source volume itself to be active on two nodes to perform live VM
migration. libvirt/qemu controls which instance has its CPUs turned on, but
the qemu processes on both nodes need to have the LV open simultaneously.

I'm able to take a snapshot only when the volume is activated exclusively.
I can open that snapshot (and take a backup) only on the node where the source
volume is held exclusively.

And I ultimately do not want to take the VM down to lock the LV exclusively to
take a snapshot (if it runs on a shared-locked LV), and I do not want to do
offline migration (with an exclusive lock on the LV). To satisfy both, lock
conversion is needed.
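
With the --force conversion from this series, the flow I have in mind is roughly
the following (just a sketch - the VG/LV names and snapshot size are only
examples):

  # VM running, LV currently shared (CR) so it stays migratable
  lvchange -aey --force vg00/lvol1            # convert shared -> exclusive in place
  lvcreate -s -L 1G -n lvol1_snap vg00/lvol1  # snapshot is now allowed
  # ... back up /dev/vg00/lvol1_snap ...
  lvremove -f vg00/lvol1_snap
  lvchange -aly --force vg00/lvol1            # convert back to shared for migration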

I'm still new to thinp, because it was introduced relatively recently,
and I have had no chance to look at it more closely (I tried to allocate a pool
once on a clustered VG and the whole test cluster got stuck because of this).

Does it work on clustered VGs now?
And, is it possible now to take/activate/open a thinp snapshot on a node
different from the one where the source volume is open?

> 
> That would simply not work - since you would have to ensure that noone
> will write to  snapshot & origin on either of those nodes?
> 
> Is your code doing some transition which needs active device on both nodes
> treating them in read-only way ?

Yes, but not my code - it is libvirt.
It opens the block device (LV) on both source and destination nodes (it runs
the qemu process in a paused state on the destination node and that process
opens the block device).
After that the memory state is transferred to the destination node, then the
qemu process on the source node is paused (turning off virtual CPUs), then the
qemu process on the destination node is resumed (turning on virtual CPUs), and
then the qemu process on the source node is killed, thus releasing the LV.
Adding one more migration phase ("confirm confirmation") and thus
introducing one more migration protocol version seems like overkill to me.

When the qemu process is paused on a node, the LV is effectively read-only (ok,
almost read-only - libvirt still tries to set up DAC permissions and an
selinux label on it, but data is not written).

There is only a short period when both source and destination processes are
paused (less than a millisecond).

When qemu is running, it writes to the device.

As for my code in libvirt:
I added one more "logical" pool subtype - clvm, which starts with all LVs
deactivated.
I also wrote a locking driver (which works similarly to the sanlock and
virtlockd ones), which (roughly the lvchange sequence sketched below):
* activates the volume exclusively on start
* converts the lock to shared on the source node before migration
* activates the volume in shared mode on the migration target
* deactivates the volume on the source node after migration is finished
* converts the lock from shared to exclusive on the destination node,
triggered remotely from the source node
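
In terms of the patched lvchange, that sequence is approximately the following
(a sketch only - VG/LV and node names are placeholders):

  # on the source node, before migration starts (the VM holds the LV exclusively):
  lvchange -aly --force vg00/vm_disk0               # exclusive -> shared
  # on the destination node, before the incoming qemu opens the LV:
  lvchange -aly vg00/vm_disk0                       # shared activation
  # on the source node, after the source qemu has exited:
  lvchange -aln vg00/vm_disk0                       # drop the local activation
  lvchange -aey --force --node node2 vg00/vm_disk0  # shared -> exclusive on the destination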

It also has a local locking concept to prevent the LV from being opened more
than once on the node where it is activated exclusively.

As I wrote above, there is no event like "you can convert the lock to
exclusive" available on the destination node.

> 
> Since metadata for snapshot are only parsed during first activation of
> snapshot, there is no way, the second node could resync if you would
> have written to the snapshot/origin on the first node.
> 
> So could you please describe in more details how it's supposed to work?

It is ok for me to lose the snapshot during migration. I just need to be
able to back up VM data while it keeps running on one node. If
pacemaker decides to migrate the VM, then the backup just fails and will be
restarted (together with new snapshot creation) from the beginning after
migration is finished.

Vladislav

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-19 15:32   ` David Teigland
  2013-03-19 15:42     ` Vladislav Bogdanov
@ 2013-03-21 18:23     ` Vladislav Bogdanov
  1 sibling, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-21 18:23 UTC (permalink / raw)
  To: LVM general discussion and development

19.03.2013 18:32, David Teigland wrote:
> On Tue, Mar 19, 2013 at 01:32:50PM +0000, Vladislav Bogdanov wrote:
>> +.B \-\-node \fINodeID
>> +Perform specified activation command on a remote cluster node (the same to running
>> +corresponding command on that node locally).
>> +Currently tested only for corosync clusters (\fB-I\fP \fIcorosync\fP option to clvmd)
>> +for corosync versions from 2.0.
>> +If corosync configuration has node names in a nodelist (nodelist.node.X.name = name
>> +in CMAP tems) or node names are used for ring0_addr (nodelist.node.X.ring0_addr = name),
>> +then that names may be used as NodeID. Otherwise numeric node IDs should be used.
> 
> lvm tools should move away from doing remote command execution.

This would break current behaviour, btw (I do not say the current behaviour is
the right one, but it exists).
lvchange -ay|-an on a clustered LV activates it shared everywhere in the
cluster. Internally this is done by sending a message to all nodes
(wildcard); each node then obtains the shared lock itself.

I just slightly modified this to be able to send targeted messages.
Everything was already in the code; I just made it work for non-wildcard
messages and added the possibility to enable this in the tools.
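
In terms of commands, the difference is just this (names are illustrative):

  lvchange -ay vg00/lvol1                # wildcard: every node takes a shared (CR) lock
  lvchange -aly --node node2 vg00/lvol1  # targeted: only node2 takes the shared lock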

> As you mentioned before, this is the equivalent of "ssh node lvchange".
> ssh or some other tool outside lvm is the right way to run the remote
> commands.
> 
> Also, lvm should not assume that it's using dlm/corosync, or that this
> kind of remote option will be possible to support.
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-20  8:45               ` Zdenek Kabelac
  2013-03-20 12:12                 ` Vladislav Bogdanov
@ 2013-03-21 18:31                 ` Vladislav Bogdanov
  2013-03-21 19:01                   ` Zdenek Kabelac
  1 sibling, 1 reply; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-21 18:31 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: David Teigland, Zdenek Kabelac

20.03.2013 11:45, Zdenek Kabelac wrote:
...
> 
> (BTW there is already one thing which will surely not pass - it's the
> 'node' option for lvm command - this would have to be made diferently).

clvmd uses the term 'node' internally. And that is the most fitting term for
what it means, IMHO.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-21 18:31                 ` Vladislav Bogdanov
@ 2013-03-21 19:01                   ` Zdenek Kabelac
  2013-03-21 19:16                     ` Vladislav Bogdanov
  0 siblings, 1 reply; 36+ messages in thread
From: Zdenek Kabelac @ 2013-03-21 19:01 UTC (permalink / raw)
  To: Vladislav Bogdanov; +Cc: David Teigland, LVM general discussion and development

Dne 21.3.2013 19:31, Vladislav Bogdanov napsal(a):
> 20.03.2013 11:45, Zdenek Kabelac wrote:
> ...
>>
>> (BTW there is already one thing which will surely not pass - it's the
>> 'node' option for lvm command - this would have to be made diferently).
>
> clvmd uses term 'node' internally. And that is the most right term for
> what it means IMHO.
>

Yes - in the clvmd locking layer it's ok - but the lvm namespace stays away from
this layer. Thus lvm is not aware of any locking mechanism. So the thing
would have to be handled via activation lock flags, and as I've said - the
protocol used there (lvm <-> clvmd) is quite fragile.

Zdenek

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] [PATCH 10/10] man: document --node option to lvchange
  2013-03-21 19:01                   ` Zdenek Kabelac
@ 2013-03-21 19:16                     ` Vladislav Bogdanov
  0 siblings, 0 replies; 36+ messages in thread
From: Vladislav Bogdanov @ 2013-03-21 19:16 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: David Teigland, Zdenek Kabelac

21.03.2013 22:01, Zdenek Kabelac wrote:
> Dne 21.3.2013 19:31, Vladislav Bogdanov napsal(a):
>> 20.03.2013 11:45, Zdenek Kabelac wrote:
>> ...
>>>
>>> (BTW there is already one thing which will surely not pass - it's the
>>> 'node' option for lvm command - this would have to be made diferently).
>>
>> clvmd uses term 'node' internally. And that is the most right term for
>> what it means IMHO.
>>
> 
> Yes - in clvmd locking layer it's ok - but  lvm namespace stays aways
> from this layer. Thus lvm is not aware of any locking mechanism.. So the
> thing would have to be handle via activation lock flags and as I've said
> - the protocol used there (lvm  <-> clvmd) is quite fragile.

And it also has "node" field (struct clvm_header used everywhere in
cluster_locking.c). It is fragile, yes, but I managed to not break it I
think.

So, what would be the right option name?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
  2013-03-19 17:42   ` Vladislav Bogdanov
@ 2013-06-05 13:23     ` Andreas Pflug
  2013-06-05 15:13       ` David Teigland
  0 siblings, 1 reply; 36+ messages in thread
From: Andreas Pflug @ 2013-06-05 13:23 UTC (permalink / raw)
  To: LVM general discussion and development, teigland

Hi David,

I got quite some trouble with clvmd on corosync 2.3.0/dlm; apparently a 
nonfunctional clvmd in the cluster can block all others (kern.log states 
clvmd stuck for >120s in some dlm call). I tried to clean things up by 
killing -9 clvmd, but it remains in state D or Z. Unfortunately, it 
seems that those zombies still keep some dlm stuff locked. When I 
restart corosync on a node and dlm_controld -D on it, I see "found 
uncontrolled lockspace, tell corosync to remove nodeid from cluster".

Well, that's fine for the first step, but how about cleaning up the dlm 
lockspace? dlm_tool leave <lockspace> hangs as well (sometimes it just 
fails with error 49). The comment in dlm_controld/action.c isn't too 
satisfactory: needing a reboot is not funny if a whole cluster is affected. I'd 
really appreciate a way to manually clean up old lockspaces. I'd presume 
that an uncontrolled lockspace on an isolated node should be easily 
removable...


Regards
Andreas

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
  2013-06-05 13:23     ` [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace Andreas Pflug
@ 2013-06-05 15:13       ` David Teigland
  2013-06-05 17:29         ` Andreas Pflug
  2013-06-06  6:17         ` Andreas Pflug
  0 siblings, 2 replies; 36+ messages in thread
From: David Teigland @ 2013-06-05 15:13 UTC (permalink / raw)
  To: Andreas Pflug; +Cc: LVM general discussion and development

On Wed, Jun 05, 2013 at 03:23:32PM +0200, Andreas Pflug wrote:
> Hi David,
> 
> I got quite some trouble with clvmd on corosync 2.3.0/dlm;
> apparently a nonfunctional clvmd in the cluster can block all others
> (kern.log states clvmd stuck for >120s in some dlm call). I tried to
> clean things up killing -9 clvmd, but it will remain on state D or
> Z. Unfortunately, it seems that those zombies still keep some dlm
> stuff locked. When I restart corosync on a node and dlm_controld -D
> on it, I see "found uncontrolled lockspace, tell corosync to remove
> nodeid from cluster".
> 
> Well, that's fine for the first step, but how about cleaning up the
> dlm lockspace? dlm_tool leave <lockspace> hangs as well (sometimes
> it just fails with error 49). The comment in dlm_controld/action.c
> isn't too satisfactory: need reboot, not funny if a whole cluster is
> affected. I'd really appreciate a way to manually clean old
> lockspaces. I'd presume that an uncontrolled lockspace on an
> isolated node should be easily removable...

A few different topics wrapped together there:

- With kill -9 clvmd (possibly combined with dlm_tool leave clvmd),
  you can manually clear/remove a userland lockspace like clvmd
  (a rough sketch of this follows below the list).

- If clvmd is blocked in the kernel in uninterruptible sleep, then
  the kill above will not work.  To make kill work, you'd locate the
  particular sleep in the kernel and determine if there's a way to
  make it interruptible, and cleanly back it out.

- If clvmd is blocked in the kernel for >120s, you probably want to
  investigate what is causing that, rather than being too hasty
  killing clvmd.

- If corosync or dlm_controld are killed while dlm lockspaces exist,
  they become "uncontrolled" and would need to be forcibly cleaned up.
  This cleanup may be possible to implement for userland lockspaces,
  but it's not been clear that the benefits would greatly outweigh
  using reboot for this.

- Killing either corosync or dlm_controld is very unlikely help
  anything, and more likely to cause further problems, so it should
  be avoided as far as possible.
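
A minimal sketch of that first case, assuming clvmd can actually be killed and
the dlm lockspace is named "clvmd":

  kill -9 $(pidof clvmd)    # stop the daemon holding the locks
  dlm_tool ls               # see which lockspaces still exist
  dlm_tool leave clvmd      # ask dlm_controld to tear the clvmd lockspace down
  dlm_tool ls               # verify the lockspace is gone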

Dave

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
  2013-06-05 15:13       ` David Teigland
@ 2013-06-05 17:29         ` Andreas Pflug
  2013-06-06  6:17         ` Andreas Pflug
  1 sibling, 0 replies; 36+ messages in thread
From: Andreas Pflug @ 2013-06-05 17:29 UTC (permalink / raw)
  To: David Teigland; +Cc: LVM general discussion and development

On 06/05/13 17:13, David Teigland wrote:
> On Wed, Jun 05, 2013 at 03:23:32PM +0200, Andreas Pflug wrote:
> A few different topics wrapped together there:
>
> - With kill -9 clvmd (possibly combined with dlm_tool leave clvmd),
>    you can manually clear/remove a userland lockspace like clvmd.

I had some clvmd instances not starting up correctly, remaining in 
nowhereland...
>
> - If clvmd is blocked in the kernel in uninterruptible sleep, then
>    the kill above will not work.  To make kill work, you'd locate the
>    particular sleep in the kernel and determine if there's a way to
>    make it interruptible, and cleanly back it out.
>
> - If clvmd is blocked in the kernel for >120s, you probably want to
>    investigate what is causing that, rather than being too hasty
>    killing clvmd.
>
> - If corosync or dlm_controld are killed while dlm lockspaces exist,
>    they become "uncontrolled" and would need to be forcibly cleaned up.
>    This cleanup may be possible to implement for userland lockspaces,
>    but it's not been clear that the benefits would greatly outweigh
>    using reboot for this.

Any of those programs might run into a problem, so either they should 
re-attach to the lockspace, or a cleanup should be possible. If (as in 
my case) the host is a Xen host with SAN storage you wouldn't like to 
reboot it... In my naive imagination, an orphaned lockspace is just some 
allocated memory that shouldn't be too hard to free.

>
> - Killing either corosync or dlm_controld is very unlikely help
>    anything, and more likely to cause further problems, so it should
>    be avoided as far as possible.

Apparently the problem started with corosync running correctly, but 
dlm_controld wasn't up; clvmd then blocked somewhere. I still have 
four hosts with 60 VMs or so to reboot. So any hint on how to kill that 
lockspace is greatly appreciated.


Regards,
Andreas

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
  2013-06-05 15:13       ` David Teigland
  2013-06-05 17:29         ` Andreas Pflug
@ 2013-06-06  6:17         ` Andreas Pflug
  2013-06-06 11:06           ` matthew patton
  1 sibling, 1 reply; 36+ messages in thread
From: Andreas Pflug @ 2013-06-06  6:17 UTC (permalink / raw)
  To: David Teigland; +Cc: LVM general discussion and development

Am 05.06.13 17:13, schrieb David Teigland:

> A few different topics wrapped together there:
>
> - With kill -9 clvmd (possibly combined with dlm_tool leave clvmd),
>    you can manually clear/remove a userland lockspace like clvmd.
>
> - If clvmd is blocked in the kernel in uninterruptible sleep, then
>    the kill above will not work.  To make kill work, you'd locate the
>    particular sleep in the kernel and determine if there's a way to
>    make it interruptible, and cleanly back it out.

I had clvmds blocked in the kernel, so how do I "locate the sleep and make it 
interruptible"?
>
> - If clvmd is blocked in the kernel for >120s, you probably want to
>    investigate what is causing that, rather than being too hasty
>    killing clvmd.
INFO: task clvmd:19766 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
clvmd           D ffff880058ec4870     0 19766      1 0x00000000
ffff880058ec4870 0000000000000282 0000000000000000 ffff8800698d9590
0000000000013740 ffff880063787fd8 ffff880063787fd8 0000000000013740
ffff880058ec4870 ffff880063786010 0000000000000001 0000000100000000
Call Trace:
[<ffffffff81367f7a>] ? rwsem_down_failed_common+0xda/0x10e
[<ffffffff811c5924>] ? call_rwsem_down_read_failed+0x14/0x30
[<ffffffff813678da>] ? down_read+0x17/0x19
[<ffffffffa059b705>] ? dlm_user_request+0x3a/0x17e [dlm]
[<ffffffffa05a40e4>] ? device_write+0x279/0x5f7 [dlm]
[<ffffffff810f7d7a>] ? __kmalloc+0x104/0x116
[<ffffffffa05a416b>] ? device_write+0x300/0x5f7 [dlm]
[<ffffffff810042c9>] ? xen_mc_flush+0x12b/0x158
[<ffffffff8117489e>] ? security_file_permission+0x18/0x2d
[<ffffffff81106dd5>] ? vfs_write+0xa4/0xff
[<ffffffff81106ee6>] ? sys_write+0x45/0x6e
[<ffffffff8136d652>] ? system_call_fastpath+0x16/0x1b

On 3.2.35

>
> - If corosync or dlm_controld are killed while dlm lockspaces exist,
>    they become "uncontrolled" and would need to be forcibly cleaned up.
>    This cleanup may be possible to implement for userland lockspaces,
>    but it's not been clear that the benefits would greatly outweigh
>    using reboot for this.

On a machine that is a Xen host with 20+ running VMs I'd clearly prefer to 
clean up that orphaned memory and go on... I still have 4 hosts to 
be rebooted which serve as Xen hosts, providing their devices from 
clvmd-controlled (i.e. now uncontrollable) SAN space.
>
> - Killing either corosync or dlm_controld is very unlikely help
>    anything, and more likely to cause further problems, so it should
>    be avoided as far as possible.

I understand. One reason to upgrade was that I had infrequent 
situations where the corosync 1.4.2 instances on all nodes exited 
simultaneously without any log notice. Having this with the new 
corosync 2.3/dlm infrastructure would mean a whole cluster having 
uncontrollable SAN space. So either the lockspace should be 
automatically reclaimed if dlm_controld finds it uncontrolled, or a 
means to clean it up manually should be available.

Regards,
Andreas
>
> Dave

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
  2013-06-06  6:17         ` Andreas Pflug
@ 2013-06-06 11:06           ` matthew patton
  2013-06-06 17:54             ` Andreas Pflug
  0 siblings, 1 reply; 36+ messages in thread
From: matthew patton @ 2013-06-06 11:06 UTC (permalink / raw)
  To: David Teigland, LVM general discussion and development

--- On Thu, 6/6/13, Andreas Pflug <pgadmin@pse-consulting.de> wrote:

> On a machine being Xen host with 20+ running VMs I'd clearly
> prefer to clean those orphaned memory space and go on.... I

This is exactly why it is STRONGLY suggested you split your storage tier from your compute tier. The lowest-friction method would be a pair of hosts that hold the disks (or access a common disk set) and export them as NFS. The compute nodes can speed things up with CacheFS for their locally running VMs, assuming you shepherd the live-migration process.

If the VMs all want to have a shared filesystem for a running app and the app can't be written to work safely with NFS (why not?) then you can run corosync and friends +GFS2 at that level.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [linux-lvm] clvmd leaving kernel dlm uncontrolled lockspace
  2013-06-06 11:06           ` matthew patton
@ 2013-06-06 17:54             ` Andreas Pflug
  0 siblings, 0 replies; 36+ messages in thread
From: Andreas Pflug @ 2013-06-06 17:54 UTC (permalink / raw)
  To: linux-lvm

Am 06.06.13 13:06, schrieb matthew patton:
> --- On Thu, 6/6/13, Andreas Pflug <pgadmin@pse-consulting.de> wrote:
>
>> On a machine being Xen host with 20+ running VMs I'd clearly
>> prefer to clean those orphaned memory space and go on.... I
> This is exactly why it is STRONGLY suggested you split your storage tier from your compute tier. The lowest friction method would be a pair that hold the disks (or access a common disk set) and export it as NFS. The compute nodes can speed things up with CacheFS for their local running VMs assuming you shepherd the live-migration process.

The Xen hosts are iSCSI initiators, but their usage of the SAN-located 
VG has to be coordinated, using clvmd. It's just what xcp/xenserver 
does, but with clvmd to ensure locking (apparently xcp/xenserver relies 
on friendly behaviour, using no locking).
>
> If the VMs all want to have a shared filesystem for a running app and the app can't be written to work safely with NFS (why not?) then you can run corosync and friends +GFS2 at that level.

The VMs have their private devices, each an LV on a SAN VG.

Regards
Andreas

^ permalink raw reply	[flat|nested] 36+ messages in thread
