All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] mdadm: fix wrong condition for go to abort
@ 2015-07-06  8:52 Guoqing Jiang
  2015-07-06  8:52 ` [PATCH 2/3] md-cluster: use %-64s to print cluster_name Guoqing Jiang
  2015-07-06  8:52 ` [PATCH 3/3] Safeguard against writing to an active device of another node Guoqing Jiang
  0 siblings, 2 replies; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-06  8:52 UTC (permalink / raw)
  To: neilb; +Cc: rgoldwyn, linux-raid

When parse_cluster_confirm_arg return 0, it means the
arg are parsed successfully, so change !rv to rv.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
---
Against cluster branch, it could be squashed with commit 4de9091
"Add a new clustered disk", sorry for inconvenience.  

 Manage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Manage.c b/Manage.c
index e3bdfb3..377df32 100644
--- a/Manage.c
+++ b/Manage.c
@@ -1302,7 +1302,7 @@ int Manage_subdevs(char *devname, int fd,
 			rv = parse_cluster_confirm_arg(dv->devname,
 						       &dv->devname,
 						       &raid_slot);
-			if (!rv) {
+			if (rv) {
 				pr_err("Could not get the devname of cluster\n");
 				goto abort;
 			}
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] md-cluster: use %-64s to print cluster_name
  2015-07-06  8:52 [PATCH 1/3] mdadm: fix wrong condition for go to abort Guoqing Jiang
@ 2015-07-06  8:52 ` Guoqing Jiang
  2015-07-06  8:52 ` [PATCH 3/3] Safeguard against writing to an active device of another node Guoqing Jiang
  1 sibling, 0 replies; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-06  8:52 UTC (permalink / raw)
  To: neilb; +Cc: rgoldwyn, linux-raid

Left align is better for cluster with name less than 64. Also
make the output of cluster name is aligned with others.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
---
Against cluster branch, this could be squashed with commit 0aa2f15
"mdadm: add the ability to change cluster name", sorry for inconvenience.

 bitmap.c | 2 +-
 super1.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/bitmap.c b/bitmap.c
index 0c3f6de..9847001 100644
--- a/bitmap.c
+++ b/bitmap.c
@@ -342,7 +342,7 @@ int ExamineBitmap(char *filename, int brief, struct supertype *st)
 		       100.0 * info->dirty_bits / (info->total_bits?:1));
 	} else {
 		printf("   Cluster nodes : %d\n", sb->nodes);
-		printf("    Cluster name : %64s\n", sb->cluster_name);
+		printf("    Cluster name : %-64s\n", sb->cluster_name);
 		for (i = 0; i < (int)sb->nodes; i++) {
 			if (i) {
 				free(info);
diff --git a/super1.c b/super1.c
index fda71e3..9b991e6 100644
--- a/super1.c
+++ b/super1.c
@@ -305,7 +305,7 @@ static void examine_super1(struct supertype *st, char *homehost)
 		printf("  (local to host %s)", homehost);
 	printf("\n");
 	if (bms->nodes > 0)
-		printf("Cluster Name : %s", bms->cluster_name);
+		printf("   Cluster Name : %-64s\n", bms->cluster_name);
 	atime = __le64_to_cpu(sb->ctime) & 0xFFFFFFFFFFULL;
 	printf("  Creation Time : %.24s\n", ctime(&atime));
 	c=map_num(pers, __le32_to_cpu(sb->level));
@@ -763,7 +763,7 @@ static void detail_super1(struct supertype *st, char *homehost)
 	    strncmp(sb->set_name, homehost, l) == 0)
 		printf("  (local to host %s)", homehost);
 	if (bms->nodes > 0)
-	    printf("Cluster Name : %64s", bms->cluster_name);
+	    printf("\n   Cluster Name : %-64s", bms->cluster_name);
 	printf("\n           UUID : ");
 	for (i=0; i<16; i++) {
 		if ((i&3)==0 && i != 0) printf(":");
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-06  8:52 [PATCH 1/3] mdadm: fix wrong condition for go to abort Guoqing Jiang
  2015-07-06  8:52 ` [PATCH 2/3] md-cluster: use %-64s to print cluster_name Guoqing Jiang
@ 2015-07-06  8:52 ` Guoqing Jiang
  2015-07-29  7:28   ` NeilBrown
  1 sibling, 1 reply; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-06  8:52 UTC (permalink / raw)
  To: neilb; +Cc: rgoldwyn, linux-raid

Modifying an exiting device's superblock or creating a new superblock
on an existing device needs to be checked because the device could be
in use by another node in another array. So, we check this by taking
all superblock locks in userspace so that we don't  step onto an active
device used by another node and safeguard against accidental edits.
After the edit is complete, we release all locks and the lockspace so
that it can be used by the kernel space.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
---
Against cluster branch.

 Makefile |   6 ++-
 mdadm.h  |   3 ++
 super1.c |  62 +++++++++++++++++++++++++++++++
 util.c   | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 197 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index c189279..a76ae13 100644
--- a/Makefile
+++ b/Makefile
@@ -81,11 +81,12 @@ FAILED_SLOTS_DIR = $(RUN_DIR)/failed-slots
 SYSTEMD_DIR=/lib/systemd/system
 
 COROSYNC:=$(shell [ -d /usr/include/corosync ] || echo -DNO_COROSYNC)
+DLM:=$(shell [ -f /usr/include/libdlm.h ] || echo -DNO_DLM)
 
 DIRFLAGS = -DMAP_DIR=\"$(MAP_DIR)\" -DMAP_FILE=\"$(MAP_FILE)\"
 DIRFLAGS += -DMDMON_DIR=\"$(MDMON_DIR)\"
 DIRFLAGS += -DFAILED_SLOTS_DIR=\"$(FAILED_SLOTS_DIR)\"
-CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(DIRFLAGS) $(COROSYNC)
+CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(DIRFLAGS) $(COROSYNC) $(DLM)
 
 VERSION = $(shell [ -d .git ] && git describe HEAD | sed 's/mdadm-//')
 VERS_DATE = $(shell [ -d .git ] && date --date="`git log -n1 --format=format:%cd --date=short`" '+%0dth %B %Y' | sed -e 's/1th/1st/' -e 's/2th/2nd/' -e 's/11st/11th/' -e 's/12nd/12th/')
@@ -105,6 +106,9 @@ endif
 # LDFLAGS = -static
 # STRIP = -s
 LDLIBS=-ldl
+ifneq ($(DLM), -DNO_DLM)
+LDLIBS += -ldlm_lt
+endif
 
 INSTALL = /usr/bin/install
 DESTDIR =
diff --git a/mdadm.h b/mdadm.h
index 97892e6..59f851e 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1429,6 +1429,9 @@ extern char *fd2devnm(int fd);
 
 extern int in_initrd(void);
 extern int get_cluster_name(char **name);
+extern int is_clustered(struct supertype *st);
+extern int cluster_get_dlmlock(struct supertype *st, int *lockid);
+extern int cluster_release_dlmlock(struct supertype *st, int lockid);
 
 #define _ROUND_UP(val, base)	(((val) + (base) - 1) & ~(base - 1))
 #define ROUND_UP(val, base)	_ROUND_UP(val, (typeof(val))(base))
diff --git a/super1.c b/super1.c
index 9b991e6..f4d6345 100644
--- a/super1.c
+++ b/super1.c
@@ -1074,6 +1074,18 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 	int rv = 0;
 	struct mdp_superblock_1 *sb = st->sb;
 
+#ifndef NO_DLM
+	int lockid;
+	if (is_clustered(st)) {
+		rv = cluster_get_dlmlock(st, &lockid);
+		if (rv) {
+			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
+			cluster_release_dlmlock(st, lockid);
+			return rv;
+		}
+	}
+#endif
+
 	if (strcmp(update, "homehost") == 0 &&
 	    homehost) {
 		/* Note that 'homehost' is special as it is really
@@ -1330,6 +1342,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 		rv = -1;
 
 	sb->sb_csum = calc_sb_1_csum(sb);
+#ifndef NO_DLM
+	if (is_clustered(st))
+		cluster_release_dlmlock(st, lockid);
+#endif
 	return rv;
 }
 
@@ -1434,6 +1450,17 @@ static int add_to_super1(struct supertype *st, mdu_disk_info_t *dk,
 	__u16 *rp = sb->dev_roles + dk->number;
 	struct devinfo *di, **dip;
 
+#ifndef NO_DLM
+	int rv, lockid;
+	if (is_clustered(st)) {
+		rv = cluster_get_dlmlock(st, &lockid);
+		if (rv) {
+			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
+			cluster_release_dlmlock(st, lockid);
+			return rv;
+		}
+	}
+#endif
 	if ((dk->state & 6) == 6) /* active, sync */
 		*rp = __cpu_to_le16(dk->raid_disk);
 	else if ((dk->state & ~2) == 0) /* active or idle -> spare */
@@ -1460,6 +1487,10 @@ static int add_to_super1(struct supertype *st, mdu_disk_info_t *dk,
 	di->next = NULL;
 	*dip = di;
 
+#ifndef NO_DLM
+	if (is_clustered(st))
+		cluster_release_dlmlock(st, lockid);
+#endif
 	return 0;
 }
 #endif
@@ -1474,6 +1505,18 @@ static int store_super1(struct supertype *st, int fd)
 	int sbsize;
 	unsigned long long dsize;
 
+#ifndef NO_DLM
+	int rv, lockid;
+	if (is_clustered(st)) {
+		rv = cluster_get_dlmlock(st, &lockid);
+		if (rv) {
+			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
+			cluster_release_dlmlock(st, lockid);
+			return rv;
+		}
+	}
+#endif
+
 	if (!get_dev_size(fd, NULL, &dsize))
 		return 1;
 
@@ -1533,6 +1576,10 @@ static int store_super1(struct supertype *st, int fd)
 		}
 	}
 	fsync(fd);
+#ifndef NO_DLM
+	if (is_clustered(st))
+		cluster_release_dlmlock(st, lockid);
+#endif
 	return 0;
 }
 
@@ -2282,6 +2329,17 @@ static int write_bitmap1(struct supertype *st, int fd, enum bitmap_update update
 
 static void free_super1(struct supertype *st)
 {
+#ifndef NO_DLM
+	int rv, lockid;
+	if (is_clustered(st)) {
+		rv = cluster_get_dlmlock(st, &lockid);
+		if (rv) {
+			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
+			cluster_release_dlmlock(st, lockid);
+			return;
+		}
+	}
+#endif
 	if (st->sb)
 		free(st->sb);
 	while (st->info) {
@@ -2292,6 +2350,10 @@ static void free_super1(struct supertype *st)
 		free(di);
 	}
 	st->sb = NULL;
+#ifndef NO_DLM
+	if (is_clustered(st))
+		cluster_release_dlmlock(st, lockid);
+#endif
 }
 
 #ifndef MDASSEMBLE
diff --git a/util.c b/util.c
index ea6e688..3357957 100644
--- a/util.c
+++ b/util.c
@@ -24,6 +24,7 @@
 
 #include	"mdadm.h"
 #include	"md_p.h"
+#include	<sys/poll.h>
 #include	<sys/socket.h>
 #include	<sys/utsname.h>
 #include	<sys/wait.h>
@@ -42,6 +43,10 @@
 #else
  #include	<corosync/cmap.h>
 #endif
+#ifndef NO_DLM
+#include	<libdlm.h>
+#include	<errno.h>
+#endif
 
 
 /*
@@ -88,6 +93,128 @@ struct blkpg_partition {
    aren't permitted). */
 #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
 
+#ifndef NO_DLM
+struct dlm_lock_resource {
+	dlm_lshandle_t *ls;
+	struct dlm_lksb lksb;
+};
+
+struct dlm_lock_resource *dlm_lock_res = NULL;
+static int ast_called = 0;
+
+int is_clustered(struct supertype *st)
+{
+	/* is it a cluster md or not */
+	return st->cluster_name ? 1 :0;
+}
+
+/* Using poll(2) to wait for and dispatch ASTs */
+static int poll_for_ast(dlm_lshandle_t ls)
+{
+	struct pollfd pfd;
+
+	pfd.fd = dlm_ls_get_fd(ls);
+	pfd.events = POLLIN;
+
+	while (!ast_called)
+	{
+		if (poll(&pfd, 1, 0) < 0)
+		{
+			perror("poll");
+			return -1;
+		}
+		dlm_dispatch(dlm_ls_get_fd(ls));
+	}
+	ast_called = 0;
+
+	return 0;
+}
+
+static void dlm_ast(void *arg)
+{
+	ast_called = 1;
+}
+
+/* Create the lockspace, take bitmapXXX locks on all the bitmaps. */
+int cluster_get_dlmlock(struct supertype *st, int *lockid)
+{
+	int ret;
+	char str[64];
+	int flags = LKF_NOQUEUE;
+
+	dlm_lock_res = xmalloc(sizeof(struct dlm_lock_resource));
+	if (!dlm_lock_res)
+		return -ENOMEM;
+
+	dlm_lock_res->ls = dlm_create_lockspace(st->cluster_name, O_RDWR);
+	if (!dlm_lock_res->ls) {
+		pr_err("%s failed to create lockspace\n", st->cluster_name);
+		return -1;
+	}
+
+	/* Conversions need the lockid in the LKSB */
+	if (flags & LKF_CONVERT)
+		dlm_lock_res->lksb.sb_lkid = *lockid;
+
+	snprintf(str, 64, "bitmap%04d", st->nodes);
+	/* if flags with LKF_CONVERT causes below return ENOENT which means
+	 * "No such file or directory" */
+	ret = dlm_ls_lock(dlm_lock_res->ls, LKM_PWMODE, &dlm_lock_res->lksb,
+			  flags, str, strlen(str), 0, dlm_ast,
+			  dlm_lock_res, NULL, NULL);
+	if (ret) {
+		pr_err("error %d when get PW mode on lock %s\n", errno, str);
+		return ret;
+	}
+
+	/* Wait for it to complete */
+	poll_for_ast(dlm_lock_res->ls);
+	*lockid = dlm_lock_res->lksb.sb_lkid;
+
+	errno =	dlm_lock_res->lksb.sb_status;
+	if (errno) {
+	    pr_err("error %d happened in ast with lock %s\n", errno, str);
+	    return -1;
+	}
+
+	return 0;
+}
+
+int cluster_release_dlmlock(struct supertype *st, int lockid)
+{
+	int ret;
+
+	/* if flags with LKF_CONVERT causes below return EINVAL which means
+	 * "Invalid argument" */
+	ret = dlm_ls_unlock(dlm_lock_res->ls, lockid, 0, &dlm_lock_res->lksb, dlm_lock_res);
+	if (ret) {
+		pr_err("error %d happened when unlock\n", errno);
+		/* XXX make sure the lock is unlocked eventually */
+		return ret;
+	}
+
+	/* Wait for it to complete */
+	poll_for_ast(dlm_lock_res->ls);
+
+	errno =	dlm_lock_res->lksb.sb_status;
+	if (errno != EUNLOCK) {
+		pr_err("error %d happened in ast when unlock lockspace\n", errno);
+		/* XXX make sure the lockspace is unlocked eventually */
+		return -1;
+	}
+
+	ret = dlm_release_lockspace(st->cluster_name, dlm_lock_res->ls, 1);
+	if (ret) {
+		pr_err("error %d happened when release lockspace\n", errno);
+		/* XXX make sure the lockspace is released eventually */
+		return ret;
+	}
+	free(dlm_lock_res);
+
+	return 0;
+}
+#endif
+
 /*
  * Parse a 128 bit uuid in 4 integers
  * format is 32 hexx nibbles with options :.<space> separator
-- 
1.7.12.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-06  8:52 ` [PATCH 3/3] Safeguard against writing to an active device of another node Guoqing Jiang
@ 2015-07-29  7:28   ` NeilBrown
  2015-07-29 10:41     ` Guoqing Jiang
  0 siblings, 1 reply; 13+ messages in thread
From: NeilBrown @ 2015-07-29  7:28 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: rgoldwyn, linux-raid

On Mon,  6 Jul 2015 16:52:12 +0800 Guoqing Jiang <gqjiang@suse.com>
wrote:

> Modifying an exiting device's superblock or creating a new superblock
> on an existing device needs to be checked because the device could be
> in use by another node in another array. So, we check this by taking
> all superblock locks in userspace so that we don't  step onto an active
> device used by another node and safeguard against accidental edits.
> After the edit is complete, we release all locks and the lockspace so
> that it can be used by the kernel space.
> 
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> Signed-off-by: Guoqing Jiang <gqjiang@suse.com>

Hi,
 thanks for these.
 I've applied 1 and 2.

 Could you re-do this on to follow the kernel style of #ifdefs,
 so that the #ifdefs only appear in header files.
 i.e. when NO_DLM is defined, is_clustered() always evaluates to zero
 and clustered_get_dlmlock() always succeeds etc.

Thanks,
NeilBrown



> ---
> Against cluster branch.
> 
>  Makefile |   6 ++-
>  mdadm.h  |   3 ++
>  super1.c |  62 +++++++++++++++++++++++++++++++
>  util.c   | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 197 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index c189279..a76ae13 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -81,11 +81,12 @@ FAILED_SLOTS_DIR = $(RUN_DIR)/failed-slots
>  SYSTEMD_DIR=/lib/systemd/system
>  
>  COROSYNC:=$(shell [ -d /usr/include/corosync ] || echo -DNO_COROSYNC)
> +DLM:=$(shell [ -f /usr/include/libdlm.h ] || echo -DNO_DLM)
>  
>  DIRFLAGS = -DMAP_DIR=\"$(MAP_DIR)\" -DMAP_FILE=\"$(MAP_FILE)\"
>  DIRFLAGS += -DMDMON_DIR=\"$(MDMON_DIR)\"
>  DIRFLAGS += -DFAILED_SLOTS_DIR=\"$(FAILED_SLOTS_DIR)\"
> -CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(DIRFLAGS) $(COROSYNC)
> +CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(DIRFLAGS) $(COROSYNC) $(DLM)
>  
>  VERSION = $(shell [ -d .git ] && git describe HEAD | sed 's/mdadm-//')
>  VERS_DATE = $(shell [ -d .git ] && date --date="`git log -n1 --format=format:%cd --date=short`" '+%0dth %B %Y' | sed -e 's/1th/1st/' -e 's/2th/2nd/' -e 's/11st/11th/' -e 's/12nd/12th/')
> @@ -105,6 +106,9 @@ endif
>  # LDFLAGS = -static
>  # STRIP = -s
>  LDLIBS=-ldl
> +ifneq ($(DLM), -DNO_DLM)
> +LDLIBS += -ldlm_lt
> +endif
>  
>  INSTALL = /usr/bin/install
>  DESTDIR =
> diff --git a/mdadm.h b/mdadm.h
> index 97892e6..59f851e 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1429,6 +1429,9 @@ extern char *fd2devnm(int fd);
>  
>  extern int in_initrd(void);
>  extern int get_cluster_name(char **name);
> +extern int is_clustered(struct supertype *st);
> +extern int cluster_get_dlmlock(struct supertype *st, int *lockid);
> +extern int cluster_release_dlmlock(struct supertype *st, int lockid);
>  
>  #define _ROUND_UP(val, base)	(((val) + (base) - 1) & ~(base - 1))
>  #define ROUND_UP(val, base)	_ROUND_UP(val, (typeof(val))(base))
> diff --git a/super1.c b/super1.c
> index 9b991e6..f4d6345 100644
> --- a/super1.c
> +++ b/super1.c
> @@ -1074,6 +1074,18 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  	int rv = 0;
>  	struct mdp_superblock_1 *sb = st->sb;
>  
> +#ifndef NO_DLM
> +	int lockid;
> +	if (is_clustered(st)) {
> +		rv = cluster_get_dlmlock(st, &lockid);
> +		if (rv) {
> +			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
> +			cluster_release_dlmlock(st, lockid);
> +			return rv;
> +		}
> +	}
> +#endif
> +
>  	if (strcmp(update, "homehost") == 0 &&
>  	    homehost) {
>  		/* Note that 'homehost' is special as it is really
> @@ -1330,6 +1342,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  		rv = -1;
>  
>  	sb->sb_csum = calc_sb_1_csum(sb);
> +#ifndef NO_DLM
> +	if (is_clustered(st))
> +		cluster_release_dlmlock(st, lockid);
> +#endif
>  	return rv;
>  }
>  
> @@ -1434,6 +1450,17 @@ static int add_to_super1(struct supertype *st, mdu_disk_info_t *dk,
>  	__u16 *rp = sb->dev_roles + dk->number;
>  	struct devinfo *di, **dip;
>  
> +#ifndef NO_DLM
> +	int rv, lockid;
> +	if (is_clustered(st)) {
> +		rv = cluster_get_dlmlock(st, &lockid);
> +		if (rv) {
> +			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
> +			cluster_release_dlmlock(st, lockid);
> +			return rv;
> +		}
> +	}
> +#endif
>  	if ((dk->state & 6) == 6) /* active, sync */
>  		*rp = __cpu_to_le16(dk->raid_disk);
>  	else if ((dk->state & ~2) == 0) /* active or idle -> spare */
> @@ -1460,6 +1487,10 @@ static int add_to_super1(struct supertype *st, mdu_disk_info_t *dk,
>  	di->next = NULL;
>  	*dip = di;
>  
> +#ifndef NO_DLM
> +	if (is_clustered(st))
> +		cluster_release_dlmlock(st, lockid);
> +#endif
>  	return 0;
>  }
>  #endif
> @@ -1474,6 +1505,18 @@ static int store_super1(struct supertype *st, int fd)
>  	int sbsize;
>  	unsigned long long dsize;
>  
> +#ifndef NO_DLM
> +	int rv, lockid;
> +	if (is_clustered(st)) {
> +		rv = cluster_get_dlmlock(st, &lockid);
> +		if (rv) {
> +			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
> +			cluster_release_dlmlock(st, lockid);
> +			return rv;
> +		}
> +	}
> +#endif
> +
>  	if (!get_dev_size(fd, NULL, &dsize))
>  		return 1;
>  
> @@ -1533,6 +1576,10 @@ static int store_super1(struct supertype *st, int fd)
>  		}
>  	}
>  	fsync(fd);
> +#ifndef NO_DLM
> +	if (is_clustered(st))
> +		cluster_release_dlmlock(st, lockid);
> +#endif
>  	return 0;
>  }
>  
> @@ -2282,6 +2329,17 @@ static int write_bitmap1(struct supertype *st, int fd, enum bitmap_update update
>  
>  static void free_super1(struct supertype *st)
>  {
> +#ifndef NO_DLM
> +	int rv, lockid;
> +	if (is_clustered(st)) {
> +		rv = cluster_get_dlmlock(st, &lockid);
> +		if (rv) {
> +			pr_err("Cannot get dlmlock in %s return %d\n", __func__, rv);
> +			cluster_release_dlmlock(st, lockid);
> +			return;
> +		}
> +	}
> +#endif
>  	if (st->sb)
>  		free(st->sb);
>  	while (st->info) {
> @@ -2292,6 +2350,10 @@ static void free_super1(struct supertype *st)
>  		free(di);
>  	}
>  	st->sb = NULL;
> +#ifndef NO_DLM
> +	if (is_clustered(st))
> +		cluster_release_dlmlock(st, lockid);
> +#endif
>  }
>  
>  #ifndef MDASSEMBLE
> diff --git a/util.c b/util.c
> index ea6e688..3357957 100644
> --- a/util.c
> +++ b/util.c
> @@ -24,6 +24,7 @@
>  
>  #include	"mdadm.h"
>  #include	"md_p.h"
> +#include	<sys/poll.h>
>  #include	<sys/socket.h>
>  #include	<sys/utsname.h>
>  #include	<sys/wait.h>
> @@ -42,6 +43,10 @@
>  #else
>   #include	<corosync/cmap.h>
>  #endif
> +#ifndef NO_DLM
> +#include	<libdlm.h>
> +#include	<errno.h>
> +#endif
>  
>  
>  /*
> @@ -88,6 +93,128 @@ struct blkpg_partition {
>     aren't permitted). */
>  #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
>  
> +#ifndef NO_DLM
> +struct dlm_lock_resource {
> +	dlm_lshandle_t *ls;
> +	struct dlm_lksb lksb;
> +};
> +
> +struct dlm_lock_resource *dlm_lock_res = NULL;
> +static int ast_called = 0;
> +
> +int is_clustered(struct supertype *st)
> +{
> +	/* is it a cluster md or not */
> +	return st->cluster_name ? 1 :0;
> +}
> +
> +/* Using poll(2) to wait for and dispatch ASTs */
> +static int poll_for_ast(dlm_lshandle_t ls)
> +{
> +	struct pollfd pfd;
> +
> +	pfd.fd = dlm_ls_get_fd(ls);
> +	pfd.events = POLLIN;
> +
> +	while (!ast_called)
> +	{
> +		if (poll(&pfd, 1, 0) < 0)
> +		{
> +			perror("poll");
> +			return -1;
> +		}
> +		dlm_dispatch(dlm_ls_get_fd(ls));
> +	}
> +	ast_called = 0;
> +
> +	return 0;
> +}
> +
> +static void dlm_ast(void *arg)
> +{
> +	ast_called = 1;
> +}
> +
> +/* Create the lockspace, take bitmapXXX locks on all the bitmaps. */
> +int cluster_get_dlmlock(struct supertype *st, int *lockid)
> +{
> +	int ret;
> +	char str[64];
> +	int flags = LKF_NOQUEUE;
> +
> +	dlm_lock_res = xmalloc(sizeof(struct dlm_lock_resource));
> +	if (!dlm_lock_res)
> +		return -ENOMEM;
> +
> +	dlm_lock_res->ls = dlm_create_lockspace(st->cluster_name, O_RDWR);
> +	if (!dlm_lock_res->ls) {
> +		pr_err("%s failed to create lockspace\n", st->cluster_name);
> +		return -1;
> +	}
> +
> +	/* Conversions need the lockid in the LKSB */
> +	if (flags & LKF_CONVERT)
> +		dlm_lock_res->lksb.sb_lkid = *lockid;
> +
> +	snprintf(str, 64, "bitmap%04d", st->nodes);
> +	/* if flags with LKF_CONVERT causes below return ENOENT which means
> +	 * "No such file or directory" */
> +	ret = dlm_ls_lock(dlm_lock_res->ls, LKM_PWMODE, &dlm_lock_res->lksb,
> +			  flags, str, strlen(str), 0, dlm_ast,
> +			  dlm_lock_res, NULL, NULL);
> +	if (ret) {
> +		pr_err("error %d when get PW mode on lock %s\n", errno, str);
> +		return ret;
> +	}
> +
> +	/* Wait for it to complete */
> +	poll_for_ast(dlm_lock_res->ls);
> +	*lockid = dlm_lock_res->lksb.sb_lkid;
> +
> +	errno =	dlm_lock_res->lksb.sb_status;
> +	if (errno) {
> +	    pr_err("error %d happened in ast with lock %s\n", errno, str);
> +	    return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +int cluster_release_dlmlock(struct supertype *st, int lockid)
> +{
> +	int ret;
> +
> +	/* if flags with LKF_CONVERT causes below return EINVAL which means
> +	 * "Invalid argument" */
> +	ret = dlm_ls_unlock(dlm_lock_res->ls, lockid, 0, &dlm_lock_res->lksb, dlm_lock_res);
> +	if (ret) {
> +		pr_err("error %d happened when unlock\n", errno);
> +		/* XXX make sure the lock is unlocked eventually */
> +		return ret;
> +	}
> +
> +	/* Wait for it to complete */
> +	poll_for_ast(dlm_lock_res->ls);
> +
> +	errno =	dlm_lock_res->lksb.sb_status;
> +	if (errno != EUNLOCK) {
> +		pr_err("error %d happened in ast when unlock lockspace\n", errno);
> +		/* XXX make sure the lockspace is unlocked eventually */
> +		return -1;
> +	}
> +
> +	ret = dlm_release_lockspace(st->cluster_name, dlm_lock_res->ls, 1);
> +	if (ret) {
> +		pr_err("error %d happened when release lockspace\n", errno);
> +		/* XXX make sure the lockspace is released eventually */
> +		return ret;
> +	}
> +	free(dlm_lock_res);
> +
> +	return 0;
> +}
> +#endif
> +
>  /*
>   * Parse a 128 bit uuid in 4 integers
>   * format is 32 hexx nibbles with options :.<space> separator


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-29  7:28   ` NeilBrown
@ 2015-07-29 10:41     ` Guoqing Jiang
  2015-07-29 22:02       ` NeilBrown
  0 siblings, 1 reply; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-29 10:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: rgoldwyn, linux-raid

Hi Neil,

NeilBrown wrote:
> On Mon,  6 Jul 2015 16:52:12 +0800 Guoqing Jiang <gqjiang@suse.com>
> wrote:
>
>   
>> Modifying an exiting device's superblock or creating a new superblock
>> on an existing device needs to be checked because the device could be
>> in use by another node in another array. So, we check this by taking
>> all superblock locks in userspace so that we don't  step onto an active
>> device used by another node and safeguard against accidental edits.
>> After the edit is complete, we release all locks and the lockspace so
>> that it can be used by the kernel space.
>>
>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
>> Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
>>     
>
> Hi,
>  thanks for these.
>  I've applied 1 and 2.
>
>  Could you re-do this on to follow the kernel style of #ifdefs,
>  so that the #ifdefs only appear in header files.
>  i.e. when NO_DLM is defined, is_clustered() always evaluates to zero
>  and clustered_get_dlmlock() always succeeds etc.
>   
Thanks, I guess you mean the following incremental modification. Please
let me know if I misunderstood.

diff --git a/mdadm.h b/mdadm.h
index 59f851e..4d9e8c8 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1429,9 +1429,26 @@ extern char *fd2devnm(int fd);

extern int in_initrd(void);
extern int get_cluster_name(char **name);
+#ifndef NO_DLM
extern int is_clustered(struct supertype *st);
extern int cluster_get_dlmlock(struct supertype *st, int *lockid);
extern int cluster_release_dlmlock(struct supertype *st, int lockid);
+#else
+static inline int is_clustered(struct supertype *st)
+{
+ return 0;
+}
+
+static inline int cluster_get_dlmlock(struct supertype *st, int *lockid)
+{
+ return 0;
+}
+
+static inline int cluster_release_dlmlock(struct supertype *st, int lockid)
+{
+ return 0;
+}
+#endif

#define _ROUND_UP(val, base) (((val) + (base) - 1) & ~(base - 1))
#define ROUND_UP(val, base) _ROUND_UP(val, (typeof(val))(base))
diff --git a/super1.c b/super1.c
index c49a3b3..bd88c36 100644
--- a/super1.c
+++ b/super1.c
@@ -1072,10 +1072,9 @@ static int update_super1(struct supertype *st,
struct mdinfo *info,
* ignored.
*/
int rv = 0;
+ int lockid;
struct mdp_superblock_1 *sb = st->sb;

-#ifndef NO_DLM
- int lockid;
if (is_clustered(st)) {
rv = cluster_get_dlmlock(st, &lockid);
if (rv) {
@@ -1084,7 +1083,6 @@ static int update_super1(struct supertype *st,
struct mdinfo *info,
return rv;
}
}
-#endif

if (strcmp(update, "homehost") == 0 &&
homehost) {
@@ -1342,10 +1340,9 @@ static int update_super1(struct supertype *st,
struct mdinfo *info,
rv = -1;

sb->sb_csum = calc_sb_1_csum(sb);
-#ifndef NO_DLM
if (is_clustered(st))
cluster_release_dlmlock(st, lockid);
-#endif
+
return rv;
}

@@ -1449,9 +1446,8 @@ static int add_to_super1(struct supertype *st,
mdu_disk_info_t *dk,
struct mdp_superblock_1 *sb = st->sb;
__u16 *rp = sb->dev_roles + dk->number;
struct devinfo *di, **dip;
-
-#ifndef NO_DLM
int rv, lockid;
+
if (is_clustered(st)) {
rv = cluster_get_dlmlock(st, &lockid);
if (rv) {
@@ -1460,7 +1456,7 @@ static int add_to_super1(struct supertype *st,
mdu_disk_info_t *dk,
return rv;
}
}
-#endif
+
if ((dk->state & 6) == 6) /* active, sync */
*rp = __cpu_to_le16(dk->raid_disk);
else if ((dk->state & ~2) == 0) /* active or idle -> spare */
@@ -1487,10 +1483,9 @@ static int add_to_super1(struct supertype *st,
mdu_disk_info_t *dk,
di->next = NULL;
*dip = di;

-#ifndef NO_DLM
if (is_clustered(st))
cluster_release_dlmlock(st, lockid);
-#endif
+
return 0;
}
#endif
@@ -1504,9 +1499,8 @@ static int store_super1(struct supertype *st, int fd)
struct align_fd afd;
int sbsize;
unsigned long long dsize;
-
-#ifndef NO_DLM
int rv, lockid;
+
if (is_clustered(st)) {
rv = cluster_get_dlmlock(st, &lockid);
if (rv) {
@@ -1515,7 +1509,6 @@ static int store_super1(struct supertype *st, int fd)
return rv;
}
}
-#endif

if (!get_dev_size(fd, NULL, &dsize))
return 1;
@@ -1576,10 +1569,9 @@ static int store_super1(struct supertype *st, int fd)
}
}
fsync(fd);
-#ifndef NO_DLM
if (is_clustered(st))
cluster_release_dlmlock(st, lockid);
-#endif
+
return 0;
}

@@ -2329,7 +2321,6 @@ static int write_bitmap1(struct supertype *st, int
fd, enum bitmap_update update

static void free_super1(struct supertype *st)
{
-#ifndef NO_DLM
int rv, lockid;
if (is_clustered(st)) {
rv = cluster_get_dlmlock(st, &lockid);
@@ -2339,7 +2330,7 @@ static void free_super1(struct supertype *st)
return;
}
}
-#endif
+
if (st->sb)
free(st->sb);
while (st->info) {
@@ -2350,10 +2341,8 @@ static void free_super1(struct supertype *st)
free(di);
}
st->sb = NULL;
-#ifndef NO_DLM
if (is_clustered(st))
cluster_release_dlmlock(st, lockid);
-#endif
}

#ifndef MDASSEMBLE

Thanks,
Guoqing

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-29 10:41     ` Guoqing Jiang
@ 2015-07-29 22:02       ` NeilBrown
  2015-07-29 23:40         ` Goldwyn Rodrigues
  2015-07-30  3:16         ` Guoqing Jiang
  0 siblings, 2 replies; 13+ messages in thread
From: NeilBrown @ 2015-07-29 22:02 UTC (permalink / raw)
  To: Guoqing Jiang; +Cc: rgoldwyn, linux-raid

On Wed, 29 Jul 2015 18:41:37 +0800 Guoqing Jiang <gqJiang@suse.com>
wrote:

> Hi Neil,
> 
> NeilBrown wrote:
> > On Mon,  6 Jul 2015 16:52:12 +0800 Guoqing Jiang <gqjiang@suse.com>
> > wrote:
> >
> >   
> >> Modifying an exiting device's superblock or creating a new superblock
> >> on an existing device needs to be checked because the device could be
> >> in use by another node in another array. So, we check this by taking
> >> all superblock locks in userspace so that we don't  step onto an active
> >> device used by another node and safeguard against accidental edits.
> >> After the edit is complete, we release all locks and the lockspace so
> >> that it can be used by the kernel space.
> >>
> >> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> >> Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
> >>     
> >
> > Hi,
> >  thanks for these.
> >  I've applied 1 and 2.
> >
> >  Could you re-do this on to follow the kernel style of #ifdefs,
> >  so that the #ifdefs only appear in header files.
> >  i.e. when NO_DLM is defined, is_clustered() always evaluates to zero
> >  and clustered_get_dlmlock() always succeeds etc.
> >   
> Thanks, I guess you mean the following incremental modification. Please
> let me know if I misunderstood.
> 
> diff --git a/mdadm.h b/mdadm.h
> index 59f851e..4d9e8c8 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1429,9 +1429,26 @@ extern char *fd2devnm(int fd);
> 
> extern int in_initrd(void);
> extern int get_cluster_name(char **name);
> +#ifndef NO_DLM
> extern int is_clustered(struct supertype *st);
> extern int cluster_get_dlmlock(struct supertype *st, int *lockid);
> extern int cluster_release_dlmlock(struct supertype *st, int lockid);
> +#else
> +static inline int is_clustered(struct supertype *st)
> +{
> + return 0;
> +}
> +
> +static inline int cluster_get_dlmlock(struct supertype *st, int *lockid)
> +{
> + return 0;
> +}
> +
> +static inline int cluster_release_dlmlock(struct supertype *st, int lockid)
> +{
> + return 0;
> +}
> +#endif
> 
> #define _ROUND_UP(val, base) (((val) + (base) - 1) & ~(base - 1))
> #define ROUND_UP(val, base) _ROUND_UP(val, (typeof(val))(base))
> diff --git a/super1.c b/super1.c
> index c49a3b3..bd88c36 100644
> --- a/super1.c
> +++ b/super1.c
> @@ -1072,10 +1072,9 @@ static int update_super1(struct supertype *st,
> struct mdinfo *info,
> * ignored.
> */
> int rv = 0;
> + int lockid;
> struct mdp_superblock_1 *sb = st->sb;
> 
> -#ifndef NO_DLM
> - int lockid;
> if (is_clustered(st)) {
> rv = cluster_get_dlmlock(st, &lockid);
> if (rv) {
> @@ -1084,7 +1083,6 @@ static int update_super1(struct supertype *st,
> struct mdinfo *info,
> return rv;
> }
> }
> -#endif
> 
> if (strcmp(update, "homehost") == 0 &&
> homehost) {
> @@ -1342,10 +1340,9 @@ static int update_super1(struct supertype *st,
> struct mdinfo *info,
> rv = -1;
> 
> sb->sb_csum = calc_sb_1_csum(sb);
> -#ifndef NO_DLM
> if (is_clustered(st))
> cluster_release_dlmlock(st, lockid);
> -#endif
> +
> return rv;
> }
> 
> @@ -1449,9 +1446,8 @@ static int add_to_super1(struct supertype *st,
> mdu_disk_info_t *dk,
> struct mdp_superblock_1 *sb = st->sb;
> __u16 *rp = sb->dev_roles + dk->number;
> struct devinfo *di, **dip;
> -
> -#ifndef NO_DLM
> int rv, lockid;
> +
> if (is_clustered(st)) {
> rv = cluster_get_dlmlock(st, &lockid);
> if (rv) {
> @@ -1460,7 +1456,7 @@ static int add_to_super1(struct supertype *st,
> mdu_disk_info_t *dk,
> return rv;
> }
> }
> -#endif
> +
> if ((dk->state & 6) == 6) /* active, sync */
> *rp = __cpu_to_le16(dk->raid_disk);
> else if ((dk->state & ~2) == 0) /* active or idle -> spare */
> @@ -1487,10 +1483,9 @@ static int add_to_super1(struct supertype *st,
> mdu_disk_info_t *dk,
> di->next = NULL;
> *dip = di;
> 
> -#ifndef NO_DLM
> if (is_clustered(st))
> cluster_release_dlmlock(st, lockid);
> -#endif
> +
> return 0;
> }
> #endif
> @@ -1504,9 +1499,8 @@ static int store_super1(struct supertype *st, int fd)
> struct align_fd afd;
> int sbsize;
> unsigned long long dsize;
> -
> -#ifndef NO_DLM
> int rv, lockid;
> +
> if (is_clustered(st)) {
> rv = cluster_get_dlmlock(st, &lockid);
> if (rv) {
> @@ -1515,7 +1509,6 @@ static int store_super1(struct supertype *st, int fd)
> return rv;
> }
> }
> -#endif
> 
> if (!get_dev_size(fd, NULL, &dsize))
> return 1;
> @@ -1576,10 +1569,9 @@ static int store_super1(struct supertype *st, int fd)
> }
> }
> fsync(fd);
> -#ifndef NO_DLM
> if (is_clustered(st))
> cluster_release_dlmlock(st, lockid);
> -#endif
> +
> return 0;
> }
> 
> @@ -2329,7 +2321,6 @@ static int write_bitmap1(struct supertype *st, int
> fd, enum bitmap_update update
> 
> static void free_super1(struct supertype *st)
> {
> -#ifndef NO_DLM
> int rv, lockid;
> if (is_clustered(st)) {
> rv = cluster_get_dlmlock(st, &lockid);
> @@ -2339,7 +2330,7 @@ static void free_super1(struct supertype *st)
> return;
> }
> }
> -#endif
> +
> if (st->sb)
> free(st->sb);
> while (st->info) {
> @@ -2350,10 +2341,8 @@ static void free_super1(struct supertype *st)
> free(di);
> }
> st->sb = NULL;
> -#ifndef NO_DLM
> if (is_clustered(st))
> cluster_release_dlmlock(st, lockid);
> -#endif
> }
> 
> #ifndef MDASSEMBLE
> 
> Thanks,
> Guoqing

Probably something like that - yes.  Unfortunately your mailer messed
that up more that usual make it rather difficult to apply.

I was wondering if we could make it a run-time dependency though .. a
bit like get_cluster_name.  We can still do that later I guess.  I'm
not really sure what is best at the moment.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-29 22:02       ` NeilBrown
@ 2015-07-29 23:40         ` Goldwyn Rodrigues
  2015-07-30  3:18           ` Guoqing Jiang
  2015-08-01 15:50           ` Wols Lists
  2015-07-30  3:16         ` Guoqing Jiang
  1 sibling, 2 replies; 13+ messages in thread
From: Goldwyn Rodrigues @ 2015-07-29 23:40 UTC (permalink / raw)
  To: NeilBrown, Guoqing Jiang; +Cc: linux-raid



On 07/29/2015 05:02 PM, NeilBrown wrote:
> On Wed, 29 Jul 2015 18:41:37 +0800 Guoqing Jiang <gqJiang@suse.com>
> wrote:
>
>> Hi Neil,
>>
>> NeilBrown wrote:
>>> On Mon,  6 Jul 2015 16:52:12 +0800 Guoqing Jiang <gqjiang@suse.com>
>>> wrote:
>>>
>>>
>>>> Modifying an exiting device's superblock or creating a new superblock
>>>> on an existing device needs to be checked because the device could be
>>>> in use by another node in another array. So, we check this by taking
>>>> all superblock locks in userspace so that we don't  step onto an active
>>>> device used by another node and safeguard against accidental edits.
>>>> After the edit is complete, we release all locks and the lockspace so
>>>> that it can be used by the kernel space.
>>>>
>>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>>> Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
>>>>
>>>
>>> Hi,
>>>   thanks for these.
>>>   I've applied 1 and 2.
>>>
>>>   Could you re-do this on to follow the kernel style of #ifdefs,
>>>   so that the #ifdefs only appear in header files.
>>>   i.e. when NO_DLM is defined, is_clustered() always evaluates to zero
>>>   and clustered_get_dlmlock() always succeeds etc.
>>>
>> Thanks, I guess you mean the following incremental modification. Please
>> let me know if I misunderstood.
>>
>> diff --git a/mdadm.h b/mdadm.h
>> index 59f851e..4d9e8c8 100644
>> --- a/mdadm.h
>> +++ b/mdadm.h
>> @@ -1429,9 +1429,26 @@ extern char *fd2devnm(int fd);
>>
>> extern int in_initrd(void);
>> extern int get_cluster_name(char **name);
>> +#ifndef NO_DLM
>> extern int is_clustered(struct supertype *st);
>> extern int cluster_get_dlmlock(struct supertype *st, int *lockid);
>> extern int cluster_release_dlmlock(struct supertype *st, int lockid);
>> +#else
>> +static inline int is_clustered(struct supertype *st)
>> +{
>> + return 0;
>> +}
>> +
>> +static inline int cluster_get_dlmlock(struct supertype *st, int *lockid)
>> +{
>> + return 0;
>> +}
>> +
>> +static inline int cluster_release_dlmlock(struct supertype *st, int lockid)
>> +{
>> + return 0;
>> +}
>> +#endif
>>
>> #define _ROUND_UP(val, base) (((val) + (base) - 1) & ~(base - 1))
>> #define ROUND_UP(val, base) _ROUND_UP(val, (typeof(val))(base))
>> diff --git a/super1.c b/super1.c
>> index c49a3b3..bd88c36 100644
>> --- a/super1.c
>> +++ b/super1.c
>> @@ -1072,10 +1072,9 @@ static int update_super1(struct supertype *st,
>> struct mdinfo *info,
>> * ignored.
>> */
>> int rv = 0;
>> + int lockid;
>> struct mdp_superblock_1 *sb = st->sb;
>>
>> -#ifndef NO_DLM
>> - int lockid;
>> if (is_clustered(st)) {
>> rv = cluster_get_dlmlock(st, &lockid);
>> if (rv) {
>> @@ -1084,7 +1083,6 @@ static int update_super1(struct supertype *st,
>> struct mdinfo *info,
>> return rv;
>> }
>> }
>> -#endif
>>
>> if (strcmp(update, "homehost") == 0 &&
>> homehost) {
>> @@ -1342,10 +1340,9 @@ static int update_super1(struct supertype *st,
>> struct mdinfo *info,
>> rv = -1;
>>
>> sb->sb_csum = calc_sb_1_csum(sb);
>> -#ifndef NO_DLM
>> if (is_clustered(st))
>> cluster_release_dlmlock(st, lockid);
>> -#endif
>> +
>> return rv;
>> }
>>
>> @@ -1449,9 +1446,8 @@ static int add_to_super1(struct supertype *st,
>> mdu_disk_info_t *dk,
>> struct mdp_superblock_1 *sb = st->sb;
>> __u16 *rp = sb->dev_roles + dk->number;
>> struct devinfo *di, **dip;
>> -
>> -#ifndef NO_DLM
>> int rv, lockid;
>> +
>> if (is_clustered(st)) {
>> rv = cluster_get_dlmlock(st, &lockid);
>> if (rv) {
>> @@ -1460,7 +1456,7 @@ static int add_to_super1(struct supertype *st,
>> mdu_disk_info_t *dk,
>> return rv;
>> }
>> }
>> -#endif
>> +
>> if ((dk->state & 6) == 6) /* active, sync */
>> *rp = __cpu_to_le16(dk->raid_disk);
>> else if ((dk->state & ~2) == 0) /* active or idle -> spare */
>> @@ -1487,10 +1483,9 @@ static int add_to_super1(struct supertype *st,
>> mdu_disk_info_t *dk,
>> di->next = NULL;
>> *dip = di;
>>
>> -#ifndef NO_DLM
>> if (is_clustered(st))
>> cluster_release_dlmlock(st, lockid);
>> -#endif
>> +
>> return 0;
>> }
>> #endif
>> @@ -1504,9 +1499,8 @@ static int store_super1(struct supertype *st, int fd)
>> struct align_fd afd;
>> int sbsize;
>> unsigned long long dsize;
>> -
>> -#ifndef NO_DLM
>> int rv, lockid;
>> +
>> if (is_clustered(st)) {
>> rv = cluster_get_dlmlock(st, &lockid);
>> if (rv) {
>> @@ -1515,7 +1509,6 @@ static int store_super1(struct supertype *st, int fd)
>> return rv;
>> }
>> }
>> -#endif
>>
>> if (!get_dev_size(fd, NULL, &dsize))
>> return 1;
>> @@ -1576,10 +1569,9 @@ static int store_super1(struct supertype *st, int fd)
>> }
>> }
>> fsync(fd);
>> -#ifndef NO_DLM
>> if (is_clustered(st))
>> cluster_release_dlmlock(st, lockid);
>> -#endif
>> +
>> return 0;
>> }
>>
>> @@ -2329,7 +2321,6 @@ static int write_bitmap1(struct supertype *st, int
>> fd, enum bitmap_update update
>>
>> static void free_super1(struct supertype *st)
>> {
>> -#ifndef NO_DLM
>> int rv, lockid;
>> if (is_clustered(st)) {
>> rv = cluster_get_dlmlock(st, &lockid);
>> @@ -2339,7 +2330,7 @@ static void free_super1(struct supertype *st)
>> return;
>> }
>> }
>> -#endif
>> +
>> if (st->sb)
>> free(st->sb);
>> while (st->info) {
>> @@ -2350,10 +2341,8 @@ static void free_super1(struct supertype *st)
>> free(di);
>> }
>> st->sb = NULL;
>> -#ifndef NO_DLM
>> if (is_clustered(st))
>> cluster_release_dlmlock(st, lockid);
>> -#endif
>> }
>>
>> #ifndef MDASSEMBLE
>>
>> Thanks,
>> Guoqing
>
> Probably something like that - yes.  Unfortunately your mailer messed
> that up more that usual make it rather difficult to apply.
>
> I was wondering if we could make it a run-time dependency though .. a
> bit like get_cluster_name.  We can still do that later I guess.  I'm
> not really sure what is best at the moment.
>


Yes, I would second that: Making these functions a run-time dependency. 
We should have the ability for it to work by just installing libdlm 
(with the proper flags set), rather than recompiling the package for 
cluster features.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-29 22:02       ` NeilBrown
  2015-07-29 23:40         ` Goldwyn Rodrigues
@ 2015-07-30  3:16         ` Guoqing Jiang
  2015-07-30  8:22           ` Guoqing Jiang
  1 sibling, 1 reply; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-30  3:16 UTC (permalink / raw)
  To: NeilBrown; +Cc: rgoldwyn, linux-raid

Hi Neil,
>
> Probably something like that - yes.  Unfortunately your mailer messed
> that up more that usual make it rather difficult to apply.
>
>   
Sorry, it is just for review.
> I was wondering if we could make it a run-time dependency though .. a
> bit like get_cluster_name.  We can still do that later I guess.  I'm
> not really sure what is best at the moment.
>
>   
It could be, I am changing it and run some test, then the new version patch
will be send.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-29 23:40         ` Goldwyn Rodrigues
@ 2015-07-30  3:18           ` Guoqing Jiang
  2015-08-01 15:50           ` Wols Lists
  1 sibling, 0 replies; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-30  3:18 UTC (permalink / raw)
  To: Goldwyn Rodrigues; +Cc: NeilBrown, linux-raid

Hi Goldwyn,
>> Probably something like that - yes.  Unfortunately your mailer messed
>> that up more that usual make it rather difficult to apply.
>>
>> I was wondering if we could make it a run-time dependency though .. a
>> bit like get_cluster_name.  We can still do that later I guess.  I'm
>> not really sure what is best at the moment.
>>
>
>
> Yes, I would second that: Making these functions a run-time
> dependency. We should have the ability for it to work by just
> installing libdlm (with the proper flags set), rather than recompiling
> the package for cluster features.
>
NP, I should take a look at get_cluster_name first.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-30  3:16         ` Guoqing Jiang
@ 2015-07-30  8:22           ` Guoqing Jiang
  2015-07-30  8:38             ` Guoqing Jiang
  0 siblings, 1 reply; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-30  8:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: rgoldwyn, linux-raid

Guoqing Jiang wrote:
> Hi Neil,
>   
>> Probably something like that - yes.  Unfortunately your mailer messed
>> that up more that usual make it rather difficult to apply.
>>
>>   
>>     
> Sorry, it is just for review.
>   
>> I was wondering if we could make it a run-time dependency though .. a
>> bit like get_cluster_name.  We can still do that later I guess.  I'm
>> not really sure what is best at the moment.
>>
>>   
>>     
> It could be, I am changing it and run some test, then the new version patch
> will be send.
>   
After change to dynamic load dlm library, if compile mdadm without
related dlm library,
then some problems appeared, such as unlock dlm would fail when create
cluster-md,
the errno is 52 (EBADE: Invalid exchange).

But, if compile mdadm with previous code, then nothing wrong happened
whether
the dlm library is installed or not.

I will post the new patch which depends on run-time library anyway.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-30  8:22           ` Guoqing Jiang
@ 2015-07-30  8:38             ` Guoqing Jiang
  0 siblings, 0 replies; 13+ messages in thread
From: Guoqing Jiang @ 2015-07-30  8:38 UTC (permalink / raw)
  To: NeilBrown; +Cc: rgoldwyn, linux-raid

Guoqing Jiang wrote:
> Guoqing Jiang wrote:
>   
>> Hi Neil,
>>   
>>     
>>> Probably something like that - yes.  Unfortunately your mailer messed
>>> that up more that usual make it rather difficult to apply.
>>>
>>>   
>>>     
>>>       
>> Sorry, it is just for review.
>>   
>>     
>>> I was wondering if we could make it a run-time dependency though .. a
>>> bit like get_cluster_name.  We can still do that later I guess.  I'm
>>> not really sure what is best at the moment.
>>>
>>>   
>>>     
>>>       
>> It could be, I am changing it and run some test, then the new version patch
>> will be send.
>>   
>>     
> After change to dynamic load dlm library, if compile mdadm without
> related dlm library,
> then some problems appeared, such as unlock dlm would fail when create
> cluster-md,
> the errno is 52 (EBADE: Invalid exchange).
>
> But, if compile mdadm with previous code, then nothing wrong happened
> whether
> the dlm library is installed or not.
>
>   
Oops, if dlm library is not existed then the dlm code is not even
compiled in previous code,
so it doesn't complain about anything about dlm.

Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-07-29 23:40         ` Goldwyn Rodrigues
  2015-07-30  3:18           ` Guoqing Jiang
@ 2015-08-01 15:50           ` Wols Lists
  2015-08-01 15:54             ` Goldwyn Rodrigues
  1 sibling, 1 reply; 13+ messages in thread
From: Wols Lists @ 2015-08-01 15:50 UTC (permalink / raw)
  To: Goldwyn Rodrigues, NeilBrown, Guoqing Jiang; +Cc: linux-raid

On 30/07/15 00:40, Goldwyn Rodrigues wrote:
>>
>> I was wondering if we could make it a run-time dependency though .. a
>> bit like get_cluster_name.  We can still do that later I guess.  I'm
>> not really sure what is best at the moment.
>>
> 
> 
> Yes, I would second that: Making these functions a run-time dependency.
> We should have the ability for it to work by just installing libdlm
> (with the proper flags set), rather than recompiling the package for
> cluster features.

Or - maybe unlikely but maybe not - what happens if somebody installs a
binary that was built on a system WITH dlm, but is meant to run on a
system WITHOUT dlm?

Actually, wouldn't that be the typical case for a typical distro?

Cheers,
Wol

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] Safeguard against writing to an active device of another node
  2015-08-01 15:50           ` Wols Lists
@ 2015-08-01 15:54             ` Goldwyn Rodrigues
  0 siblings, 0 replies; 13+ messages in thread
From: Goldwyn Rodrigues @ 2015-08-01 15:54 UTC (permalink / raw)
  To: Wols Lists, NeilBrown, Guoqing Jiang; +Cc: linux-raid



On 08/01/2015 10:50 AM, Wols Lists wrote:
> On 30/07/15 00:40, Goldwyn Rodrigues wrote:
>>>
>>> I was wondering if we could make it a run-time dependency though .. a
>>> bit like get_cluster_name.  We can still do that later I guess.  I'm
>>> not really sure what is best at the moment.
>>>
>>
>>
>> Yes, I would second that: Making these functions a run-time dependency.
>> We should have the ability for it to work by just installing libdlm
>> (with the proper flags set), rather than recompiling the package for
>> cluster features.
>
> Or - maybe unlikely but maybe not - what happens if somebody installs a
> binary that was built on a system WITH dlm, but is meant to run on a
> system WITHOUT dlm?
>
> Actually, wouldn't that be the typical case for a typical distro?
>


That's exactly the case we are trying to cover in the above explanation. 
IOw, we would be using -ldl for runtime compatibility. If libdlm is not 
present, cluster functions will be resolved to null and will not be 
executed/allowed.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-08-01 15:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-06  8:52 [PATCH 1/3] mdadm: fix wrong condition for go to abort Guoqing Jiang
2015-07-06  8:52 ` [PATCH 2/3] md-cluster: use %-64s to print cluster_name Guoqing Jiang
2015-07-06  8:52 ` [PATCH 3/3] Safeguard against writing to an active device of another node Guoqing Jiang
2015-07-29  7:28   ` NeilBrown
2015-07-29 10:41     ` Guoqing Jiang
2015-07-29 22:02       ` NeilBrown
2015-07-29 23:40         ` Goldwyn Rodrigues
2015-07-30  3:18           ` Guoqing Jiang
2015-08-01 15:50           ` Wols Lists
2015-08-01 15:54             ` Goldwyn Rodrigues
2015-07-30  3:16         ` Guoqing Jiang
2015-07-30  8:22           ` Guoqing Jiang
2015-07-30  8:38             ` Guoqing Jiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.