All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] small multipath fixes
@ 2017-05-09 16:56 Benjamin Marzinski
  2017-05-09 16:57 ` [PATCH 1/6] libmultipath: print alias with no_path_retry message Benjamin Marzinski
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:56 UTC (permalink / raw)
  To: device-mapper development

Here are a number of small multipath fixes. These patches are based on
top of my and Martin's previous patch sets.

Benjamin Marzinski (6):
  libmultipath: print alias with no_path_retry message
  multipathd: force reload device on all resizes
  libmultipath: refactor calls to get dm device info
  libmultipath: fix suspended devs from failed reloads
  kpartx: fix device checks
  mpath_persist: Don't join threads that don't exist

 kpartx/kpartx.c                 |  23 +++--
 libmpathpersist/mpath_persist.c |  34 +++++---
 libmpathpersist/mpath_persist.h |   3 +-
 libmultipath/devmapper.c        | 184 +++++++++++-----------------------------
 libmultipath/devmapper.h        |   4 +-
 libmultipath/propsel.c          |   2 +-
 multipathd/cli_handlers.c       |  25 ++----
 7 files changed, 100 insertions(+), 175 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/6] libmultipath: print alias with no_path_retry message
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
@ 2017-05-09 16:57 ` Benjamin Marzinski
  2017-05-09 16:57 ` [PATCH 2/6] multipathd: force reload device on all resizes Benjamin Marzinski
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:57 UTC (permalink / raw)
  To: device-mapper development

the "flush_on_last_del in progress" message in select_no_path_retry
should print the device alias.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/propsel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 09fe728..d393710 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -456,7 +456,7 @@ int select_no_path_retry(struct config *conf, struct multipath *mp)
 	char buff[12];
 
 	if (mp->flush_on_last_del == FLUSH_IN_PROGRESS) {
-		condlog(0, "flush_on_last_del in progress");
+		condlog(0, "%s: flush_on_last_del in progress", mp->alias);
 		mp->no_path_retry = NO_PATH_RETRY_FAIL;
 		return 0;
 	}
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/6] multipathd: force reload device on all resizes
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
  2017-05-09 16:57 ` [PATCH 1/6] libmultipath: print alias with no_path_retry message Benjamin Marzinski
@ 2017-05-09 16:57 ` Benjamin Marzinski
  2017-05-09 16:57 ` [PATCH 3/6] libmultipath: refactor calls to get dm device info Benjamin Marzinski
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:57 UTC (permalink / raw)
  To: device-mapper development

My previous patch missed setting force_udev_reload on resizes initiated
by

multipathd resize map <map>

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/cli_handlers.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
index efc12dd..d598039 100644
--- a/multipathd/cli_handlers.c
+++ b/multipathd/cli_handlers.c
@@ -870,6 +870,7 @@ int resize_map(struct multipath *mpp, unsigned long long size,
 	update_mpp_paths(mpp, vecs->pathvec);
 	setup_map(mpp, params, PARAMS_SIZE);
 	mpp->action = ACT_RESIZE;
+	mpp->force_udev_reload = 1;
 	if (domap(mpp, params, 1) <= 0) {
 		condlog(0, "%s: failed to resize map : %s", mpp->alias,
 			strerror(errno));
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/6] libmultipath: refactor calls to get dm device info
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
  2017-05-09 16:57 ` [PATCH 1/6] libmultipath: print alias with no_path_retry message Benjamin Marzinski
  2017-05-09 16:57 ` [PATCH 2/6] multipathd: force reload device on all resizes Benjamin Marzinski
@ 2017-05-09 16:57 ` Benjamin Marzinski
  2017-05-11 20:26   ` Martin Wilck
  2017-05-09 16:57 ` [PATCH 4/6] libmultipath: fix suspended devs from failed reloads Benjamin Marzinski
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:57 UTC (permalink / raw)
  To: device-mapper development

Multipath has a number of functions that all do basically the same
thing to get members of the dm_info structure. This just refactors them
to use common code to do it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/devmapper.c  | 171 +++++++++-------------------------------------
 libmultipath/devmapper.h  |   3 +-
 multipathd/cli_handlers.c |  24 ++-----
 3 files changed, 42 insertions(+), 156 deletions(-)

diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index 1c6b87e..2c4a13a 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -399,16 +399,16 @@ int dm_addmap_reload(struct multipath *mpp, char *params, int flush)
 	return r;
 }
 
-int dm_map_present(const char * str)
+static int
+do_get_info(const char *name, struct dm_info *info)
 {
-	int r = 0;
+	int r = -1;
 	struct dm_task *dmt;
-	struct dm_info info;
 
 	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
-		return 0;
+		return r;
 
-	if (!dm_task_set_name(dmt, str))
+	if (!dm_task_set_name(dmt, name))
 		goto out;
 
 	dm_task_no_open_count(dmt);
@@ -416,16 +416,25 @@ int dm_map_present(const char * str)
 	if (!dm_task_run(dmt))
 		goto out;
 
-	if (!dm_task_get_info(dmt, &info))
+	if (!dm_task_get_info(dmt, info))
 		goto out;
 
-	if (info.exists)
-		r = 1;
+	if (!info->exists)
+		goto out;
+
+	r = 0;
 out:
 	dm_task_destroy(dmt);
 	return r;
 }
 
+int dm_map_present(const char * str)
+{
+	struct dm_info info;
+
+	return (do_get_info(str, &info) == 0);
+}
+
 int dm_get_map(const char *name, unsigned long long *size, char *outparams)
 {
 	int r = 1;
@@ -646,29 +655,15 @@ out:
 static int
 dm_dev_t (const char * mapname, char * dev_t, int len)
 {
-	int r = 1;
-	struct dm_task *dmt;
 	struct dm_info info;
 
-	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
-		return 0;
-
-	if (!dm_task_set_name(dmt, mapname))
-		goto out;
-
-	if (!dm_task_run(dmt))
-		goto out;
-
-	if (!dm_task_get_info(dmt, &info) || !info.exists)
-		goto out;
+	if (do_get_info(mapname, &info) != 0)
+		return 1;
 
 	if (snprintf(dev_t, len, "%i:%i", info.major, info.minor) > len)
-		goto out;
+		return 1;
 
-	r = 0;
-out:
-	dm_task_destroy(dmt);
-	return r;
+	return 0;
 }
 
 int
@@ -700,59 +695,16 @@ out:
 }
 
 int
-dm_get_major (char * mapname)
+dm_get_major_minor(const char *name, int *major, int *minor)
 {
-	int r = -1;
-	struct dm_task *dmt;
 	struct dm_info info;
 
-	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
-		return 0;
-
-	if (!dm_task_set_name(dmt, mapname))
-		goto out;
-
-	if (!dm_task_run(dmt))
-		goto out;
-
-	if (!dm_task_get_info(dmt, &info))
-		goto out;
-
-	if (!info.exists)
-		goto out;
-
-	r = info.major;
-out:
-	dm_task_destroy(dmt);
-	return r;
-}
-
-int
-dm_get_minor (char * mapname)
-{
-	int r = -1;
-	struct dm_task *dmt;
-	struct dm_info info;
-
-	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
-		return 0;
-
-	if (!dm_task_set_name(dmt, mapname))
-		goto out;
-
-	if (!dm_task_run(dmt))
-		goto out;
-
-	if (!dm_task_get_info(dmt, &info))
-		goto out;
-
-	if (!info.exists)
-		goto out;
+	if (do_get_info(name, &info) != 0)
+		return -1;
 
-	r = info.minor;
-out:
-	dm_task_destroy(dmt);
-	return r;
+	*major = info.major;
+	*minor = info.minor;
+	return 0;
 }
 
 static int
@@ -1070,31 +1022,12 @@ out:
 int
 dm_geteventnr (char *name)
 {
-	struct dm_task *dmt;
 	struct dm_info info;
-	int event = -1;
 
-	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
+	if (do_get_info(name, &info) != 0)
 		return -1;
 
-	if (!dm_task_set_name(dmt, name))
-		goto out;
-
-	dm_task_no_open_count(dmt);
-
-	if (!dm_task_run(dmt))
-		goto out;
-
-	if (!dm_task_get_info(dmt, &info))
-		goto out;
-
-	if (info.exists)
-		event = info.event_nr;
-
-out:
-	dm_task_destroy(dmt);
-
-	return event;
+	return info.event_nr;
 }
 
 char *
@@ -1262,26 +1195,12 @@ cancel_remove_partmap (const char *name, void *unused)
 static int
 dm_get_deferred_remove (char * mapname)
 {
-	int r = -1;
-	struct dm_task *dmt;
 	struct dm_info info;
 
-	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
+	if (do_get_info(mapname, &info) != 0)
 		return -1;
 
-	if (!dm_task_set_name(dmt, mapname))
-		goto out;
-
-	if (!dm_task_run(dmt))
-		goto out;
-
-	if (!dm_task_get_info(dmt, &info))
-		goto out;
-
-	r = info.deferred_remove;
-out:
-	dm_task_destroy(dmt);
-	return r;
+	return info.deferred_remove;
 }
 
 static int
@@ -1328,9 +1247,6 @@ alloc_dminfo (void)
 int
 dm_get_info (char * mapname, struct dm_info ** dmi)
 {
-	int r = 1;
-	struct dm_task *dmt = NULL;
-
 	if (!mapname)
 		return 1;
 
@@ -1340,32 +1256,13 @@ dm_get_info (char * mapname, struct dm_info ** dmi)
 	if (!*dmi)
 		return 1;
 
-	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
-		goto out;
-
-	if (!dm_task_set_name(dmt, mapname))
-		goto out;
-
-	dm_task_no_open_count(dmt);
-
-	if (!dm_task_run(dmt))
-		goto out;
-
-	if (!dm_task_get_info(dmt, *dmi))
-		goto out;
-
-	r = 0;
-out:
-	if (r) {
+	if (do_get_info(mapname, *dmi) != 0) {
 		memset(*dmi, 0, sizeof(struct dm_info));
 		FREE(*dmi);
 		*dmi = NULL;
+		return 1;
 	}
-
-	if (dmt)
-		dm_task_destroy(dmt);
-
-	return r;
+	return 0;
 }
 
 struct rename_data {
diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
index aca4454..5b66865 100644
--- a/libmultipath/devmapper.h
+++ b/libmultipath/devmapper.h
@@ -52,8 +52,7 @@ int dm_enablegroup(char * mapname, int index);
 int dm_disablegroup(char * mapname, int index);
 int dm_get_maps (vector mp);
 int dm_geteventnr (char *name);
-int dm_get_major (char *name);
-int dm_get_minor (char *name);
+int dm_get_major_minor (const char *name, int *major, int *minor);
 char * dm_mapname(int major, int minor);
 int dm_remove_partmaps (const char * mapname, int need_sync,
 			int deferred_remove);
diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
index d598039..04c7386 100644
--- a/multipathd/cli_handlers.c
+++ b/multipathd/cli_handlers.c
@@ -746,7 +746,7 @@ cli_add_map (void * v, char ** reply, int * len, void * data)
 	char * param = get_keyparam(v, MAP);
 	int major, minor;
 	char dev_path[PATH_SIZE];
-	char *alias, *refwwid;
+	char *refwwid, *alias = NULL;
 	int rc, count = 0;
 	struct config *conf;
 
@@ -763,14 +763,12 @@ cli_add_map (void * v, char ** reply, int * len, void * data)
 	}
 	put_multipath_config(conf);
 	do {
-		minor = dm_get_minor(param);
-		if (minor < 0)
+		if (dm_get_major_minor(param, &major, &minor) < 0)
 			condlog(2, "%s: not a device mapper table", param);
-		major = dm_get_major(param);
-		if (major < 0)
-			condlog(2, "%s: not a device mapper table", param);
-		sprintf(dev_path, "dm-%d", minor);
-		alias = dm_mapname(major, minor);
+		else {
+			sprintf(dev_path, "dm-%d", minor);
+			alias = dm_mapname(major, minor);
+		}
 		/*if there is no mapname found, we first create the device*/
 		if (!alias && !count) {
 			condlog(2, "%s: mapname not found for %d:%d",
@@ -804,23 +802,15 @@ cli_del_map (void * v, char ** reply, int * len, void * data)
 	struct vectors * vecs = (struct vectors *)data;
 	char * param = get_keyparam(v, MAP);
 	int major, minor;
-	char dev_path[PATH_SIZE];
 	char *alias;
 	int rc;
 
 	param = convert_dev(param, 0);
 	condlog(2, "%s: remove map (operator)", param);
-	minor = dm_get_minor(param);
-	if (minor < 0) {
-		condlog(2, "%s: not a device mapper table", param);
-		return 1;
-	}
-	major = dm_get_major(param);
-	if (major < 0) {
+	if (dm_get_major_minor(param, &major, &minor) < 0) {
 		condlog(2, "%s: not a device mapper table", param);
 		return 1;
 	}
-	sprintf(dev_path,"dm-%d", minor);
 	alias = dm_mapname(major, minor);
 	if (!alias) {
 		condlog(2, "%s: mapname not found for %d:%d",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/6] libmultipath: fix suspended devs from failed reloads
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2017-05-09 16:57 ` [PATCH 3/6] libmultipath: refactor calls to get dm device info Benjamin Marzinski
@ 2017-05-09 16:57 ` Benjamin Marzinski
  2017-05-11 20:26   ` Martin Wilck
  2017-05-09 16:57 ` [PATCH 5/6] kpartx: fix device checks Benjamin Marzinski
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:57 UTC (permalink / raw)
  To: device-mapper development

When multipath reloads a device, it can either fail while loading the
new table or while resuming the device. If it fails while resuming the
device, the device can get stuck in the suspended state.  To fix this,
multipath needs to resume the device again so that it can continue using
the old table.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/devmapper.c | 19 ++++++++++++++++++-
 libmultipath/devmapper.h |  1 +
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index 2c4a13a..69b634b 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -396,7 +396,13 @@ int dm_addmap_reload(struct multipath *mpp, char *params, int flush)
 	if (r)
 		r = dm_simplecmd(DM_DEVICE_RESUME, mpp->alias, !flush,
 				 1, udev_flags, 0);
-	return r;
+	if (r)
+		return r;
+
+	if (dm_is_suspended(mpp->alias))
+		dm_simplecmd(DM_DEVICE_RESUME, mpp->alias, !flush, 1,
+			     udev_flags, 0);
+	return 0;
 }
 
 static int
@@ -1030,6 +1036,17 @@ dm_geteventnr (char *name)
 	return info.event_nr;
 }
 
+int
+dm_is_suspended(const char *name)
+{
+	struct dm_info info;
+
+	if (do_get_info(name, &info) != 0)
+		return -1;
+
+	return info.suspended;
+}
+
 char *
 dm_mapname(int major, int minor)
 {
diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
index 5b66865..fa739bc 100644
--- a/libmultipath/devmapper.h
+++ b/libmultipath/devmapper.h
@@ -52,6 +52,7 @@ int dm_enablegroup(char * mapname, int index);
 int dm_disablegroup(char * mapname, int index);
 int dm_get_maps (vector mp);
 int dm_geteventnr (char *name);
+int dm_is_suspended(const char *name);
 int dm_get_major_minor (const char *name, int *major, int *minor);
 char * dm_mapname(int major, int minor);
 int dm_remove_partmaps (const char * mapname, int need_sync,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/6] kpartx: fix device checks
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2017-05-09 16:57 ` [PATCH 4/6] libmultipath: fix suspended devs from failed reloads Benjamin Marzinski
@ 2017-05-09 16:57 ` Benjamin Marzinski
  2017-05-11 20:30   ` Martin Wilck
  2017-05-09 16:57 ` [PATCH 6/6] mpath_persist: Don't join threads that don't exist Benjamin Marzinski
  2017-05-17 22:32 ` [PATCH 0/6] small multipath fixes Christophe Varoqui
  6 siblings, 1 reply; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:57 UTC (permalink / raw)
  To: device-mapper development

There are a number of issues in the the kpartx device checking code.
First, it accepts files that are neither regular files or a block device
nodes (you can run kpartx on character devices or directories, and it
will treat them as block devices). When trying to figure out the
basename of a device, the code returns garbage if the path doesn't
include a '/'. Finally, the set_delimiter code can access memory outside
of the string if an empty string is passed in.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 kpartx/kpartx.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/kpartx/kpartx.c b/kpartx/kpartx.c
index cc7e2e7..a1af156 100644
--- a/kpartx/kpartx.c
+++ b/kpartx/kpartx.c
@@ -135,10 +135,13 @@ set_delimiter (char * device, char * delimiter)
 {
 	char * p = device;
 
-	while (*(p++) != 0x0)
+	if (*p == 0x0)
+		return;
+
+	while (*(++p) != 0x0)
 		continue;
 
-	if (isdigit(*(p - 2)))
+	if (isdigit(*(p - 1)))
 		*delimiter = 'p';
 }
 
@@ -157,15 +160,17 @@ strip_slash (char * device)
 static int
 find_devname_offset (char * device)
 {
-	char *p, *q = NULL;
+	char *p, *q;
 
-	p = device;
+	q = p = device;
 
-	while (*p++)
+	while (*p) {
 		if (*p == '/')
-			q = p;
+			q = p + 1;
+		p++;
+	}
 
-	return (int)(q - device) + 1;
+	return (int)(q - device);
 }
 
 static char *
@@ -381,6 +386,10 @@ main(int argc, char **argv){
 			exit (1);
 		}
 	}
+	else if (!S_ISBLK(buf.st_mode)) {
+		fprintf(stderr, "invalid device: %s\n", device);
+		exit(1);
+	}
 
 	off = find_devname_offset(device);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 6/6] mpath_persist: Don't join threads that don't exist
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
                   ` (4 preceding siblings ...)
  2017-05-09 16:57 ` [PATCH 5/6] kpartx: fix device checks Benjamin Marzinski
@ 2017-05-09 16:57 ` Benjamin Marzinski
  2017-05-11 20:33   ` Martin Wilck
  2017-05-17 22:32 ` [PATCH 0/6] small multipath fixes Christophe Varoqui
  6 siblings, 1 reply; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-09 16:57 UTC (permalink / raw)
  To: device-mapper development

There are a couple of places in the mpath_persist code that were calling
pthread join, even if the pthread_create call failed, and the thread id
was undefined. This can cause crashes, so don't do it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmpathpersist/mpath_persist.c | 34 ++++++++++++++++++++++------------
 libmpathpersist/mpath_persist.h |  3 ++-
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
index 982c795..bf06efe 100644
--- a/libmpathpersist/mpath_persist.c
+++ b/libmpathpersist/mpath_persist.c
@@ -519,14 +519,17 @@ int mpath_prout_reg(struct multipath *mpp,int rq_servact, int rq_scope,
 			rc = pthread_create(&thread[count].id, &attr, mpath_prout_pthread_fn, (void *)(&thread[count].param));
 			if (rc){
 				condlog (0, "%s: failed to create thread %d", mpp->wwid, rc);
+				thread[count].param.status = MPATH_PR_THREAD_ERROR;
 			}
 			count = count + 1;
 		}
 	}
 	for( i=0; i < active_pathcount ; i++){
-		rc = pthread_join(thread[i].id, NULL);
-		if (rc){
-			condlog (0, "%s: Thread[%d] failed to join thread %d", mpp->wwid, i, rc);
+		if (thread[i].param.status != MPATH_PR_THREAD_ERROR) {
+			rc = pthread_join(thread[i].id, NULL);
+			if (rc){
+				condlog (0, "%s: Thread[%d] failed to join thread %d", mpp->wwid, i, rc);
+			}
 		}
 		if (!rollback && (thread[i].param.status == MPATH_PR_RESERV_CONFLICT)){
 			rollback = 1;
@@ -554,14 +557,17 @@ int mpath_prout_reg(struct multipath *mpp,int rq_servact, int rq_scope,
 						(void *)(&thread[i].param));
 				if (rc){
 					condlog (0, "%s: failed to create thread for rollback. %d",  mpp->wwid, rc);
+					thread[i].param.status = MPATH_PR_THREAD_ERROR;
 				}
 			}
 		}
 		for(i=0; i < active_pathcount ; i++){
-			rc = pthread_join(thread[i].id, NULL);
-			if (rc){
-				condlog (3, "%s: failed to join thread while rolling back %d",
-						mpp->wwid, i);
+			if (thread[i].param.status != MPATH_PR_THREAD_ERROR) {
+				rc = pthread_join(thread[i].id, NULL);
+				if (rc){
+					condlog (3, "%s: failed to join thread while rolling back %d",
+						 mpp->wwid, i);
+				}
 			}
 		}
 	}
@@ -630,7 +636,7 @@ int send_prout_activepath(char * dev, int rq_servact, int rq_scope,
 	rc = pthread_create(&thread, &attr, mpath_prout_pthread_fn, (void *)(&param));
 	if (rc){
 		condlog (3, "%s: failed to create thread %d", dev, rc);
-		return MPATH_PR_OTHER;
+		return MPATH_PR_THREAD_ERROR;
 	}
 	/* Free attribute and wait for the other threads */
 	pthread_attr_destroy(&attr);
@@ -695,16 +701,20 @@ int mpath_prout_rel(struct multipath *mpp,int rq_servact, int rq_scope,
 			condlog (3, "%s: sending pr out command to %s", mpp->wwid, pp->dev);
 			rc = pthread_create (&thread[count].id, &attr, mpath_prout_pthread_fn,
 					(void *) (&thread[count].param));
-			if (rc)
+			if (rc) {
 				condlog (0, "%s: failed to create thread. %d",  mpp->wwid, rc);
+				thread[count].param.status = MPATH_PR_THREAD_ERROR;
+			}
 			count = count + 1;
 		}
 	}
 	pthread_attr_destroy (&attr);
 	for (i = 0; i < active_pathcount; i++){
-		rc = pthread_join (thread[i].id, NULL);
-		if (rc){
-			condlog (1, "%s: failed to join thread.  %d",  mpp->wwid,  rc);
+		if (thread[i].param.status != MPATH_PR_THREAD_ERROR) {
+			rc = pthread_join (thread[i].id, NULL);
+			if (rc){
+				condlog (1, "%s: failed to join thread.  %d",  mpp->wwid,  rc);
+			}
 		}
 	}
 
diff --git a/libmpathpersist/mpath_persist.h b/libmpathpersist/mpath_persist.h
index 79de5b5..0f95943 100644
--- a/libmpathpersist/mpath_persist.h
+++ b/libmpathpersist/mpath_persist.h
@@ -59,7 +59,8 @@ extern "C" {
 #define MPATH_PR_RESERV_CONFLICT	11  /* Reservation conflict on the device */
 #define MPATH_PR_FILE_ERROR		12  /* file (device node) problems(e.g. not found)*/
 #define MPATH_PR_DMMP_ERROR		13  /* DMMP related error.(e.g Error in getting dm info */
-#define MPATH_PR_OTHER			14  /*other error/warning has occurred(transport
+#define MPATH_PR_THREAD_ERROR		14  /* pthreads error (e.g. unable to create new thread) */
+#define MPATH_PR_OTHER			15  /*other error/warning has occurred(transport
 					      or driver error) */
 
 /* PR MASK */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/6] libmultipath: refactor calls to get dm device info
  2017-05-09 16:57 ` [PATCH 3/6] libmultipath: refactor calls to get dm device info Benjamin Marzinski
@ 2017-05-11 20:26   ` Martin Wilck
  2017-05-12 16:59     ` Benjamin Marzinski
  0 siblings, 1 reply; 16+ messages in thread
From: Martin Wilck @ 2017-05-11 20:26 UTC (permalink / raw)
  To: dm-devel, Benjamin Marzinski

On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> Multipath has a number of functions that all do basically the same
> thing to get members of the dm_info structure. This just refactors
> them
> to use common code to do it.

This is nice, but it might be even better to simply store all relevant
information from DM_DEVICE_INFO in the mpp structure in the first
place, avoiding the need to do this ioctl multiple times to get various
pieces of information.

That can be done in a separate patch, of course.

Reviewed-by: Martin Wilck <mwilck@suse.com>

Regards
Martin


> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/devmapper.c  | 171 +++++++++---------------------------
> ----------
>  libmultipath/devmapper.h  |   3 +-
>  multipathd/cli_handlers.c |  24 ++-----
>  3 files changed, 42 insertions(+), 156 deletions(-)
> 
> diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> index 1c6b87e..2c4a13a 100644
> --- a/libmultipath/devmapper.c
> +++ b/libmultipath/devmapper.c
> @@ -399,16 +399,16 @@ int dm_addmap_reload(struct multipath *mpp,
> char *params, int flush)
>  	return r;
>  }
>  
> -int dm_map_present(const char * str)
> +static int
> +do_get_info(const char *name, struct dm_info *info)
>  {
> -	int r = 0;
> +	int r = -1;
>  	struct dm_task *dmt;
> -	struct dm_info info;
>  
>  	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> -		return 0;
> +		return r;
>  
> -	if (!dm_task_set_name(dmt, str))
> +	if (!dm_task_set_name(dmt, name))
>  		goto out;
>  
>  	dm_task_no_open_count(dmt);
> @@ -416,16 +416,25 @@ int dm_map_present(const char * str)
>  	if (!dm_task_run(dmt))
>  		goto out;
>  
> -	if (!dm_task_get_info(dmt, &info))
> +	if (!dm_task_get_info(dmt, info))
>  		goto out;
>  
> -	if (info.exists)
> -		r = 1;
> +	if (!info->exists)
> +		goto out;
> +
> +	r = 0;
>  out:
>  	dm_task_destroy(dmt);
>  	return r;
>  }
>  
> +int dm_map_present(const char * str)
> +{
> +	struct dm_info info;
> +
> +	return (do_get_info(str, &info) == 0);
> +}
> +
>  int dm_get_map(const char *name, unsigned long long *size, char
> *outparams)
>  {
>  	int r = 1;
> @@ -646,29 +655,15 @@ out:
>  static int
>  dm_dev_t (const char * mapname, char * dev_t, int len)
>  {
> -	int r = 1;
> -	struct dm_task *dmt;
>  	struct dm_info info;
>  
> -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> -		return 0;
> -
> -	if (!dm_task_set_name(dmt, mapname))
> -		goto out;
> -
> -	if (!dm_task_run(dmt))
> -		goto out;
> -
> -	if (!dm_task_get_info(dmt, &info) || !info.exists)
> -		goto out;
> +	if (do_get_info(mapname, &info) != 0)
> +		return 1;
>  
>  	if (snprintf(dev_t, len, "%i:%i", info.major, info.minor) >
> len)
> -		goto out;
> +		return 1;
>  
> -	r = 0;
> -out:
> -	dm_task_destroy(dmt);
> -	return r;
> +	return 0;
>  }
>  
>  int
> @@ -700,59 +695,16 @@ out:
>  }
>  
>  int
> -dm_get_major (char * mapname)
> +dm_get_major_minor(const char *name, int *major, int *minor)
>  {
> -	int r = -1;
> -	struct dm_task *dmt;
>  	struct dm_info info;
>  
> -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> -		return 0;
> -
> -	if (!dm_task_set_name(dmt, mapname))
> -		goto out;
> -
> -	if (!dm_task_run(dmt))
> -		goto out;
> -
> -	if (!dm_task_get_info(dmt, &info))
> -		goto out;
> -
> -	if (!info.exists)
> -		goto out;
> -
> -	r = info.major;
> -out:
> -	dm_task_destroy(dmt);
> -	return r;
> -}
> -
> -int
> -dm_get_minor (char * mapname)
> -{
> -	int r = -1;
> -	struct dm_task *dmt;
> -	struct dm_info info;
> -
> -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> -		return 0;
> -
> -	if (!dm_task_set_name(dmt, mapname))
> -		goto out;
> -
> -	if (!dm_task_run(dmt))
> -		goto out;
> -
> -	if (!dm_task_get_info(dmt, &info))
> -		goto out;
> -
> -	if (!info.exists)
> -		goto out;
> +	if (do_get_info(name, &info) != 0)
> +		return -1;
>  
> -	r = info.minor;
> -out:
> -	dm_task_destroy(dmt);
> -	return r;
> +	*major = info.major;
> +	*minor = info.minor;
> +	return 0;
>  }
>  
>  static int
> @@ -1070,31 +1022,12 @@ out:
>  int
>  dm_geteventnr (char *name)
>  {
> -	struct dm_task *dmt;
>  	struct dm_info info;
> -	int event = -1;
>  
> -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> +	if (do_get_info(name, &info) != 0)
>  		return -1;
>  
> -	if (!dm_task_set_name(dmt, name))
> -		goto out;
> -
> -	dm_task_no_open_count(dmt);
> -
> -	if (!dm_task_run(dmt))
> -		goto out;
> -
> -	if (!dm_task_get_info(dmt, &info))
> -		goto out;
> -
> -	if (info.exists)
> -		event = info.event_nr;
> -
> -out:
> -	dm_task_destroy(dmt);
> -
> -	return event;
> +	return info.event_nr;
>  }
>  
>  char *
> @@ -1262,26 +1195,12 @@ cancel_remove_partmap (const char *name, void
> *unused)
>  static int
>  dm_get_deferred_remove (char * mapname)
>  {
> -	int r = -1;
> -	struct dm_task *dmt;
>  	struct dm_info info;
>  
> -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> +	if (do_get_info(mapname, &info) != 0)
>  		return -1;
>  
> -	if (!dm_task_set_name(dmt, mapname))
> -		goto out;
> -
> -	if (!dm_task_run(dmt))
> -		goto out;
> -
> -	if (!dm_task_get_info(dmt, &info))
> -		goto out;
> -
> -	r = info.deferred_remove;
> -out:
> -	dm_task_destroy(dmt);
> -	return r;
> +	return info.deferred_remove;
>  }
>  
>  static int
> @@ -1328,9 +1247,6 @@ alloc_dminfo (void)
>  int
>  dm_get_info (char * mapname, struct dm_info ** dmi)
>  {
> -	int r = 1;
> -	struct dm_task *dmt = NULL;
> -
>  	if (!mapname)
>  		return 1;
>  
> @@ -1340,32 +1256,13 @@ dm_get_info (char * mapname, struct dm_info
> ** dmi)
>  	if (!*dmi)
>  		return 1;
>  
> -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> -		goto out;
> -
> -	if (!dm_task_set_name(dmt, mapname))
> -		goto out;
> -
> -	dm_task_no_open_count(dmt);
> -
> -	if (!dm_task_run(dmt))
> -		goto out;
> -
> -	if (!dm_task_get_info(dmt, *dmi))
> -		goto out;
> -
> -	r = 0;
> -out:
> -	if (r) {
> +	if (do_get_info(mapname, *dmi) != 0) {
>  		memset(*dmi, 0, sizeof(struct dm_info));
>  		FREE(*dmi);
>  		*dmi = NULL;
> +		return 1;
>  	}
> -
> -	if (dmt)
> -		dm_task_destroy(dmt);
> -
> -	return r;
> +	return 0;
>  }
>  
>  struct rename_data {
> diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
> index aca4454..5b66865 100644
> --- a/libmultipath/devmapper.h
> +++ b/libmultipath/devmapper.h
> @@ -52,8 +52,7 @@ int dm_enablegroup(char * mapname, int index);
>  int dm_disablegroup(char * mapname, int index);
>  int dm_get_maps (vector mp);
>  int dm_geteventnr (char *name);
> -int dm_get_major (char *name);
> -int dm_get_minor (char *name);
> +int dm_get_major_minor (const char *name, int *major, int *minor);
>  char * dm_mapname(int major, int minor);
>  int dm_remove_partmaps (const char * mapname, int need_sync,
>  			int deferred_remove);
> diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
> index d598039..04c7386 100644
> --- a/multipathd/cli_handlers.c
> +++ b/multipathd/cli_handlers.c
> @@ -746,7 +746,7 @@ cli_add_map (void * v, char ** reply, int * len,
> void * data)
>  	char * param = get_keyparam(v, MAP);
>  	int major, minor;
>  	char dev_path[PATH_SIZE];
> -	char *alias, *refwwid;
> +	char *refwwid, *alias = NULL;
>  	int rc, count = 0;
>  	struct config *conf;
>  
> @@ -763,14 +763,12 @@ cli_add_map (void * v, char ** reply, int *
> len, void * data)
>  	}
>  	put_multipath_config(conf);
>  	do {
> -		minor = dm_get_minor(param);
> -		if (minor < 0)
> +		if (dm_get_major_minor(param, &major, &minor) < 0)
>  			condlog(2, "%s: not a device mapper table",
> param);
> -		major = dm_get_major(param);
> -		if (major < 0)
> -			condlog(2, "%s: not a device mapper table",
> param);
> -		sprintf(dev_path, "dm-%d", minor);
> -		alias = dm_mapname(major, minor);
> +		else {
> +			sprintf(dev_path, "dm-%d", minor);
> +			alias = dm_mapname(major, minor);
> +		}
>  		/*if there is no mapname found, we first create the
> device*/
>  		if (!alias && !count) {
>  			condlog(2, "%s: mapname not found for
> %d:%d",
> @@ -804,23 +802,15 @@ cli_del_map (void * v, char ** reply, int *
> len, void * data)
>  	struct vectors * vecs = (struct vectors *)data;
>  	char * param = get_keyparam(v, MAP);
>  	int major, minor;
> -	char dev_path[PATH_SIZE];
>  	char *alias;
>  	int rc;
>  
>  	param = convert_dev(param, 0);
>  	condlog(2, "%s: remove map (operator)", param);
> -	minor = dm_get_minor(param);
> -	if (minor < 0) {
> -		condlog(2, "%s: not a device mapper table", param);
> -		return 1;
> -	}
> -	major = dm_get_major(param);
> -	if (major < 0) {
> +	if (dm_get_major_minor(param, &major, &minor) < 0) {
>  		condlog(2, "%s: not a device mapper table", param);
>  		return 1;
>  	}
> -	sprintf(dev_path,"dm-%d", minor);
>  	alias = dm_mapname(major, minor);
>  	if (!alias) {
>  		condlog(2, "%s: mapname not found for %d:%d",

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/6] libmultipath: fix suspended devs from failed reloads
  2017-05-09 16:57 ` [PATCH 4/6] libmultipath: fix suspended devs from failed reloads Benjamin Marzinski
@ 2017-05-11 20:26   ` Martin Wilck
  2017-05-16 17:33     ` Benjamin Marzinski
  0 siblings, 1 reply; 16+ messages in thread
From: Martin Wilck @ 2017-05-11 20:26 UTC (permalink / raw)
  To: dm-devel

On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> When multipath reloads a device, it can either fail while loading the
> new table or while resuming the device. If it fails while resuming
> the
> device, the device can get stuck in the suspended state.  To fix
> this,
> multipath needs to resume the device again so that it can continue
> using
> the old table.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/devmapper.c | 19 ++++++++++++++++++-
>  libmultipath/devmapper.h |  1 +
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> index 2c4a13a..69b634b 100644
> --- a/libmultipath/devmapper.c
> +++ b/libmultipath/devmapper.c
> @@ -396,7 +396,13 @@ int dm_addmap_reload(struct multipath *mpp, char
> *params, int flush)
>  	if (r)
>  		r = dm_simplecmd(DM_DEVICE_RESUME, mpp->alias,
> !flush,
>  				 1, udev_flags, 0);
> -	return r;
> +	if (r)
> +		return r;
> +
> +	if (dm_is_suspended(mpp->alias))
> +		dm_simplecmd(DM_DEVICE_RESUME, mpp->alias, !flush,
> 1,
> +			     udev_flags, 0);
> +	return 0;
>  }

Why would the second DM_DEVICE_RESUME call succeed if the first one
failed?

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/6] kpartx: fix device checks
  2017-05-09 16:57 ` [PATCH 5/6] kpartx: fix device checks Benjamin Marzinski
@ 2017-05-11 20:30   ` Martin Wilck
  0 siblings, 0 replies; 16+ messages in thread
From: Martin Wilck @ 2017-05-11 20:30 UTC (permalink / raw)
  To: dm-devel

On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> There are a number of issues in the the kpartx device checking code.
> First, it accepts files that are neither regular files or a block
> device
> nodes (you can run kpartx on character devices or directories, and it
> will treat them as block devices). When trying to figure out the
> basename of a device, the code returns garbage if the path doesn't
> include a '/'. Finally, the set_delimiter code can access memory
> outside
> of the string if an empty string is passed in.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 6/6] mpath_persist: Don't join threads that don't exist
  2017-05-09 16:57 ` [PATCH 6/6] mpath_persist: Don't join threads that don't exist Benjamin Marzinski
@ 2017-05-11 20:33   ` Martin Wilck
  0 siblings, 0 replies; 16+ messages in thread
From: Martin Wilck @ 2017-05-11 20:33 UTC (permalink / raw)
  To: Benjamin Marzinski, device-mapper development

On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> There are a couple of places in the mpath_persist code that were
> calling
> pthread join, even if the pthread_create call failed, and the thread
> id
> was undefined. This can cause crashes, so don't do it.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/6] libmultipath: refactor calls to get dm device info
  2017-05-11 20:26   ` Martin Wilck
@ 2017-05-12 16:59     ` Benjamin Marzinski
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-12 16:59 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Thu, May 11, 2017 at 10:26:00PM +0200, Martin Wilck wrote:
> On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> > Multipath has a number of functions that all do basically the same
> > thing to get members of the dm_info structure. This just refactors
> > them
> > to use common code to do it.
> 
> This is nice, but it might be even better to simply store all relevant
> information from DM_DEVICE_INFO in the mpp structure in the first
> place, avoiding the need to do this ioctl multiple times to get various
> pieces of information.

I totally agree that we are likely underusing mpp->dmi, but in a number
of these cases, you actually need the current information. In still
others, you don't necessarily even have a multipath structure to put it
in.  So, I think dm_get_info needs to simply return the info.

> 
> That can be done in a separate patch, of course.

Yeah. We definitely need to look at all the times where we grab a fresh
copy of the device info, and see if we are actually looking at a field
that can change over time, where we need the current data. mpp->dmi is
great for when we don't need current data, and we should definitely make
sure we're using it there. On the other hand, it's pointless to keep
freeing and mallocing it when we really need current info.

-Ben 

> Reviewed-by: Martin Wilck <mwilck@suse.com>
> 
> Regards
> Martin
> 
> 
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/devmapper.c  | 171 +++++++++---------------------------
> > ----------
> >  libmultipath/devmapper.h  |   3 +-
> >  multipathd/cli_handlers.c |  24 ++-----
> >  3 files changed, 42 insertions(+), 156 deletions(-)
> > 
> > diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> > index 1c6b87e..2c4a13a 100644
> > --- a/libmultipath/devmapper.c
> > +++ b/libmultipath/devmapper.c
> > @@ -399,16 +399,16 @@ int dm_addmap_reload(struct multipath *mpp,
> > char *params, int flush)
> >  	return r;
> >  }
> >  
> > -int dm_map_present(const char * str)
> > +static int
> > +do_get_info(const char *name, struct dm_info *info)
> >  {
> > -	int r = 0;
> > +	int r = -1;
> >  	struct dm_task *dmt;
> > -	struct dm_info info;
> >  
> >  	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > -		return 0;
> > +		return r;
> >  
> > -	if (!dm_task_set_name(dmt, str))
> > +	if (!dm_task_set_name(dmt, name))
> >  		goto out;
> >  
> >  	dm_task_no_open_count(dmt);
> > @@ -416,16 +416,25 @@ int dm_map_present(const char * str)
> >  	if (!dm_task_run(dmt))
> >  		goto out;
> >  
> > -	if (!dm_task_get_info(dmt, &info))
> > +	if (!dm_task_get_info(dmt, info))
> >  		goto out;
> >  
> > -	if (info.exists)
> > -		r = 1;
> > +	if (!info->exists)
> > +		goto out;
> > +
> > +	r = 0;
> >  out:
> >  	dm_task_destroy(dmt);
> >  	return r;
> >  }
> >  
> > +int dm_map_present(const char * str)
> > +{
> > +	struct dm_info info;
> > +
> > +	return (do_get_info(str, &info) == 0);
> > +}
> > +
> >  int dm_get_map(const char *name, unsigned long long *size, char
> > *outparams)
> >  {
> >  	int r = 1;
> > @@ -646,29 +655,15 @@ out:
> >  static int
> >  dm_dev_t (const char * mapname, char * dev_t, int len)
> >  {
> > -	int r = 1;
> > -	struct dm_task *dmt;
> >  	struct dm_info info;
> >  
> > -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > -		return 0;
> > -
> > -	if (!dm_task_set_name(dmt, mapname))
> > -		goto out;
> > -
> > -	if (!dm_task_run(dmt))
> > -		goto out;
> > -
> > -	if (!dm_task_get_info(dmt, &info) || !info.exists)
> > -		goto out;
> > +	if (do_get_info(mapname, &info) != 0)
> > +		return 1;
> >  
> >  	if (snprintf(dev_t, len, "%i:%i", info.major, info.minor) >
> > len)
> > -		goto out;
> > +		return 1;
> >  
> > -	r = 0;
> > -out:
> > -	dm_task_destroy(dmt);
> > -	return r;
> > +	return 0;
> >  }
> >  
> >  int
> > @@ -700,59 +695,16 @@ out:
> >  }
> >  
> >  int
> > -dm_get_major (char * mapname)
> > +dm_get_major_minor(const char *name, int *major, int *minor)
> >  {
> > -	int r = -1;
> > -	struct dm_task *dmt;
> >  	struct dm_info info;
> >  
> > -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > -		return 0;
> > -
> > -	if (!dm_task_set_name(dmt, mapname))
> > -		goto out;
> > -
> > -	if (!dm_task_run(dmt))
> > -		goto out;
> > -
> > -	if (!dm_task_get_info(dmt, &info))
> > -		goto out;
> > -
> > -	if (!info.exists)
> > -		goto out;
> > -
> > -	r = info.major;
> > -out:
> > -	dm_task_destroy(dmt);
> > -	return r;
> > -}
> > -
> > -int
> > -dm_get_minor (char * mapname)
> > -{
> > -	int r = -1;
> > -	struct dm_task *dmt;
> > -	struct dm_info info;
> > -
> > -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > -		return 0;
> > -
> > -	if (!dm_task_set_name(dmt, mapname))
> > -		goto out;
> > -
> > -	if (!dm_task_run(dmt))
> > -		goto out;
> > -
> > -	if (!dm_task_get_info(dmt, &info))
> > -		goto out;
> > -
> > -	if (!info.exists)
> > -		goto out;
> > +	if (do_get_info(name, &info) != 0)
> > +		return -1;
> >  
> > -	r = info.minor;
> > -out:
> > -	dm_task_destroy(dmt);
> > -	return r;
> > +	*major = info.major;
> > +	*minor = info.minor;
> > +	return 0;
> >  }
> >  
> >  static int
> > @@ -1070,31 +1022,12 @@ out:
> >  int
> >  dm_geteventnr (char *name)
> >  {
> > -	struct dm_task *dmt;
> >  	struct dm_info info;
> > -	int event = -1;
> >  
> > -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > +	if (do_get_info(name, &info) != 0)
> >  		return -1;
> >  
> > -	if (!dm_task_set_name(dmt, name))
> > -		goto out;
> > -
> > -	dm_task_no_open_count(dmt);
> > -
> > -	if (!dm_task_run(dmt))
> > -		goto out;
> > -
> > -	if (!dm_task_get_info(dmt, &info))
> > -		goto out;
> > -
> > -	if (info.exists)
> > -		event = info.event_nr;
> > -
> > -out:
> > -	dm_task_destroy(dmt);
> > -
> > -	return event;
> > +	return info.event_nr;
> >  }
> >  
> >  char *
> > @@ -1262,26 +1195,12 @@ cancel_remove_partmap (const char *name, void
> > *unused)
> >  static int
> >  dm_get_deferred_remove (char * mapname)
> >  {
> > -	int r = -1;
> > -	struct dm_task *dmt;
> >  	struct dm_info info;
> >  
> > -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > +	if (do_get_info(mapname, &info) != 0)
> >  		return -1;
> >  
> > -	if (!dm_task_set_name(dmt, mapname))
> > -		goto out;
> > -
> > -	if (!dm_task_run(dmt))
> > -		goto out;
> > -
> > -	if (!dm_task_get_info(dmt, &info))
> > -		goto out;
> > -
> > -	r = info.deferred_remove;
> > -out:
> > -	dm_task_destroy(dmt);
> > -	return r;
> > +	return info.deferred_remove;
> >  }
> >  
> >  static int
> > @@ -1328,9 +1247,6 @@ alloc_dminfo (void)
> >  int
> >  dm_get_info (char * mapname, struct dm_info ** dmi)
> >  {
> > -	int r = 1;
> > -	struct dm_task *dmt = NULL;
> > -
> >  	if (!mapname)
> >  		return 1;
> >  
> > @@ -1340,32 +1256,13 @@ dm_get_info (char * mapname, struct dm_info
> > ** dmi)
> >  	if (!*dmi)
> >  		return 1;
> >  
> > -	if (!(dmt = dm_task_create(DM_DEVICE_INFO)))
> > -		goto out;
> > -
> > -	if (!dm_task_set_name(dmt, mapname))
> > -		goto out;
> > -
> > -	dm_task_no_open_count(dmt);
> > -
> > -	if (!dm_task_run(dmt))
> > -		goto out;
> > -
> > -	if (!dm_task_get_info(dmt, *dmi))
> > -		goto out;
> > -
> > -	r = 0;
> > -out:
> > -	if (r) {
> > +	if (do_get_info(mapname, *dmi) != 0) {
> >  		memset(*dmi, 0, sizeof(struct dm_info));
> >  		FREE(*dmi);
> >  		*dmi = NULL;
> > +		return 1;
> >  	}
> > -
> > -	if (dmt)
> > -		dm_task_destroy(dmt);
> > -
> > -	return r;
> > +	return 0;
> >  }
> >  
> >  struct rename_data {
> > diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
> > index aca4454..5b66865 100644
> > --- a/libmultipath/devmapper.h
> > +++ b/libmultipath/devmapper.h
> > @@ -52,8 +52,7 @@ int dm_enablegroup(char * mapname, int index);
> >  int dm_disablegroup(char * mapname, int index);
> >  int dm_get_maps (vector mp);
> >  int dm_geteventnr (char *name);
> > -int dm_get_major (char *name);
> > -int dm_get_minor (char *name);
> > +int dm_get_major_minor (const char *name, int *major, int *minor);
> >  char * dm_mapname(int major, int minor);
> >  int dm_remove_partmaps (const char * mapname, int need_sync,
> >  			int deferred_remove);
> > diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
> > index d598039..04c7386 100644
> > --- a/multipathd/cli_handlers.c
> > +++ b/multipathd/cli_handlers.c
> > @@ -746,7 +746,7 @@ cli_add_map (void * v, char ** reply, int * len,
> > void * data)
> >  	char * param = get_keyparam(v, MAP);
> >  	int major, minor;
> >  	char dev_path[PATH_SIZE];
> > -	char *alias, *refwwid;
> > +	char *refwwid, *alias = NULL;
> >  	int rc, count = 0;
> >  	struct config *conf;
> >  
> > @@ -763,14 +763,12 @@ cli_add_map (void * v, char ** reply, int *
> > len, void * data)
> >  	}
> >  	put_multipath_config(conf);
> >  	do {
> > -		minor = dm_get_minor(param);
> > -		if (minor < 0)
> > +		if (dm_get_major_minor(param, &major, &minor) < 0)
> >  			condlog(2, "%s: not a device mapper table",
> > param);
> > -		major = dm_get_major(param);
> > -		if (major < 0)
> > -			condlog(2, "%s: not a device mapper table",
> > param);
> > -		sprintf(dev_path, "dm-%d", minor);
> > -		alias = dm_mapname(major, minor);
> > +		else {
> > +			sprintf(dev_path, "dm-%d", minor);
> > +			alias = dm_mapname(major, minor);
> > +		}
> >  		/*if there is no mapname found, we first create the
> > device*/
> >  		if (!alias && !count) {
> >  			condlog(2, "%s: mapname not found for
> > %d:%d",
> > @@ -804,23 +802,15 @@ cli_del_map (void * v, char ** reply, int *
> > len, void * data)
> >  	struct vectors * vecs = (struct vectors *)data;
> >  	char * param = get_keyparam(v, MAP);
> >  	int major, minor;
> > -	char dev_path[PATH_SIZE];
> >  	char *alias;
> >  	int rc;
> >  
> >  	param = convert_dev(param, 0);
> >  	condlog(2, "%s: remove map (operator)", param);
> > -	minor = dm_get_minor(param);
> > -	if (minor < 0) {
> > -		condlog(2, "%s: not a device mapper table", param);
> > -		return 1;
> > -	}
> > -	major = dm_get_major(param);
> > -	if (major < 0) {
> > +	if (dm_get_major_minor(param, &major, &minor) < 0) {
> >  		condlog(2, "%s: not a device mapper table", param);
> >  		return 1;
> >  	}
> > -	sprintf(dev_path,"dm-%d", minor);
> >  	alias = dm_mapname(major, minor);
> >  	if (!alias) {
> >  		condlog(2, "%s: mapname not found for %d:%d",
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/6] libmultipath: fix suspended devs from failed reloads
  2017-05-11 20:26   ` Martin Wilck
@ 2017-05-16 17:33     ` Benjamin Marzinski
  2017-05-16 18:56       ` Martin Wilck
  0 siblings, 1 reply; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-16 17:33 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Thu, May 11, 2017 at 10:26:52PM +0200, Martin Wilck wrote:
> On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> > When multipath reloads a device, it can either fail while loading the
> > new table or while resuming the device. If it fails while resuming
> > the
> > device, the device can get stuck in the suspended state.  To fix
> > this,
> > multipath needs to resume the device again so that it can continue
> > using
> > the old table.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/devmapper.c | 19 ++++++++++++++++++-
> >  libmultipath/devmapper.h |  1 +
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> > index 2c4a13a..69b634b 100644
> > --- a/libmultipath/devmapper.c
> > +++ b/libmultipath/devmapper.c
> > @@ -396,7 +396,13 @@ int dm_addmap_reload(struct multipath *mpp, char
> > *params, int flush)
> >  	if (r)
> >  		r = dm_simplecmd(DM_DEVICE_RESUME, mpp->alias,
> > !flush,
> >  				 1, udev_flags, 0);
> > -	return r;
> > +	if (r)
> > +		return r;
> > +
> > +	if (dm_is_suspended(mpp->alias))
> > +		dm_simplecmd(DM_DEVICE_RESUME, mpp->alias, !flush,
> > 1,
> > +			     udev_flags, 0);
> > +	return 0;
> >  }
> 
> Why would the second DM_DEVICE_RESUME call succeed if the first one
> failed?

Because if the first resume fails, device-mapper rolls back to the
original table.

The specific way that someone found this was by running a

multipathd resize

when only some of the paths had been rescaned and noticed the new,
larger, size. Since the first path had changed size (that's all the
multipath looks at to find the new size) the multipath device tried to
reload larger. However, when it tried to resume with the larger size, it
failed since some of the devices were too small.  The second resume lets
it try again with the original table.

Of course, that isn't the only way that this could fail, and it's nice
to be able to gracefully continue using the old table instead of just
being suspended.

-Ben

> 
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/6] libmultipath: fix suspended devs from failed reloads
  2017-05-16 17:33     ` Benjamin Marzinski
@ 2017-05-16 18:56       ` Martin Wilck
  2017-05-17 16:25         ` Benjamin Marzinski
  0 siblings, 1 reply; 16+ messages in thread
From: Martin Wilck @ 2017-05-16 18:56 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Tue, 2017-05-16 at 12:33 -0500, Benjamin Marzinski wrote:
> On Thu, May 11, 2017 at 10:26:52PM +0200, Martin Wilck wrote:
> > On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> > > When multipath reloads a device, it can either fail while loading
> > > the
> > > new table or while resuming the device. If it fails while
> > > resuming
> > > the
> > > device, the device can get stuck in the suspended state.  To fix
> > > this,
> > > multipath needs to resume the device again so that it can
> > > continue
> > > using
> > > the old table.
> > > 
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > >  libmultipath/devmapper.c | 19 ++++++++++++++++++-
> > >  libmultipath/devmapper.h |  1 +
> > >  2 files changed, 19 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> > > index 2c4a13a..69b634b 100644
> > > --- a/libmultipath/devmapper.c
> > > +++ b/libmultipath/devmapper.c
> > > @@ -396,7 +396,13 @@ int dm_addmap_reload(struct multipath *mpp,
> > > char
> > > *params, int flush)
> > >  	if (r)
> > >  		r = dm_simplecmd(DM_DEVICE_RESUME, mpp->alias,
> > > !flush,
> > >  				 1, udev_flags, 0);
> > > -	return r;
> > > +	if (r)
> > > +		return r;
> > > +
> > > +	if (dm_is_suspended(mpp->alias))
> > > +		dm_simplecmd(DM_DEVICE_RESUME, mpp->alias,
> > > !flush,
> > > 1,
> > > +			     udev_flags, 0);
> > > +	return 0;
> > >  }
> > 
> > Why would the second DM_DEVICE_RESUME call succeed if the first one
> > failed?
> 
> Because if the first resume fails, device-mapper rolls back to the
> original table.

Ah. I didn't know that, and I'm likely to forget it again. Would you
mind adding a short comment at that point in the code? 

Otherwise, ACK.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/6] libmultipath: fix suspended devs from failed reloads
  2017-05-16 18:56       ` Martin Wilck
@ 2017-05-17 16:25         ` Benjamin Marzinski
  0 siblings, 0 replies; 16+ messages in thread
From: Benjamin Marzinski @ 2017-05-17 16:25 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Tue, May 16, 2017 at 08:56:40PM +0200, Martin Wilck wrote:
> On Tue, 2017-05-16 at 12:33 -0500, Benjamin Marzinski wrote:
> > On Thu, May 11, 2017 at 10:26:52PM +0200, Martin Wilck wrote:
> > > On Tue, 2017-05-09 at 11:57 -0500, Benjamin Marzinski wrote:
> > > > When multipath reloads a device, it can either fail while loading
> > > > the
> > > > new table or while resuming the device. If it fails while
> > > > resuming
> > > > the
> > > > device, the device can get stuck in the suspended state.  To fix
> > > > this,
> > > > multipath needs to resume the device again so that it can
> > > > continue
> > > > using
> > > > the old table.
> > > > 
> > > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > > ---
> > > >  libmultipath/devmapper.c | 19 ++++++++++++++++++-
> > > >  libmultipath/devmapper.h |  1 +
> > > >  2 files changed, 19 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> > > > index 2c4a13a..69b634b 100644
> > > > --- a/libmultipath/devmapper.c
> > > > +++ b/libmultipath/devmapper.c
> > > > @@ -396,7 +396,13 @@ int dm_addmap_reload(struct multipath *mpp,
> > > > char
> > > > *params, int flush)
> > > >  	if (r)
> > > >  		r = dm_simplecmd(DM_DEVICE_RESUME, mpp->alias,
> > > > !flush,
> > > >  				 1, udev_flags, 0);
> > > > -	return r;
> > > > +	if (r)
> > > > +		return r;
> > > > +
> > > > +	if (dm_is_suspended(mpp->alias))
> > > > +		dm_simplecmd(DM_DEVICE_RESUME, mpp->alias,
> > > > !flush,
> > > > 1,
> > > > +			     udev_flags, 0);
> > > > +	return 0;
> > > >  }
> > > 
> > > Why would the second DM_DEVICE_RESUME call succeed if the first one
> > > failed?
> > 
> > Because if the first resume fails, device-mapper rolls back to the
> > original table.
> 
> Ah. I didn't know that, and I'm likely to forget it again. Would you
> mind adding a short comment at that point in the code? 

Sure.

-Ben

> 
> Otherwise, ACK.
> 
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/6] small multipath fixes
  2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
                   ` (5 preceding siblings ...)
  2017-05-09 16:57 ` [PATCH 6/6] mpath_persist: Don't join threads that don't exist Benjamin Marzinski
@ 2017-05-17 22:32 ` Christophe Varoqui
  6 siblings, 0 replies; 16+ messages in thread
From: Christophe Varoqui @ 2017-05-17 22:32 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: device-mapper development


[-- Attachment #1.1: Type: text/plain, Size: 1003 bytes --]

This set is merged.
Thanks.

On Tue, May 9, 2017 at 6:56 PM, Benjamin Marzinski <bmarzins@redhat.com>
wrote:

> Here are a number of small multipath fixes. These patches are based on
> top of my and Martin's previous patch sets.
>
> Benjamin Marzinski (6):
>   libmultipath: print alias with no_path_retry message
>   multipathd: force reload device on all resizes
>   libmultipath: refactor calls to get dm device info
>   libmultipath: fix suspended devs from failed reloads
>   kpartx: fix device checks
>   mpath_persist: Don't join threads that don't exist
>
>  kpartx/kpartx.c                 |  23 +++--
>  libmpathpersist/mpath_persist.c |  34 +++++---
>  libmpathpersist/mpath_persist.h |   3 +-
>  libmultipath/devmapper.c        | 184 +++++++++++-------------------
> ----------
>  libmultipath/devmapper.h        |   4 +-
>  libmultipath/propsel.c          |   2 +-
>  multipathd/cli_handlers.c       |  25 ++----
>  7 files changed, 100 insertions(+), 175 deletions(-)
>
> --
> 1.8.3.1
>
>

[-- Attachment #1.2: Type: text/html, Size: 1508 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-05-17 22:32 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-09 16:56 [PATCH 0/6] small multipath fixes Benjamin Marzinski
2017-05-09 16:57 ` [PATCH 1/6] libmultipath: print alias with no_path_retry message Benjamin Marzinski
2017-05-09 16:57 ` [PATCH 2/6] multipathd: force reload device on all resizes Benjamin Marzinski
2017-05-09 16:57 ` [PATCH 3/6] libmultipath: refactor calls to get dm device info Benjamin Marzinski
2017-05-11 20:26   ` Martin Wilck
2017-05-12 16:59     ` Benjamin Marzinski
2017-05-09 16:57 ` [PATCH 4/6] libmultipath: fix suspended devs from failed reloads Benjamin Marzinski
2017-05-11 20:26   ` Martin Wilck
2017-05-16 17:33     ` Benjamin Marzinski
2017-05-16 18:56       ` Martin Wilck
2017-05-17 16:25         ` Benjamin Marzinski
2017-05-09 16:57 ` [PATCH 5/6] kpartx: fix device checks Benjamin Marzinski
2017-05-11 20:30   ` Martin Wilck
2017-05-09 16:57 ` [PATCH 6/6] mpath_persist: Don't join threads that don't exist Benjamin Marzinski
2017-05-11 20:33   ` Martin Wilck
2017-05-17 22:32 ` [PATCH 0/6] small multipath fixes Christophe Varoqui

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.