All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] Fix muitpath/multipathd flush issue
@ 2020-06-25 20:42 Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic Benjamin Marzinski
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If a multipath device is removed, and check_path() checks one of its
paths before multipathd processes either the uevent or the dm event from
removing it, multipathd will recreate the removed device. This happens
because check_path() will continute to check the removed device's former
paths until an event arrives removing the device.  A missing multpath
device will cause the update_multipath_strings() call to fail, setting
pp->dmstate to PSTATE_UNDEF.  If the path is up, this dmstate will cause
reinstate_path() to be called, which will also fail, because the
multipath device doesn't exist.  This will trigger a reload, restoring
the recently removed device.

This patchset handles this is two ways. The first two patches directly
fix these issues in check_path(), so that a missing multipath device
will no longer get recreated when checking one of its former paths.

The other 5 patches add a "multipathd del maps" command, and make the
mutipath command delegate flush operations to multipathd so multipathd's
state remains in sync with the kernel's, while doing removes.

Changes from v1:
0001:	This is completely different. I've change how patch 0002 detects
	when a multipath device has disappeared. Instead of do_get_info,
	I need more return values on dm_get_map/status, and the
	update_multipath_* functions. I have used symbolic return
	values, which was the objection to the previous patch. I tried
	to picked generic enough names that they could be used for other
	devmapper.c functions as well.

0002:	At Martin's suggestion, multipath now differentiates between dm
	failures and a missing multipath device in
	update_multipath_strings(). So there is no need to recheck if
	the device is missing.

0006:	Added some man page info that was previously in patch 0007

0007:	Removed mentions of the -D option in the man page and usage
	output.

I did not change anything about suspending the device on remove.  It was
not obvious how to break that up between multipath and multipathd, and
there are likely still some conversations to be had about what we need
to do to remove a device in multipath, and in what order.  So, for now,
I just kept all those functions the same, and just added the code
necessary to make multpathd work with them correctly, as is.

Benjamin Marzinski (7):
  libmultipath: make dm_get_map/status return codes symbolic
  multipathd: fix check_path errors with removed map
  libmultipath: make dm_flush_maps only return 0 on success
  multipathd: add "del maps" multipathd command
  multipath: make flushing maps work like other commands
  multipath: delegate flushing maps to multipathd
  multipath: add option to skip multipathd delegation

 libmultipath/config.h      |  4 ++-
 libmultipath/configure.h   |  3 --
 libmultipath/devmapper.c   | 63 +++++++++++++++++++++++++-------------
 libmultipath/devmapper.h   |  8 ++++-
 libmultipath/structs_vec.c | 45 +++++++++++++++------------
 multipath/main.c           | 44 ++++++++++++++++++--------
 multipath/multipath.8      |  4 +--
 multipathd/cli.c           |  1 +
 multipathd/cli_handlers.c  | 19 ++++++++++++
 multipathd/cli_handlers.h  |  1 +
 multipathd/main.c          | 50 +++++++++++++-----------------
 multipathd/main.h          |  1 +
 12 files changed, 155 insertions(+), 88 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-07-01 20:15   ` Martin Wilck
  2020-06-25 20:42 ` [PATCH v2 2/7] multipathd: fix check_path errors with removed map Benjamin Marzinski
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

dm_get_map() and dm_get_status() now use symbolic return codes. They
also differentiate between failing to get information from device-mapper
and not finding the requested device. These symboilc return codes are
also used by update_multipath_* functions.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/devmapper.c   | 51 +++++++++++++++++++++++++-------------
 libmultipath/devmapper.h   |  6 +++++
 libmultipath/structs_vec.c | 45 +++++++++++++++++++--------------
 multipathd/main.c          | 12 ++++-----
 4 files changed, 72 insertions(+), 42 deletions(-)

diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index 27d52398..f5cfe296 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -534,36 +534,43 @@ int dm_map_present(const char * str)
 
 int dm_get_map(const char *name, unsigned long long *size, char *outparams)
 {
-	int r = 1;
+	int r = DMP_ERR;
 	struct dm_task *dmt;
 	uint64_t start, length;
 	char *target_type = NULL;
 	char *params = NULL;
 
 	if (!(dmt = libmp_dm_task_create(DM_DEVICE_TABLE)))
-		return 1;
+		return r;
 
 	if (!dm_task_set_name(dmt, name))
 		goto out;
 
 	dm_task_no_open_count(dmt);
 
-	if (!dm_task_run(dmt))
+	errno = 0;
+	if (!dm_task_run(dmt)) {
+		if (errno == ENXIO)
+			r = DMP_FAIL;
 		goto out;
+	}
 
+	r = DMP_FAIL;
 	/* Fetch 1st target */
-	dm_get_next_target(dmt, NULL, &start, &length,
-			   &target_type, &params);
+	if (dm_get_next_target(dmt, NULL, &start, &length,
+			       &target_type, &params) != NULL)
+		/* more than one target */
+		goto out;
 
 	if (size)
 		*size = length;
 
 	if (!outparams) {
-		r = 0;
+		r = DMP_PASS;
 		goto out;
 	}
 	if (snprintf(outparams, PARAMS_SIZE, "%s", params) <= PARAMS_SIZE)
-		r = 0;
+		r = DMP_PASS;
 out:
 	dm_task_destroy(dmt);
 	return r;
@@ -637,35 +644,45 @@ is_mpath_part(const char *part_name, const char *map_name)
 
 int dm_get_status(const char *name, char *outstatus)
 {
-	int r = 1;
+	int r = DMP_ERR;
 	struct dm_task *dmt;
 	uint64_t start, length;
 	char *target_type = NULL;
 	char *status = NULL;
 
 	if (!(dmt = libmp_dm_task_create(DM_DEVICE_STATUS)))
-		return 1;
+		return r;
 
 	if (!dm_task_set_name(dmt, name))
 		goto out;
 
 	dm_task_no_open_count(dmt);
 
-	if (!dm_task_run(dmt))
+	errno = 0;
+	if (!dm_task_run(dmt)) {
+		if (errno == ENXIO)
+			r = DMP_FAIL;
 		goto out;
+	}
 
+	r = DMP_FAIL;
 	/* Fetch 1st target */
-	dm_get_next_target(dmt, NULL, &start, &length,
-			   &target_type, &status);
+	if (dm_get_next_target(dmt, NULL, &start, &length,
+			       &target_type, &status) != NULL)
+		goto out;
+
+	if (!target_type || strcmp(target_type, TGT_MPATH) != 0)
+		goto out;
+
 	if (!status) {
 		condlog(2, "get null status.");
 		goto out;
 	}
 
 	if (snprintf(outstatus, PARAMS_SIZE, "%s", status) <= PARAMS_SIZE)
-		r = 0;
+		r = DMP_PASS;
 out:
-	if (r)
+	if (r != DMP_PASS)
 		condlog(0, "%s: error getting map status string", name);
 
 	dm_task_destroy(dmt);
@@ -920,7 +937,7 @@ int _dm_flush_map (const char * mapname, int need_sync, int deferred_remove,
 			return 1;
 
 	if (need_suspend &&
-	    !dm_get_map(mapname, &mapsize, params) &&
+	    dm_get_map(mapname, &mapsize, params) == DMP_PASS &&
 	    strstr(params, "queue_if_no_path")) {
 		if (!dm_queue_if_no_path(mapname, 0))
 			queue_if_no_path = 1;
@@ -1129,7 +1146,7 @@ struct multipath *dm_get_multipath(const char *name)
 	if (!mpp->alias)
 		goto out;
 
-	if (dm_get_map(name, &mpp->size, NULL))
+	if (dm_get_map(name, &mpp->size, NULL) != DMP_PASS)
 		goto out;
 
 	dm_get_uuid(name, mpp->wwid, WWID_SIZE);
@@ -1313,7 +1330,7 @@ do_foreach_partmaps (const char * mapname,
 		    /*
 		     * and we can fetch the map table from the kernel
 		     */
-		    !dm_get_map(names->name, &size, &params[0]) &&
+		    dm_get_map(names->name, &size, &params[0]) == DMP_PASS &&
 
 		    /*
 		     * and the table maps over the multipath map
diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
index 5ed7edc5..5b18bf4b 100644
--- a/libmultipath/devmapper.h
+++ b/libmultipath/devmapper.h
@@ -27,6 +27,12 @@
 #define UUID_PREFIX "mpath-"
 #define UUID_PREFIX_LEN (sizeof(UUID_PREFIX) - 1)
 
+enum {
+	DMP_ERR = -1,
+	DMP_PASS,
+	DMP_FAIL,
+};
+
 void dm_init(int verbosity);
 int dm_prereq(unsigned int *v);
 void skip_libmp_dm_init(void);
diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
index 077f2e42..2225abeb 100644
--- a/libmultipath/structs_vec.c
+++ b/libmultipath/structs_vec.c
@@ -196,43 +196,47 @@ extract_hwe_from_path(struct multipath * mpp)
 int
 update_multipath_table (struct multipath *mpp, vector pathvec, int is_daemon)
 {
+	int r = DMP_ERR;
 	char params[PARAMS_SIZE] = {0};
 
 	if (!mpp)
-		return 1;
+		return r;
 
-	if (dm_get_map(mpp->alias, &mpp->size, params)) {
-		condlog(3, "%s: cannot get map", mpp->alias);
-		return 1;
+	r = dm_get_map(mpp->alias, &mpp->size, params);
+	if (r != DMP_PASS) {
+		condlog(3, "%s: %s", mpp->alias, (r == DMP_ERR)? "error getting table" : "map not present");
+		return r;
 	}
 
 	if (disassemble_map(pathvec, params, mpp, is_daemon)) {
 		condlog(3, "%s: cannot disassemble map", mpp->alias);
-		return 1;
+		return DMP_ERR;
 	}
 
-	return 0;
+	return DMP_PASS;
 }
 
 int
 update_multipath_status (struct multipath *mpp)
 {
+	int r = DMP_ERR;
 	char status[PARAMS_SIZE] = {0};
 
 	if (!mpp)
-		return 1;
+		return r;
 
-	if (dm_get_status(mpp->alias, status)) {
-		condlog(3, "%s: cannot get status", mpp->alias);
-		return 1;
+	r = dm_get_status(mpp->alias, status);
+	if (r != DMP_PASS) {
+		condlog(3, "%s: %s", mpp->alias, (r == DMP_ERR)? "error getting status" : "map not present");
+		return r;
 	}
 
 	if (disassemble_status(status, mpp)) {
 		condlog(3, "%s: cannot disassemble status", mpp->alias);
-		return 1;
+		return DMP_ERR;
 	}
 
-	return 0;
+	return DMP_PASS;
 }
 
 void sync_paths(struct multipath *mpp, vector pathvec)
@@ -264,10 +268,10 @@ int
 update_multipath_strings(struct multipath *mpp, vector pathvec, int is_daemon)
 {
 	struct pathgroup *pgp;
-	int i;
+	int i, r = DMP_ERR;
 
 	if (!mpp)
-		return 1;
+		return r;
 
 	update_mpp_paths(mpp, pathvec);
 	condlog(4, "%s: %s", mpp->alias, __FUNCTION__);
@@ -276,18 +280,21 @@ update_multipath_strings(struct multipath *mpp, vector pathvec, int is_daemon)
 	free_pgvec(mpp->pg, KEEP_PATHS);
 	mpp->pg = NULL;
 
-	if (update_multipath_table(mpp, pathvec, is_daemon))
-		return 1;
+	r = update_multipath_table(mpp, pathvec, is_daemon);
+	if (r != DMP_PASS)
+		return r;
+
 	sync_paths(mpp, pathvec);
 
-	if (update_multipath_status(mpp))
-		return 1;
+	r = update_multipath_status(mpp);
+	if (r != DMP_PASS)
+		return r;
 
 	vector_foreach_slot(mpp->pg, pgp, i)
 		if (pgp->paths)
 			path_group_prio_update(pgp);
 
-	return 0;
+	return DMP_PASS;
 }
 
 static void enter_recovery_mode(struct multipath *mpp)
diff --git a/multipathd/main.c b/multipathd/main.c
index 205ddb32..d73b1b9a 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -418,7 +418,7 @@ int __setup_multipath(struct vectors *vecs, struct multipath *mpp,
 		goto out;
 	}
 
-	if (update_multipath_strings(mpp, vecs->pathvec, 1)) {
+	if (update_multipath_strings(mpp, vecs->pathvec, 1) != DMP_PASS) {
 		condlog(0, "%s: failed to setup multipath", mpp->alias);
 		goto out;
 	}
@@ -557,9 +557,9 @@ add_map_without_path (struct vectors *vecs, const char *alias)
 	mpp->mpe = find_mpe(conf->mptable, mpp->wwid);
 	put_multipath_config(conf);
 
-	if (update_multipath_table(mpp, vecs->pathvec, 1))
+	if (update_multipath_table(mpp, vecs->pathvec, 1) != DMP_PASS)
 		goto out;
-	if (update_multipath_status(mpp))
+	if (update_multipath_status(mpp) != DMP_PASS)
 		goto out;
 
 	if (!vector_alloc_slot(vecs->mpvec))
@@ -1350,8 +1350,8 @@ map_discovery (struct vectors * vecs)
 		return 1;
 
 	vector_foreach_slot (vecs->mpvec, mpp, i)
-		if (update_multipath_table(mpp, vecs->pathvec, 1) ||
-		    update_multipath_status(mpp)) {
+		if (update_multipath_table(mpp, vecs->pathvec, 1) != DMP_PASS ||
+		    update_multipath_status(mpp) != DMP_PASS) {
 			remove_map(mpp, vecs, 1);
 			i--;
 		}
@@ -2091,7 +2091,7 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 	/*
 	 * Synchronize with kernel state
 	 */
-	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1)) {
+	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1) != DMP_PASS) {
 		condlog(1, "%s: Could not synchronize with kernel state",
 			pp->dev);
 		pp->dmstate = PSTATE_UNDEF;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 2/7] multipathd: fix check_path errors with removed map
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-07-01 20:19   ` Martin Wilck
  2020-06-25 20:42 ` [PATCH v2 3/7] libmultipath: make dm_flush_maps only return 0 on success Benjamin Marzinski
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If a multipath device is removed during, or immediately before the call
to check_path(), multipathd can behave incorrectly. A missing multpath
device will cause update_multipath_strings() to fail, setting
pp->dmstate to PSTATE_UNDEF.  If the path is up, this state will cause
reinstate_path() to be called, which will also fail.  This will trigger
a reload, restoring the recently removed device.

If update_multipath_strings() fails because there is no multipath
device, check_path should just quit, since the remove dmevent and uevent
are likely already queued up. Also, I don't see any reason to reload the
multipath device if reinstate fails. This code was added by
fac68d7a99ef17d496079538a5c6836acd7911ab, which clamined that reinstate
could fail if the path was disabled.  Looking through the current kernel
code, I can't see any reason why a reinstate would fail, where a reload
would help. If the path was missing from the multipath device,
update_multipath_strings() would already catch that, and quit
check_path() early, which make more sense to me than reloading does.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 37 ++++++++++++++-----------------------
 1 file changed, 14 insertions(+), 23 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index d73b1b9a..22bc4363 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1615,22 +1615,18 @@ fail_path (struct path * pp, int del_active)
 /*
  * caller must have locked the path list before calling that function
  */
-static int
+static void
 reinstate_path (struct path * pp)
 {
-	int ret = 0;
-
 	if (!pp->mpp)
-		return 0;
+		return;
 
-	if (dm_reinstate_path(pp->mpp->alias, pp->dev_t)) {
+	if (dm_reinstate_path(pp->mpp->alias, pp->dev_t))
 		condlog(0, "%s: reinstate failed", pp->dev_t);
-		ret = 1;
-	} else {
+	else {
 		condlog(2, "%s: reinstated", pp->dev_t);
 		update_queue_mode_add_path(pp->mpp);
 	}
-	return ret;
 }
 
 static void
@@ -2091,9 +2087,13 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 	/*
 	 * Synchronize with kernel state
 	 */
-	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1) != DMP_PASS) {
+	ret = update_multipath_strings(pp->mpp, vecs->pathvec, 1);
+	if (ret != DMP_PASS) {
 		condlog(1, "%s: Could not synchronize with kernel state",
 			pp->dev);
+		if (ret == DMP_FAIL)
+			/* multipath device missing. Likely removed */
+			return 0;
 		pp->dmstate = PSTATE_UNDEF;
 	}
 	/* if update_multipath_strings orphaned the path, quit early */
@@ -2183,12 +2183,8 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 		/*
 		 * reinstate this path
 		 */
-		if (!disable_reinstate && reinstate_path(pp)) {
-			condlog(3, "%s: reload map", pp->dev);
-			ev_add_path(pp, vecs, 1);
-			pp->tick = 1;
-			return 0;
-		}
+		if (!disable_reinstate)
+			reinstate_path(pp);
 		new_path_up = 1;
 
 		if (oldchkrstate != PATH_UP && oldchkrstate != PATH_GHOST)
@@ -2204,15 +2200,10 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 	else if (newstate == PATH_UP || newstate == PATH_GHOST) {
 		if ((pp->dmstate == PSTATE_FAILED ||
 		    pp->dmstate == PSTATE_UNDEF) &&
-		    !disable_reinstate) {
+		    !disable_reinstate)
 			/* Clear IO errors */
-			if (reinstate_path(pp)) {
-				condlog(3, "%s: reload map", pp->dev);
-				ev_add_path(pp, vecs, 1);
-				pp->tick = 1;
-				return 0;
-			}
-		} else {
+			reinstate_path(pp);
+		else {
 			LOG_MSG(4, verbosity, pp);
 			if (pp->checkint != max_checkint) {
 				/*
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 3/7] libmultipath: make dm_flush_maps only return 0 on success
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 2/7] multipathd: fix check_path errors with removed map Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 4/7] multipathd: add "del maps" multipathd command Benjamin Marzinski
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

dm_flush_maps() returned both 0 and 1 on error, depending on which part
of the function it was in, but the caller was always treating 0 as a
success. Make dm_flush_maps() always return 1 on error and 0 on success.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/devmapper.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index f5cfe296..b799634a 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -1007,13 +1007,13 @@ dm_flush_map_nopaths(const char * mapname, int deferred_remove)
 
 int dm_flush_maps (int retries)
 {
-	int r = 0;
+	int r = 1;
 	struct dm_task *dmt;
 	struct dm_names *names;
 	unsigned next = 0;
 
 	if (!(dmt = libmp_dm_task_create (DM_DEVICE_LIST)))
-		return 0;
+		return r;
 
 	dm_task_no_open_count(dmt);
 
@@ -1026,6 +1026,7 @@ int dm_flush_maps (int retries)
 	if (!names->dev)
 		goto out;
 
+	r = 0;
 	do {
 		r |= dm_suspend_and_flush_map(names->name, retries);
 		next = names->next;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 4/7] multipathd: add "del maps" multipathd command
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2020-06-25 20:42 ` [PATCH v2 3/7] libmultipath: make dm_flush_maps only return 0 on success Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-07-01 20:59   ` Martin Wilck
  2020-06-25 20:42 ` [PATCH v2 5/7] multipath: make flushing maps work like other commands Benjamin Marzinski
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This will flush all multipath devices.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/devmapper.c  |  7 +++++--
 libmultipath/devmapper.h  |  2 +-
 multipath/main.c          |  2 +-
 multipathd/cli.c          |  1 +
 multipathd/cli_handlers.c | 19 +++++++++++++++++++
 multipathd/cli_handlers.h |  1 +
 multipathd/main.c         |  3 ++-
 multipathd/main.h         |  1 +
 8 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index b799634a..899c24d2 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -1005,7 +1005,7 @@ dm_flush_map_nopaths(const char * mapname, int deferred_remove)
 
 #endif
 
-int dm_flush_maps (int retries)
+int dm_flush_maps (int need_suspend, int retries)
 {
 	int r = 1;
 	struct dm_task *dmt;
@@ -1028,7 +1028,10 @@ int dm_flush_maps (int retries)
 
 	r = 0;
 	do {
-		r |= dm_suspend_and_flush_map(names->name, retries);
+		if (need_suspend)
+			r |= dm_suspend_and_flush_map(names->name, retries);
+		else
+			r |= dm_flush_map(names->name);
 		next = names->next;
 		names = (void *) names + next;
 	} while (next);
diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
index 5b18bf4b..2c5b1f56 100644
--- a/libmultipath/devmapper.h
+++ b/libmultipath/devmapper.h
@@ -57,7 +57,7 @@ int dm_flush_map_nopaths(const char * mapname, int deferred_remove);
 #define dm_suspend_and_flush_map(mapname, retries) \
 	_dm_flush_map(mapname, 1, 0, 1, retries)
 int dm_cancel_deferred_remove(struct multipath *mpp);
-int dm_flush_maps (int retries);
+int dm_flush_maps (int need_suspend, int retries);
 int dm_fail_path(const char * mapname, char * path);
 int dm_reinstate_path(const char * mapname, char * path);
 int dm_queue_if_no_path(const char *mapname, int enable);
diff --git a/multipath/main.c b/multipath/main.c
index c4740fab..d89f0a91 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -1096,7 +1096,7 @@ main (int argc, char *argv[])
 		goto out;
 	}
 	else if (conf->remove == FLUSH_ALL) {
-		r = dm_flush_maps(retries) ? RTVL_FAIL : RTVL_OK;
+		r = dm_flush_maps(1, retries) ? RTVL_FAIL : RTVL_OK;
 		goto out;
 	}
 	while ((r = configure(conf, cmd, dev_type, dev)) == RTVL_RETRY)
diff --git a/multipathd/cli.c b/multipathd/cli.c
index 800c0fbe..bdc9fb10 100644
--- a/multipathd/cli.c
+++ b/multipathd/cli.c
@@ -568,6 +568,7 @@ cli_init (void) {
 	add_handler(DEL+PATH, NULL);
 	add_handler(ADD+MAP, NULL);
 	add_handler(DEL+MAP, NULL);
+	add_handler(DEL+MAPS, NULL);
 	add_handler(SWITCH+MAP+GROUP, NULL);
 	add_handler(RECONFIGURE, NULL);
 	add_handler(SUSPEND+MAP, NULL);
diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
index 31c3d9fd..782bb003 100644
--- a/multipathd/cli_handlers.c
+++ b/multipathd/cli_handlers.c
@@ -852,6 +852,25 @@ cli_del_map (void * v, char ** reply, int * len, void * data)
 	return rc;
 }
 
+int
+cli_del_maps (void *v, char **reply, int *len, void *data)
+{
+	struct vectors * vecs = (struct vectors *)data;
+	struct multipath *mpp;
+	int i, ret = 0;
+
+	condlog(2, "remove maps (operator)");
+	vector_foreach_slot(vecs->mpvec, mpp, i) {
+		if (flush_map(mpp, vecs, 0))
+			ret++;
+		else
+			i--;
+	}
+	/* flush any multipath maps that aren't currently known by multipathd */
+	ret |= dm_flush_maps(0, 0);
+	return ret;
+}
+
 int
 cli_reload(void *v, char **reply, int *len, void *data)
 {
diff --git a/multipathd/cli_handlers.h b/multipathd/cli_handlers.h
index 0f451064..6f57b429 100644
--- a/multipathd/cli_handlers.h
+++ b/multipathd/cli_handlers.h
@@ -26,6 +26,7 @@ int cli_add_path (void * v, char ** reply, int * len, void * data);
 int cli_del_path (void * v, char ** reply, int * len, void * data);
 int cli_add_map (void * v, char ** reply, int * len, void * data);
 int cli_del_map (void * v, char ** reply, int * len, void * data);
+int cli_del_maps (void * v, char ** reply, int * len, void * data);
 int cli_switch_group(void * v, char ** reply, int * len, void * data);
 int cli_reconfigure(void * v, char ** reply, int * len, void * data);
 int cli_resize(void * v, char ** reply, int * len, void * data);
diff --git a/multipathd/main.c b/multipathd/main.c
index 22bc4363..104dbe7a 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -635,7 +635,7 @@ sync_maps_state(vector mpvec)
 		sync_map_state(mpp);
 }
 
-static int
+int
 flush_map(struct multipath * mpp, struct vectors * vecs, int nopaths)
 {
 	int r;
@@ -1555,6 +1555,7 @@ uxlsnrloop (void * ap)
 	set_handler_callback(DEL+PATH, cli_del_path);
 	set_handler_callback(ADD+MAP, cli_add_map);
 	set_handler_callback(DEL+MAP, cli_del_map);
+	set_handler_callback(DEL+MAPS, cli_del_maps);
 	set_handler_callback(SWITCH+MAP+GROUP, cli_switch_group);
 	set_unlocked_handler_callback(RECONFIGURE, cli_reconfigure);
 	set_handler_callback(SUSPEND+MAP, cli_suspend);
diff --git a/multipathd/main.h b/multipathd/main.h
index 7bb8463f..5dff17e5 100644
--- a/multipathd/main.h
+++ b/multipathd/main.h
@@ -28,6 +28,7 @@ int ev_add_path (struct path *, struct vectors *, int);
 int ev_remove_path (struct path *, struct vectors *, int);
 int ev_add_map (char *, const char *, struct vectors *);
 int ev_remove_map (char *, char *, int, struct vectors *);
+int flush_map(struct multipath *, struct vectors *, int);
 int set_config_state(enum daemon_status);
 void * mpath_alloc_prin_response(int prin_sa);
 int prin_do_scsi_ioctl(char *, int rq_servact, struct prin_resp * resp,
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 5/7] multipath: make flushing maps work like other commands
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2020-06-25 20:42 ` [PATCH v2 4/7] multipathd: add "del maps" multipathd command Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 6/7] multipath: delegate flushing maps to multipathd Benjamin Marzinski
  2020-06-25 20:42 ` [PATCH v2 7/7] multipath: add option to skip multipathd delegation Benjamin Marzinski
  6 siblings, 0 replies; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The config structure doesn't need a special variable just for removes.
Multipath can just use the cmd variable, like it does for the other
commands.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.h    |  3 ++-
 libmultipath/configure.h |  3 ---
 multipath/main.c         | 20 ++++++++++----------
 3 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index ceecff2d..55569360 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -38,6 +38,8 @@ enum mpath_cmds {
 	CMD_ADD_WWID,
 	CMD_USABLE_PATHS,
 	CMD_DUMP_CONFIG,
+	CMD_FLUSH_ONE,
+	CMD_FLUSH_ALL,
 };
 
 enum force_reload_types {
@@ -142,7 +144,6 @@ struct config {
 	unsigned int max_checkint;
 	bool use_watchdog;
 	int pgfailback;
-	int remove;
 	int rr_weight;
 	int no_path_retry;
 	int user_friendly_names;
diff --git a/libmultipath/configure.h b/libmultipath/configure.h
index d7509000..0e33bf40 100644
--- a/libmultipath/configure.h
+++ b/libmultipath/configure.h
@@ -45,9 +45,6 @@ enum {
 	CP_RETRY,
 };
 
-#define FLUSH_ONE 1
-#define FLUSH_ALL 2
-
 struct vectors;
 
 int setup_map (struct multipath * mpp, char * params, int params_size,
diff --git a/multipath/main.c b/multipath/main.c
index d89f0a91..101fd656 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -909,10 +909,10 @@ main (int argc, char *argv[])
 				cmd = CMD_DRY_RUN;
 			break;
 		case 'f':
-			conf->remove = FLUSH_ONE;
+			cmd = CMD_FLUSH_ONE;
 			break;
 		case 'F':
-			conf->remove = FLUSH_ALL;
+			cmd = CMD_FLUSH_ALL;
 			break;
 		case 'l':
 			if (optarg && !strncmp(optarg, "l", 1))
@@ -1053,6 +1053,10 @@ main (int argc, char *argv[])
 		condlog(0, "the -w option requires a device");
 		goto out;
 	}
+	if (cmd == CMD_FLUSH_ONE && dev_type != DEV_DEVMAP) {
+		condlog(0, "the -f option requires a map name to remove");
+		goto out;
+	}
 
 	switch(delegate_to_multipathd(cmd, dev, dev_type, conf)) {
 	case DELEGATE_OK:
@@ -1086,16 +1090,12 @@ main (int argc, char *argv[])
 	}
 	if (retries < 0)
 		retries = conf->remove_retries;
-	if (conf->remove == FLUSH_ONE) {
-		if (dev_type == DEV_DEVMAP) {
-			r = dm_suspend_and_flush_map(dev, retries) ?
-				RTVL_FAIL : RTVL_OK;
-		} else
-			condlog(0, "must provide a map name to remove");
-
+	if (cmd == CMD_FLUSH_ONE) {
+		r = dm_suspend_and_flush_map(dev, retries) ?
+		    RTVL_FAIL : RTVL_OK;
 		goto out;
 	}
-	else if (conf->remove == FLUSH_ALL) {
+	else if (cmd == CMD_FLUSH_ALL) {
 		r = dm_flush_maps(1, retries) ? RTVL_FAIL : RTVL_OK;
 		goto out;
 	}
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 6/7] multipath: delegate flushing maps to multipathd
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
                   ` (4 preceding siblings ...)
  2020-06-25 20:42 ` [PATCH v2 5/7] multipath: make flushing maps work like other commands Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-07-01 21:00   ` Martin Wilck
  2020-06-25 20:42 ` [PATCH v2 7/7] multipath: add option to skip multipathd delegation Benjamin Marzinski
  6 siblings, 1 reply; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Since there can be problems with removing maps outside of multipathd,
multipath should attempt to delegate this command to multipathd.
However, multipathd doesn't attempt to suspend the device, in order
to avoid potential hangs. If delegating to multipathd fails, multipath
should try the remove itself.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipath/main.c      | 14 ++++++++++++++
 multipath/multipath.8 |  4 ++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 101fd656..6a24e483 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -820,6 +820,20 @@ int delegate_to_multipathd(enum mpath_cmds cmd,
 	if (cmd == CMD_CREATE && conf->force_reload == FORCE_RELOAD_YES) {
 		p += snprintf(p, n, "reconfigure");
 	}
+	else if (cmd == CMD_FLUSH_ONE && dev && dev_type == DEV_DEVMAP) {
+		p += snprintf(p, n, "del map %s", dev);
+		/* multipathd doesn't try as hard, to avoid potentially
+		 * hanging. If it fails, retry with the regular multipath
+		 * command */
+		r = NOT_DELEGATED;
+	}
+	else if (cmd == CMD_FLUSH_ALL) {
+		p += snprintf(p, n, "del maps");
+		/* multipathd doesn't try as hard, to avoid potentially
+		 * hanging. If it fails, retry with the regular multipath
+		 * command */
+		r = NOT_DELEGATED;
+	}
 	/* Add other translations here */
 
 	if (strlen(command) == 0)
diff --git a/multipath/multipath.8 b/multipath/multipath.8
index 6fb8645a..5b29a5d9 100644
--- a/multipath/multipath.8
+++ b/multipath/multipath.8
@@ -125,11 +125,11 @@ the system.
 Other operation modes are chosen by using one of the following command line switches:
 .TP
 .B \-f
-Flush (remove) a multipath device map specified as parameter, if unused.
+Flush (remove) a multipath device map specified as parameter, if unused. This operation is delegated to the multipathd daemon if it's running.
 .
 .TP
 .B \-F
-Flush (remove) all unused multipath device maps.
+Flush (remove) all unused multipath device maps. This operation is delegated to the multipathd daemon if it's running.
 .
 .TP
 .B \-l
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v2 7/7] multipath: add option to skip multipathd delegation
  2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
                   ` (5 preceding siblings ...)
  2020-06-25 20:42 ` [PATCH v2 6/7] multipath: delegate flushing maps to multipathd Benjamin Marzinski
@ 2020-06-25 20:42 ` Benjamin Marzinski
  2020-07-01 21:01   ` Martin Wilck
  6 siblings, 1 reply; 15+ messages in thread
From: Benjamin Marzinski @ 2020-06-25 20:42 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Add the -D option to allow users to skip delegating commands to
multipathd.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.h | 1 +
 multipath/main.c      | 8 +++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index 55569360..92c61a0d 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -190,6 +190,7 @@ struct config {
 	int ghost_delay;
 	int find_multipaths_timeout;
 	int marginal_pathgroups;
+	int skip_delegate;
 	unsigned int version[3];
 	unsigned int sequence_nr;
 
diff --git a/multipath/main.c b/multipath/main.c
index 6a24e483..4c43314e 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -817,6 +817,9 @@ int delegate_to_multipathd(enum mpath_cmds cmd,
 	*p = '\0';
 	n = sizeof(command);
 
+	if (conf->skip_delegate)
+		return NOT_DELEGATED;
+
 	if (cmd == CMD_CREATE && conf->force_reload == FORCE_RELOAD_YES) {
 		p += snprintf(p, n, "reconfigure");
 	}
@@ -890,7 +893,7 @@ main (int argc, char *argv[])
 	multipath_conf = conf;
 	conf->retrigger_tries = 0;
 	conf->force_sync = 1;
-	while ((arg = getopt(argc, argv, ":adcChl::eFfM:v:p:b:BrR:itTquUwW")) != EOF ) {
+	while ((arg = getopt(argc, argv, ":adDcChl::eFfM:v:p:b:BrR:itTquUwW")) != EOF ) {
 		switch(arg) {
 		case 1: printf("optarg : %s\n",optarg);
 			break;
@@ -922,6 +925,9 @@ main (int argc, char *argv[])
 			if (cmd == CMD_CREATE)
 				cmd = CMD_DRY_RUN;
 			break;
+		case 'D':
+			conf->skip_delegate = 1;
+			break;
 		case 'f':
 			cmd = CMD_FLUSH_ONE;
 			break;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic
  2020-06-25 20:42 ` [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic Benjamin Marzinski
@ 2020-07-01 20:15   ` Martin Wilck
  2020-07-02 18:33     ` Benjamin Marzinski
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Wilck @ 2020-07-01 20:15 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> dm_get_map() and dm_get_status() now use symbolic return codes. They
> also differentiate between failing to get information from device-
> mapper
> and not finding the requested device. These symboilc return codes are
> also used by update_multipath_* functions.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/devmapper.c   | 51 +++++++++++++++++++++++++-----------
> --
>  libmultipath/devmapper.h   |  6 +++++
>  libmultipath/structs_vec.c | 45 +++++++++++++++++++--------------
>  multipathd/main.c          | 12 ++++-----
>  4 files changed, 72 insertions(+), 42 deletions(-)
> 
> diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> index 27d52398..f5cfe296 100644
> --- a/libmultipath/devmapper.c
> +++ b/libmultipath/devmapper.c
> @@ -534,36 +534,43 @@ int dm_map_present(const char * str)
>  
>  int dm_get_map(const char *name, unsigned long long *size, char
> *outparams)
>  {
> -	int r = 1;
> +	int r = DMP_ERR;
>  	struct dm_task *dmt;
>  	uint64_t start, length;
>  	char *target_type = NULL;
>  	char *params = NULL;
>  
>  	if (!(dmt = libmp_dm_task_create(DM_DEVICE_TABLE)))
> -		return 1;
> +		return r;
>  
>  	if (!dm_task_set_name(dmt, name))
>  		goto out;
>  
>  	dm_task_no_open_count(dmt);
>  
> -	if (!dm_task_run(dmt))
> +	errno = 0;
> +	if (!dm_task_run(dmt)) {
> +		if (errno == ENXIO)

Don't you have to use dm_task_get_errno(dmt) here?
errno might have been overwritten... if you are certain you don't need
it, a comment explaining it would be helpful.


> +			r = DMP_FAIL;

You've created generic names, which is good, but these are perhaps a
bit too generic. What's the difference bewteen DMP_FAIL and DMP_ERR? I
can derive it from your code, but it's not obvious from the retcode
names, and thus doesn't help the reader much. I suggest to keep DMT_ERR
to denote a "generic" error condition, and use e.g. DMP_NOTFOUND for
the other case.

>  		goto out;
> +	}
>  
> +	r = DMP_FAIL;
>  	/* Fetch 1st target */
> -	dm_get_next_target(dmt, NULL, &start, &length,
> -			   &target_type, &params);
> +	if (dm_get_next_target(dmt, NULL, &start, &length,
> +			       &target_type, &params) != NULL)
> +		/* more than one target */
> +		goto out;
>  
>  	if (size)
>  		*size = length;
>  
>  	if (!outparams) {
> -		r = 0;
> +		r = DMP_PASS;

Nit: DMP_OK or DMP_SUCCESS would be more conventional for successful
completion. "pass" sounds like something more specific to me,
like having passed a test or a filter.

>  		goto out;
>  	}
>  	if (snprintf(outparams, PARAMS_SIZE, "%s", params) <=
> PARAMS_SIZE)
> -		r = 0;
> +		r = DMP_PASS;
>  out:
>  	dm_task_destroy(dmt);
>  	return r;
> @@ -637,35 +644,45 @@ is_mpath_part(const char *part_name, const char
> *map_name)
>  
>  int dm_get_status(const char *name, char *outstatus)
>  {
> -	int r = 1;
> +	int r = DMP_ERR;
>  	struct dm_task *dmt;
>  	uint64_t start, length;
>  	char *target_type = NULL;
>  	char *status = NULL;
>  
>  	if (!(dmt = libmp_dm_task_create(DM_DEVICE_STATUS)))
> -		return 1;
> +		return r;
>  
>  	if (!dm_task_set_name(dmt, name))
>  		goto out;
>  
>  	dm_task_no_open_count(dmt);
>  
> -	if (!dm_task_run(dmt))
> +	errno = 0;
> +	if (!dm_task_run(dmt)) {
> +		if (errno == ENXIO)
> +			r = DMP_FAIL;
>  		goto out;
> +	}

see above

>  
> +	r = DMP_FAIL;
>  	/* Fetch 1st target */
> -	dm_get_next_target(dmt, NULL, &start, &length,
> -			   &target_type, &status);
> +	if (dm_get_next_target(dmt, NULL, &start, &length,
> +			       &target_type, &status) != NULL)
> +		goto out;
> +
> +	if (!target_type || strcmp(target_type, TGT_MPATH) != 0)
> +		goto out;
> +
>  	if (!status) {
>  		condlog(2, "get null status.");
>  		goto out;
>  	}
>  
>  	if (snprintf(outstatus, PARAMS_SIZE, "%s", status) <=
> PARAMS_SIZE)
> -		r = 0;
> +		r = DMP_PASS;
>  out:
> -	if (r)
> +	if (r != DMP_PASS)
>  		condlog(0, "%s: error getting map status string",
> name);
>  
>  	dm_task_destroy(dmt);
> @@ -920,7 +937,7 @@ int _dm_flush_map (const char * mapname, int
> need_sync, int deferred_remove,
>  			return 1;
>  
>  	if (need_suspend &&
> -	    !dm_get_map(mapname, &mapsize, params) &&
> +	    dm_get_map(mapname, &mapsize, params) == DMP_PASS &&
>  	    strstr(params, "queue_if_no_path")) {
>  		if (!dm_queue_if_no_path(mapname, 0))
>  			queue_if_no_path = 1;
> @@ -1129,7 +1146,7 @@ struct multipath *dm_get_multipath(const char
> *name)
>  	if (!mpp->alias)
>  		goto out;
>  
> -	if (dm_get_map(name, &mpp->size, NULL))
> +	if (dm_get_map(name, &mpp->size, NULL) != DMP_PASS)
>  		goto out;
>  
>  	dm_get_uuid(name, mpp->wwid, WWID_SIZE);
> @@ -1313,7 +1330,7 @@ do_foreach_partmaps (const char * mapname,
>  		    /*
>  		     * and we can fetch the map table from the kernel
>  		     */
> -		    !dm_get_map(names->name, &size, &params[0]) &&
> +		    dm_get_map(names->name, &size, &params[0]) ==
> DMP_PASS &&
>  
>  		    /*
>  		     * and the table maps over the multipath map
> diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
> index 5ed7edc5..5b18bf4b 100644
> --- a/libmultipath/devmapper.h
> +++ b/libmultipath/devmapper.h
> @@ -27,6 +27,12 @@
>  #define UUID_PREFIX "mpath-"
>  #define UUID_PREFIX_LEN (sizeof(UUID_PREFIX) - 1)
>  
> +enum {
> +	DMP_ERR = -1,
> +	DMP_PASS,
> +	DMP_FAIL,
> +};
> +

Nit: Why use both negative and positive numbers? It's not important,
but I feel that unless we really want to treat DM_ERR in a very special
way, it would be better to use only positive values. (Actually, if we
go for some generalized retcode convention some day, we might save
negative return values for standard libc errno values such
as -ENOENT and use positive values for or own specific ones).

(We can change the numeric values later of course).

Cheers,
Martin

>  void dm_init(int verbosity);
>  int dm_prereq(unsigned int *v);
>  void skip_libmp_dm_init(void);
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index 077f2e42..2225abeb 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -196,43 +196,47 @@ extract_hwe_from_path(struct multipath * mpp)
>  int
>  update_multipath_table (struct multipath *mpp, vector pathvec, int
> is_daemon)
>  {
> +	int r = DMP_ERR;
>  	char params[PARAMS_SIZE] = {0};
>  
>  	if (!mpp)
> -		return 1;
> +		return r;
>  
> -	if (dm_get_map(mpp->alias, &mpp->size, params)) {
> -		condlog(3, "%s: cannot get map", mpp->alias);
> -		return 1;
> +	r = dm_get_map(mpp->alias, &mpp->size, params);
> +	if (r != DMP_PASS) {
> +		condlog(3, "%s: %s", mpp->alias, (r == DMP_ERR)? "error
> getting table" : "map not present");
> +		return r;
>  	}
>  
>  	if (disassemble_map(pathvec, params, mpp, is_daemon)) {
>  		condlog(3, "%s: cannot disassemble map", mpp->alias);
> -		return 1;
> +		return DMP_ERR;
>  	}
>  
> -	return 0;
> +	return DMP_PASS;
>  }
>  
>  int
>  update_multipath_status (struct multipath *mpp)
>  {
> +	int r = DMP_ERR;
>  	char status[PARAMS_SIZE] = {0};
>  
>  	if (!mpp)
> -		return 1;
> +		return r;
>  
> -	if (dm_get_status(mpp->alias, status)) {
> -		condlog(3, "%s: cannot get status", mpp->alias);
> -		return 1;
> +	r = dm_get_status(mpp->alias, status);
> +	if (r != DMP_PASS) {
> +		condlog(3, "%s: %s", mpp->alias, (r == DMP_ERR)? "error
> getting status" : "map not present");
> +		return r;
>  	}
>  
>  	if (disassemble_status(status, mpp)) {
>  		condlog(3, "%s: cannot disassemble status", mpp-
> >alias);
> -		return 1;
> +		return DMP_ERR;
>  	}
>  
> -	return 0;
> +	return DMP_PASS;
>  }
>  
>  void sync_paths(struct multipath *mpp, vector pathvec)
> @@ -264,10 +268,10 @@ int
>  update_multipath_strings(struct multipath *mpp, vector pathvec, int
> is_daemon)
>  {
>  	struct pathgroup *pgp;
> -	int i;
> +	int i, r = DMP_ERR;
>  
>  	if (!mpp)
> -		return 1;
> +		return r;
>  
>  	update_mpp_paths(mpp, pathvec);
>  	condlog(4, "%s: %s", mpp->alias, __FUNCTION__);
> @@ -276,18 +280,21 @@ update_multipath_strings(struct multipath *mpp,
> vector pathvec, int is_daemon)
>  	free_pgvec(mpp->pg, KEEP_PATHS);
>  	mpp->pg = NULL;
>  
> -	if (update_multipath_table(mpp, pathvec, is_daemon))
> -		return 1;
> +	r = update_multipath_table(mpp, pathvec, is_daemon);
> +	if (r != DMP_PASS)
> +		return r;
> +
>  	sync_paths(mpp, pathvec);
>  
> -	if (update_multipath_status(mpp))
> -		return 1;
> +	r = update_multipath_status(mpp);
> +	if (r != DMP_PASS)
> +		return r;
>  
>  	vector_foreach_slot(mpp->pg, pgp, i)
>  		if (pgp->paths)
>  			path_group_prio_update(pgp);
>  
> -	return 0;
> +	return DMP_PASS;
>  }
>  
>  static void enter_recovery_mode(struct multipath *mpp)
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 205ddb32..d73b1b9a 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -418,7 +418,7 @@ int __setup_multipath(struct vectors *vecs,
> struct multipath *mpp,
>  		goto out;
>  	}
>  
> -	if (update_multipath_strings(mpp, vecs->pathvec, 1)) {
> +	if (update_multipath_strings(mpp, vecs->pathvec, 1) !=
> DMP_PASS) {
>  		condlog(0, "%s: failed to setup multipath", mpp-
> >alias);
>  		goto out;
>  	}
> @@ -557,9 +557,9 @@ add_map_without_path (struct vectors *vecs, const
> char *alias)
>  	mpp->mpe = find_mpe(conf->mptable, mpp->wwid);
>  	put_multipath_config(conf);
>  
> -	if (update_multipath_table(mpp, vecs->pathvec, 1))
> +	if (update_multipath_table(mpp, vecs->pathvec, 1) != DMP_PASS)
>  		goto out;
> -	if (update_multipath_status(mpp))
> +	if (update_multipath_status(mpp) != DMP_PASS)
>  		goto out;
>  
>  	if (!vector_alloc_slot(vecs->mpvec))
> @@ -1350,8 +1350,8 @@ map_discovery (struct vectors * vecs)
>  		return 1;
>  
>  	vector_foreach_slot (vecs->mpvec, mpp, i)
> -		if (update_multipath_table(mpp, vecs->pathvec, 1) ||
> -		    update_multipath_status(mpp)) {
> +		if (update_multipath_table(mpp, vecs->pathvec, 1) !=
> DMP_PASS ||
> +		    update_multipath_status(mpp) != DMP_PASS) {
>  			remove_map(mpp, vecs, 1);
>  			i--;
>  		}
> @@ -2091,7 +2091,7 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>  	/*
>  	 * Synchronize with kernel state
>  	 */
> -	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1)) {
> +	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1) !=
> DMP_PASS) {
>  		condlog(1, "%s: Could not synchronize with kernel
> state",
>  			pp->dev);
>  		pp->dmstate = PSTATE_UNDEF;

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/7] multipathd: fix check_path errors with removed map
  2020-06-25 20:42 ` [PATCH v2 2/7] multipathd: fix check_path errors with removed map Benjamin Marzinski
@ 2020-07-01 20:19   ` Martin Wilck
  2020-07-02 18:34     ` Benjamin Marzinski
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Wilck @ 2020-07-01 20:19 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> If a multipath device is removed during, or immediately before the
> call
> to check_path(), multipathd can behave incorrectly. A missing
> multpath
> device will cause update_multipath_strings() to fail, setting
> pp->dmstate to PSTATE_UNDEF.  If the path is up, this state will
> cause
> reinstate_path() to be called, which will also fail.  This will
> trigger
> a reload, restoring the recently removed device.
> 
> If update_multipath_strings() fails because there is no multipath
> device, check_path should just quit, since the remove dmevent and
> uevent
> are likely already queued up. Also, I don't see any reason to reload
> the
> multipath device if reinstate fails. This code was added by
> fac68d7a99ef17d496079538a5c6836acd7911ab, which clamined that
> reinstate
> could fail if the path was disabled.  Looking through the current
> kernel
> code, I can't see any reason why a reinstate would fail, where a
> reload
> would help. If the path was missing from the multipath device,
> update_multipath_strings() would already catch that, and quit
> check_path() early, which make more sense to me than reloading does.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  multipathd/main.c | 37 ++++++++++++++-----------------------
>  1 file changed, 14 insertions(+), 23 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index d73b1b9a..22bc4363 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1615,22 +1615,18 @@ fail_path (struct path * pp, int del_active)
>  /*
>   * caller must have locked the path list before calling that
> function
>   */
> -static int
> +static void
>  reinstate_path (struct path * pp)
>  {
> -	int ret = 0;
> -
>  	if (!pp->mpp)
> -		return 0;
> +		return;
>  
> -	if (dm_reinstate_path(pp->mpp->alias, pp->dev_t)) {
> +	if (dm_reinstate_path(pp->mpp->alias, pp->dev_t))
>  		condlog(0, "%s: reinstate failed", pp->dev_t);
> -		ret = 1;
> -	} else {
> +	else {
>  		condlog(2, "%s: reinstated", pp->dev_t);
>  		update_queue_mode_add_path(pp->mpp);
>  	}
> -	return ret;
>  }
>  
>  static void
> @@ -2091,9 +2087,13 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>  	/*
>  	 * Synchronize with kernel state
>  	 */
> -	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1) !=
> DMP_PASS) {
> +	ret = update_multipath_strings(pp->mpp, vecs->pathvec, 1);
> +	if (ret != DMP_PASS) {
>  		condlog(1, "%s: Could not synchronize with kernel
> state",
>  			pp->dev);

We could do even better here by printing different log messages
in the two cases we differentiate.

Apart from that, ACK.

> +		if (ret == DMP_FAIL)
> +			/* multipath device missing. Likely removed */
> +			return 0;
>  		pp->dmstate = PSTATE_UNDEF;
>  	}
>  	/* if update_multipath_strings orphaned the path, quit early */
> @@ -2183,12 +2183,8 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>  		/*
>  		 * reinstate this path
>  		 */
> -		if (!disable_reinstate && reinstate_path(pp)) {
> -			condlog(3, "%s: reload map", pp->dev);
> -			ev_add_path(pp, vecs, 1);
> -			pp->tick = 1;
> -			return 0;
> -		}
> +		if (!disable_reinstate)
> +			reinstate_path(pp);
>  		new_path_up = 1;
>  
>  		if (oldchkrstate != PATH_UP && oldchkrstate !=
> PATH_GHOST)
> @@ -2204,15 +2200,10 @@ check_path (struct vectors * vecs, struct
> path * pp, unsigned int ticks)
>  	else if (newstate == PATH_UP || newstate == PATH_GHOST) {
>  		if ((pp->dmstate == PSTATE_FAILED ||
>  		    pp->dmstate == PSTATE_UNDEF) &&
> -		    !disable_reinstate) {
> +		    !disable_reinstate)
>  			/* Clear IO errors */
> -			if (reinstate_path(pp)) {
> -				condlog(3, "%s: reload map", pp->dev);
> -				ev_add_path(pp, vecs, 1);
> -				pp->tick = 1;
> -				return 0;
> -			}
> -		} else {
> +			reinstate_path(pp);
> +		else {
>  			LOG_MSG(4, verbosity, pp);
>  			if (pp->checkint != max_checkint) {
>  				/*

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 4/7] multipathd: add "del maps" multipathd command
  2020-06-25 20:42 ` [PATCH v2 4/7] multipathd: add "del maps" multipathd command Benjamin Marzinski
@ 2020-07-01 20:59   ` Martin Wilck
  0 siblings, 0 replies; 15+ messages in thread
From: Martin Wilck @ 2020-07-01 20:59 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> This will flush all multipath devices.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 6/7] multipath: delegate flushing maps to multipathd
  2020-06-25 20:42 ` [PATCH v2 6/7] multipath: delegate flushing maps to multipathd Benjamin Marzinski
@ 2020-07-01 21:00   ` Martin Wilck
  0 siblings, 0 replies; 15+ messages in thread
From: Martin Wilck @ 2020-07-01 21:00 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> Since there can be problems with removing maps outside of multipathd,
> multipath should attempt to delegate this command to multipathd.
> However, multipathd doesn't attempt to suspend the device, in order
> to avoid potential hangs. If delegating to multipathd fails,
> multipath
> should try the remove itself.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 7/7] multipath: add option to skip multipathd delegation
  2020-06-25 20:42 ` [PATCH v2 7/7] multipath: add option to skip multipathd delegation Benjamin Marzinski
@ 2020-07-01 21:01   ` Martin Wilck
  0 siblings, 0 replies; 15+ messages in thread
From: Martin Wilck @ 2020-07-01 21:01 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> Add the -D option to allow users to skip delegating commands to
> multipathd.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>

> ---
>  libmultipath/config.h | 1 +
>  multipath/main.c      | 8 +++++++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic
  2020-07-01 20:15   ` Martin Wilck
@ 2020-07-02 18:33     ` Benjamin Marzinski
  0 siblings, 0 replies; 15+ messages in thread
From: Benjamin Marzinski @ 2020-07-02 18:33 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, Jul 01, 2020 at 08:15:49PM +0000, Martin Wilck wrote:
> On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> > dm_get_map() and dm_get_status() now use symbolic return codes. They
> > also differentiate between failing to get information from device-
> > mapper
> > and not finding the requested device. These symboilc return codes are
> > also used by update_multipath_* functions.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/devmapper.c   | 51 +++++++++++++++++++++++++-----------
> > --
> >  libmultipath/devmapper.h   |  6 +++++
> >  libmultipath/structs_vec.c | 45 +++++++++++++++++++--------------
> >  multipathd/main.c          | 12 ++++-----
> >  4 files changed, 72 insertions(+), 42 deletions(-)
> > 
> > diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> > index 27d52398..f5cfe296 100644
> > --- a/libmultipath/devmapper.c
> > +++ b/libmultipath/devmapper.c
> > @@ -534,36 +534,43 @@ int dm_map_present(const char * str)
> >  
> >  int dm_get_map(const char *name, unsigned long long *size, char
> > *outparams)
> >  {
> > -	int r = 1;
> > +	int r = DMP_ERR;
> >  	struct dm_task *dmt;
> >  	uint64_t start, length;
> >  	char *target_type = NULL;
> >  	char *params = NULL;
> >  
> >  	if (!(dmt = libmp_dm_task_create(DM_DEVICE_TABLE)))
> > -		return 1;
> > +		return r;
> >  
> >  	if (!dm_task_set_name(dmt, name))
> >  		goto out;
> >  
> >  	dm_task_no_open_count(dmt);
> >  
> > -	if (!dm_task_run(dmt))
> > +	errno = 0;
> > +	if (!dm_task_run(dmt)) {
> > +		if (errno == ENXIO)
> 
> Don't you have to use dm_task_get_errno(dmt) here?
> errno might have been overwritten... if you are certain you don't need
> it, a comment explaining it would be helpful.

Err.. I didn't know that existed. Sure I can use it, and your other
suggestions are reasonable as well

-Ben

> 
> 
> > +			r = DMP_FAIL;
> 
> You've created generic names, which is good, but these are perhaps a
> bit too generic. What's the difference bewteen DMP_FAIL and DMP_ERR? I
> can derive it from your code, but it's not obvious from the retcode
> names, and thus doesn't help the reader much. I suggest to keep DMT_ERR
> to denote a "generic" error condition, and use e.g. DMP_NOTFOUND for
> the other case.
> 
> >  		goto out;
> > +	}
> >  
> > +	r = DMP_FAIL;
> >  	/* Fetch 1st target */
> > -	dm_get_next_target(dmt, NULL, &start, &length,
> > -			   &target_type, &params);
> > +	if (dm_get_next_target(dmt, NULL, &start, &length,
> > +			       &target_type, &params) != NULL)
> > +		/* more than one target */
> > +		goto out;
> >  
> >  	if (size)
> >  		*size = length;
> >  
> >  	if (!outparams) {
> > -		r = 0;
> > +		r = DMP_PASS;
> 
> Nit: DMP_OK or DMP_SUCCESS would be more conventional for successful
> completion. "pass" sounds like something more specific to me,
> like having passed a test or a filter.
> 
> >  		goto out;
> >  	}
> >  	if (snprintf(outparams, PARAMS_SIZE, "%s", params) <=
> > PARAMS_SIZE)
> > -		r = 0;
> > +		r = DMP_PASS;
> >  out:
> >  	dm_task_destroy(dmt);
> >  	return r;
> > @@ -637,35 +644,45 @@ is_mpath_part(const char *part_name, const char
> > *map_name)
> >  
> >  int dm_get_status(const char *name, char *outstatus)
> >  {
> > -	int r = 1;
> > +	int r = DMP_ERR;
> >  	struct dm_task *dmt;
> >  	uint64_t start, length;
> >  	char *target_type = NULL;
> >  	char *status = NULL;
> >  
> >  	if (!(dmt = libmp_dm_task_create(DM_DEVICE_STATUS)))
> > -		return 1;
> > +		return r;
> >  
> >  	if (!dm_task_set_name(dmt, name))
> >  		goto out;
> >  
> >  	dm_task_no_open_count(dmt);
> >  
> > -	if (!dm_task_run(dmt))
> > +	errno = 0;
> > +	if (!dm_task_run(dmt)) {
> > +		if (errno == ENXIO)
> > +			r = DMP_FAIL;
> >  		goto out;
> > +	}
> 
> see above
> 
> >  
> > +	r = DMP_FAIL;
> >  	/* Fetch 1st target */
> > -	dm_get_next_target(dmt, NULL, &start, &length,
> > -			   &target_type, &status);
> > +	if (dm_get_next_target(dmt, NULL, &start, &length,
> > +			       &target_type, &status) != NULL)
> > +		goto out;
> > +
> > +	if (!target_type || strcmp(target_type, TGT_MPATH) != 0)
> > +		goto out;
> > +
> >  	if (!status) {
> >  		condlog(2, "get null status.");
> >  		goto out;
> >  	}
> >  
> >  	if (snprintf(outstatus, PARAMS_SIZE, "%s", status) <=
> > PARAMS_SIZE)
> > -		r = 0;
> > +		r = DMP_PASS;
> >  out:
> > -	if (r)
> > +	if (r != DMP_PASS)
> >  		condlog(0, "%s: error getting map status string",
> > name);
> >  
> >  	dm_task_destroy(dmt);
> > @@ -920,7 +937,7 @@ int _dm_flush_map (const char * mapname, int
> > need_sync, int deferred_remove,
> >  			return 1;
> >  
> >  	if (need_suspend &&
> > -	    !dm_get_map(mapname, &mapsize, params) &&
> > +	    dm_get_map(mapname, &mapsize, params) == DMP_PASS &&
> >  	    strstr(params, "queue_if_no_path")) {
> >  		if (!dm_queue_if_no_path(mapname, 0))
> >  			queue_if_no_path = 1;
> > @@ -1129,7 +1146,7 @@ struct multipath *dm_get_multipath(const char
> > *name)
> >  	if (!mpp->alias)
> >  		goto out;
> >  
> > -	if (dm_get_map(name, &mpp->size, NULL))
> > +	if (dm_get_map(name, &mpp->size, NULL) != DMP_PASS)
> >  		goto out;
> >  
> >  	dm_get_uuid(name, mpp->wwid, WWID_SIZE);
> > @@ -1313,7 +1330,7 @@ do_foreach_partmaps (const char * mapname,
> >  		    /*
> >  		     * and we can fetch the map table from the kernel
> >  		     */
> > -		    !dm_get_map(names->name, &size, &params[0]) &&
> > +		    dm_get_map(names->name, &size, &params[0]) ==
> > DMP_PASS &&
> >  
> >  		    /*
> >  		     * and the table maps over the multipath map
> > diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
> > index 5ed7edc5..5b18bf4b 100644
> > --- a/libmultipath/devmapper.h
> > +++ b/libmultipath/devmapper.h
> > @@ -27,6 +27,12 @@
> >  #define UUID_PREFIX "mpath-"
> >  #define UUID_PREFIX_LEN (sizeof(UUID_PREFIX) - 1)
> >  
> > +enum {
> > +	DMP_ERR = -1,
> > +	DMP_PASS,
> > +	DMP_FAIL,
> > +};
> > +
> 
> Nit: Why use both negative and positive numbers? It's not important,
> but I feel that unless we really want to treat DM_ERR in a very special
> way, it would be better to use only positive values. (Actually, if we
> go for some generalized retcode convention some day, we might save
> negative return values for standard libc errno values such
> as -ENOENT and use positive values for or own specific ones).
> 
> (We can change the numeric values later of course).
> 
> Cheers,
> Martin
> 
> >  void dm_init(int verbosity);
> >  int dm_prereq(unsigned int *v);
> >  void skip_libmp_dm_init(void);
> > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> > index 077f2e42..2225abeb 100644
> > --- a/libmultipath/structs_vec.c
> > +++ b/libmultipath/structs_vec.c
> > @@ -196,43 +196,47 @@ extract_hwe_from_path(struct multipath * mpp)
> >  int
> >  update_multipath_table (struct multipath *mpp, vector pathvec, int
> > is_daemon)
> >  {
> > +	int r = DMP_ERR;
> >  	char params[PARAMS_SIZE] = {0};
> >  
> >  	if (!mpp)
> > -		return 1;
> > +		return r;
> >  
> > -	if (dm_get_map(mpp->alias, &mpp->size, params)) {
> > -		condlog(3, "%s: cannot get map", mpp->alias);
> > -		return 1;
> > +	r = dm_get_map(mpp->alias, &mpp->size, params);
> > +	if (r != DMP_PASS) {
> > +		condlog(3, "%s: %s", mpp->alias, (r == DMP_ERR)? "error
> > getting table" : "map not present");
> > +		return r;
> >  	}
> >  
> >  	if (disassemble_map(pathvec, params, mpp, is_daemon)) {
> >  		condlog(3, "%s: cannot disassemble map", mpp->alias);
> > -		return 1;
> > +		return DMP_ERR;
> >  	}
> >  
> > -	return 0;
> > +	return DMP_PASS;
> >  }
> >  
> >  int
> >  update_multipath_status (struct multipath *mpp)
> >  {
> > +	int r = DMP_ERR;
> >  	char status[PARAMS_SIZE] = {0};
> >  
> >  	if (!mpp)
> > -		return 1;
> > +		return r;
> >  
> > -	if (dm_get_status(mpp->alias, status)) {
> > -		condlog(3, "%s: cannot get status", mpp->alias);
> > -		return 1;
> > +	r = dm_get_status(mpp->alias, status);
> > +	if (r != DMP_PASS) {
> > +		condlog(3, "%s: %s", mpp->alias, (r == DMP_ERR)? "error
> > getting status" : "map not present");
> > +		return r;
> >  	}
> >  
> >  	if (disassemble_status(status, mpp)) {
> >  		condlog(3, "%s: cannot disassemble status", mpp-
> > >alias);
> > -		return 1;
> > +		return DMP_ERR;
> >  	}
> >  
> > -	return 0;
> > +	return DMP_PASS;
> >  }
> >  
> >  void sync_paths(struct multipath *mpp, vector pathvec)
> > @@ -264,10 +268,10 @@ int
> >  update_multipath_strings(struct multipath *mpp, vector pathvec, int
> > is_daemon)
> >  {
> >  	struct pathgroup *pgp;
> > -	int i;
> > +	int i, r = DMP_ERR;
> >  
> >  	if (!mpp)
> > -		return 1;
> > +		return r;
> >  
> >  	update_mpp_paths(mpp, pathvec);
> >  	condlog(4, "%s: %s", mpp->alias, __FUNCTION__);
> > @@ -276,18 +280,21 @@ update_multipath_strings(struct multipath *mpp,
> > vector pathvec, int is_daemon)
> >  	free_pgvec(mpp->pg, KEEP_PATHS);
> >  	mpp->pg = NULL;
> >  
> > -	if (update_multipath_table(mpp, pathvec, is_daemon))
> > -		return 1;
> > +	r = update_multipath_table(mpp, pathvec, is_daemon);
> > +	if (r != DMP_PASS)
> > +		return r;
> > +
> >  	sync_paths(mpp, pathvec);
> >  
> > -	if (update_multipath_status(mpp))
> > -		return 1;
> > +	r = update_multipath_status(mpp);
> > +	if (r != DMP_PASS)
> > +		return r;
> >  
> >  	vector_foreach_slot(mpp->pg, pgp, i)
> >  		if (pgp->paths)
> >  			path_group_prio_update(pgp);
> >  
> > -	return 0;
> > +	return DMP_PASS;
> >  }
> >  
> >  static void enter_recovery_mode(struct multipath *mpp)
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 205ddb32..d73b1b9a 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -418,7 +418,7 @@ int __setup_multipath(struct vectors *vecs,
> > struct multipath *mpp,
> >  		goto out;
> >  	}
> >  
> > -	if (update_multipath_strings(mpp, vecs->pathvec, 1)) {
> > +	if (update_multipath_strings(mpp, vecs->pathvec, 1) !=
> > DMP_PASS) {
> >  		condlog(0, "%s: failed to setup multipath", mpp-
> > >alias);
> >  		goto out;
> >  	}
> > @@ -557,9 +557,9 @@ add_map_without_path (struct vectors *vecs, const
> > char *alias)
> >  	mpp->mpe = find_mpe(conf->mptable, mpp->wwid);
> >  	put_multipath_config(conf);
> >  
> > -	if (update_multipath_table(mpp, vecs->pathvec, 1))
> > +	if (update_multipath_table(mpp, vecs->pathvec, 1) != DMP_PASS)
> >  		goto out;
> > -	if (update_multipath_status(mpp))
> > +	if (update_multipath_status(mpp) != DMP_PASS)
> >  		goto out;
> >  
> >  	if (!vector_alloc_slot(vecs->mpvec))
> > @@ -1350,8 +1350,8 @@ map_discovery (struct vectors * vecs)
> >  		return 1;
> >  
> >  	vector_foreach_slot (vecs->mpvec, mpp, i)
> > -		if (update_multipath_table(mpp, vecs->pathvec, 1) ||
> > -		    update_multipath_status(mpp)) {
> > +		if (update_multipath_table(mpp, vecs->pathvec, 1) !=
> > DMP_PASS ||
> > +		    update_multipath_status(mpp) != DMP_PASS) {
> >  			remove_map(mpp, vecs, 1);
> >  			i--;
> >  		}
> > @@ -2091,7 +2091,7 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >  	/*
> >  	 * Synchronize with kernel state
> >  	 */
> > -	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1)) {
> > +	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1) !=
> > DMP_PASS) {
> >  		condlog(1, "%s: Could not synchronize with kernel
> > state",
> >  			pp->dev);
> >  		pp->dmstate = PSTATE_UNDEF;
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2 2/7] multipathd: fix check_path errors with removed map
  2020-07-01 20:19   ` Martin Wilck
@ 2020-07-02 18:34     ` Benjamin Marzinski
  0 siblings, 0 replies; 15+ messages in thread
From: Benjamin Marzinski @ 2020-07-02 18:34 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, Jul 01, 2020 at 08:19:57PM +0000, Martin Wilck wrote:
> On Thu, 2020-06-25 at 15:42 -0500, Benjamin Marzinski wrote:
> > If a multipath device is removed during, or immediately before the
> > call
> > to check_path(), multipathd can behave incorrectly. A missing
> > multpath
> > device will cause update_multipath_strings() to fail, setting
> > pp->dmstate to PSTATE_UNDEF.  If the path is up, this state will
> > cause
> > reinstate_path() to be called, which will also fail.  This will
> > trigger
> > a reload, restoring the recently removed device.
> > 
> > If update_multipath_strings() fails because there is no multipath
> > device, check_path should just quit, since the remove dmevent and
> > uevent
> > are likely already queued up. Also, I don't see any reason to reload
> > the
> > multipath device if reinstate fails. This code was added by
> > fac68d7a99ef17d496079538a5c6836acd7911ab, which clamined that
> > reinstate
> > could fail if the path was disabled.  Looking through the current
> > kernel
> > code, I can't see any reason why a reinstate would fail, where a
> > reload
> > would help. If the path was missing from the multipath device,
> > update_multipath_strings() would already catch that, and quit
> > check_path() early, which make more sense to me than reloading does.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  multipathd/main.c | 37 ++++++++++++++-----------------------
> >  1 file changed, 14 insertions(+), 23 deletions(-)
> > 
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index d73b1b9a..22bc4363 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -1615,22 +1615,18 @@ fail_path (struct path * pp, int del_active)
> >  /*
> >   * caller must have locked the path list before calling that
> > function
> >   */
> > -static int
> > +static void
> >  reinstate_path (struct path * pp)
> >  {
> > -	int ret = 0;
> > -
> >  	if (!pp->mpp)
> > -		return 0;
> > +		return;
> >  
> > -	if (dm_reinstate_path(pp->mpp->alias, pp->dev_t)) {
> > +	if (dm_reinstate_path(pp->mpp->alias, pp->dev_t))
> >  		condlog(0, "%s: reinstate failed", pp->dev_t);
> > -		ret = 1;
> > -	} else {
> > +	else {
> >  		condlog(2, "%s: reinstated", pp->dev_t);
> >  		update_queue_mode_add_path(pp->mpp);
> >  	}
> > -	return ret;
> >  }
> >  
> >  static void
> > @@ -2091,9 +2087,13 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >  	/*
> >  	 * Synchronize with kernel state
> >  	 */
> > -	if (update_multipath_strings(pp->mpp, vecs->pathvec, 1) !=
> > DMP_PASS) {
> > +	ret = update_multipath_strings(pp->mpp, vecs->pathvec, 1);
> > +	if (ret != DMP_PASS) {
> >  		condlog(1, "%s: Could not synchronize with kernel
> > state",
> >  			pp->dev);
> 
> We could do even better here by printing different log messages
> in the two cases we differentiate.

Will do.

-Ben
 
> Apart from that, ACK.
> 
> > +		if (ret == DMP_FAIL)
> > +			/* multipath device missing. Likely removed */
> > +			return 0;
> >  		pp->dmstate = PSTATE_UNDEF;
> >  	}
> >  	/* if update_multipath_strings orphaned the path, quit early */
> > @@ -2183,12 +2183,8 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >  		/*
> >  		 * reinstate this path
> >  		 */
> > -		if (!disable_reinstate && reinstate_path(pp)) {
> > -			condlog(3, "%s: reload map", pp->dev);
> > -			ev_add_path(pp, vecs, 1);
> > -			pp->tick = 1;
> > -			return 0;
> > -		}
> > +		if (!disable_reinstate)
> > +			reinstate_path(pp);
> >  		new_path_up = 1;
> >  
> >  		if (oldchkrstate != PATH_UP && oldchkrstate !=
> > PATH_GHOST)
> > @@ -2204,15 +2200,10 @@ check_path (struct vectors * vecs, struct
> > path * pp, unsigned int ticks)
> >  	else if (newstate == PATH_UP || newstate == PATH_GHOST) {
> >  		if ((pp->dmstate == PSTATE_FAILED ||
> >  		    pp->dmstate == PSTATE_UNDEF) &&
> > -		    !disable_reinstate) {
> > +		    !disable_reinstate)
> >  			/* Clear IO errors */
> > -			if (reinstate_path(pp)) {
> > -				condlog(3, "%s: reload map", pp->dev);
> > -				ev_add_path(pp, vecs, 1);
> > -				pp->tick = 1;
> > -				return 0;
> > -			}
> > -		} else {
> > +			reinstate_path(pp);
> > +		else {
> >  			LOG_MSG(4, verbosity, pp);
> >  			if (pp->checkint != max_checkint) {
> >  				/*
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-07-02 18:34 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25 20:42 [PATCH v2 0/7] Fix muitpath/multipathd flush issue Benjamin Marzinski
2020-06-25 20:42 ` [PATCH v2 1/7] libmultipath: make dm_get_map/status return codes symbolic Benjamin Marzinski
2020-07-01 20:15   ` Martin Wilck
2020-07-02 18:33     ` Benjamin Marzinski
2020-06-25 20:42 ` [PATCH v2 2/7] multipathd: fix check_path errors with removed map Benjamin Marzinski
2020-07-01 20:19   ` Martin Wilck
2020-07-02 18:34     ` Benjamin Marzinski
2020-06-25 20:42 ` [PATCH v2 3/7] libmultipath: make dm_flush_maps only return 0 on success Benjamin Marzinski
2020-06-25 20:42 ` [PATCH v2 4/7] multipathd: add "del maps" multipathd command Benjamin Marzinski
2020-07-01 20:59   ` Martin Wilck
2020-06-25 20:42 ` [PATCH v2 5/7] multipath: make flushing maps work like other commands Benjamin Marzinski
2020-06-25 20:42 ` [PATCH v2 6/7] multipath: delegate flushing maps to multipathd Benjamin Marzinski
2020-07-01 21:00   ` Martin Wilck
2020-06-25 20:42 ` [PATCH v2 7/7] multipath: add option to skip multipathd delegation Benjamin Marzinski
2020-07-01 21:01   ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.