All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/17] Misc Multipath patches
@ 2019-03-30  6:05 Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 01/17] BZ 1668693: disable user_friendly_names for NetApp Benjamin Marzinski
                   ` (16 more replies)
  0 siblings, 17 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The series is a mix of resends an new patches.

Patches 0001-0005 are simply resends of patches I've submitted earlier,
with no changes other that adding Reviewed-by's where appropriate.

Patches 0006-0009 are the result of running into some bugs during
firmare updates on an array.

Changes in v3:
0007:	This is actually the same as the v2 version.  Martin's issues
	for this and 0011 are addressed in the new 0012 patch.
	However, since I don't think that failing paths on NULL wwids
	is a good policy for reasons I outlined in
	https://www.redhat.com/archives/dm-devel/2019-March/msg00207.html
	and I scrapped the v2 0012 patch, I'm leaving off Martin's
	ACK from this one.

0010:	Hopefully I've got all the missing man page options now.

0012:	Cleanup to deal with Martin's objection to uid_fallback()
	deciding on whether to actually run the fallback code. This
	also removes the need for the poorly named old_len variable.

0013:	This is heavily based on Martin's
	"multipathd: handle changed wwids by removal and addition"
	The changes stem from not removing paths with NULL wwids.
	Martin, if you want to be creditted as the Author, that's
	fine.

0014 &	The later two patches in Martin's "New approach at handling
0015:	changed WWIDs" patchset

0016:	The more I think about it, the less I like the fallback code
	as well. However, it has helped get multipath devices up in
	situations where udev is having issues, so I don't want
	to scrap it.  But I do think that once the device is up, we
	should only use the configured method of getting the uuid.

0017:	A completely unrelated patch to stop multipath error messages
	when removing a device with kpartx paritions.

Changes in v2:

0007:   simply fixes the issue with multipathd not resetting the wwid
        when get_uid() fails, and makes scsi_uid_fallback return the
        correct value, when it is still waiting for the retriggers
        to max out. The Changes that Martin suggested are dealt with
        in patch 0012

0010:   This patch adds some missing options to the multipath.conf(5)
        man page, including "user_friendly_names" like Martin metioned
        in his earlier review of patch 0001.

0011:   This patch adds a fallback method for nvme devices, to grab the
        wwid directly from sysfs.


Benjamin Marzinski (15):
  libmulitpath: disable user_friendly_names for NetApp
  libmultipath: handle existing paths in marginal_path enqueue
  multipathd: cleanup marginal paths checking timers
  libmultipath: fix marginal paths queueing errors
  libmultipath: fix marginal_paths nr_active check
  multipathd: Fix miscounting active paths
  multipathd: ignore failed wwid recheck
  libmutipath: continue to use old state on PATH_PENDING
  multipathd: use update_path_groups instead of reload_map
  multipath.conf: add missing options to man page
  libmultipath: add get_uid fallback code for NVMe devices
  libmulitpath: cleanup uid_fallback code
  multipathd: handle changed wwids by removal and addition
  multipathd: Don't use fallback code after getting wwid
  libmultipath: silence dm_is_mpath error messages

Martin Wilck (2):
  multipathd: remove "wwid_changed" path attribute
  multipathd: ignore "disable_changed_wwids"

 libmultipath/config.c      |  1 -
 libmultipath/config.h      |  1 -
 libmultipath/devmapper.c   |  2 +-
 libmultipath/dict.c        | 18 ++++++++--
 libmultipath/discovery.c   | 70 +++++++++++++++++++++++++------------
 libmultipath/discovery.h   |  3 +-
 libmultipath/hwtable.c     |  1 +
 libmultipath/io_err_stat.c | 71 +++++++++++++++-----------------------
 libmultipath/io_err_stat.h |  2 +-
 libmultipath/structs.h     |  1 -
 multipath/main.c           |  2 +-
 multipath/multipath.conf.5 | 18 ++++++----
 multipathd/cli_handlers.c  |  2 +-
 multipathd/main.c          | 58 ++++++++++++-------------------
 multipathd/main.h          |  2 ++
 15 files changed, 134 insertions(+), 118 deletions(-)

-- 
2.17.2

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 01/17] BZ 1668693: disable user_friendly_names for NetApp
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 02/17] libmultipath: handle existing paths in marginal_path enqueue Benjamin Marzinski
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

NetApp has tools that rely on devices using WWID names. To avoid
breaking these, NetApp devices should continue to use WWID names, even
if the default config is set to enable user_friendly_names. If users
want to use user_friendly_names on NetApp devices, the must specifically
override the device config.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/hwtable.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index d3a8d9b..8776411 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -719,6 +719,7 @@ static struct hwentry default_hw[] = {
 		.flush_on_last_del = FLUSH_ENABLED,
 		.dev_loss      = MAX_DEV_LOSS_TMO,
 		.prio_name     = PRIO_ONTAP,
+		.user_friendly_names = USER_FRIENDLY_NAMES_OFF,
 	},
 	{
 		/*
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 02/17] libmultipath: handle existing paths in marginal_path enqueue
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 01/17] BZ 1668693: disable user_friendly_names for NetApp Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 03/17] multipathd: cleanup marginal paths checking timers Benjamin Marzinski
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If the path that enqueue_io_err_stat_by_path() is trying to add
is already on the list, just return success. There's no reason
to fail in this case.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/io_err_stat.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 02b1453..1cb3ffe 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -254,7 +254,6 @@ static void free_io_err_pathvec(struct io_err_stat_pathvec *p)
  * return value
  * 0: enqueue OK
  * 1: fails because of internal error
- * 2: fails because of existing already
  */
 static int enqueue_io_err_stat_by_path(struct path *path)
 {
@@ -264,7 +263,7 @@ static int enqueue_io_err_stat_by_path(struct path *path)
 	p = find_err_path_by_dev(paths->pathvec, path->dev);
 	if (p) {
 		pthread_mutex_unlock(&paths->mutex);
-		return 2;
+		return 0;
 	}
 	pthread_mutex_unlock(&paths->mutex);
 
@@ -418,9 +417,8 @@ int hit_io_err_recheck_time(struct path *pp)
 			io_err_stat_log(3, "%s: enqueue fails, to recover",
 					pp->dev);
 			goto recover;
-		} else if (!r) {
+		} else
 			pp->io_err_pathfail_cnt = PATH_IO_ERR_IN_CHECKING;
-		}
 	}
 
 	return 1;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 03/17] multipathd: cleanup marginal paths checking timers
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 01/17] BZ 1668693: disable user_friendly_names for NetApp Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 02/17] libmultipath: handle existing paths in marginal_path enqueue Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 04/17] libmultipath: fix marginal paths queueing errors Benjamin Marzinski
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

When a path gets recovered in hit_io_err_recheck_time(), it will
continue running in check_path(), so there is no reason to schedule
another path check as soon as possible (since one is currently
happening).

Also, there isn't much point in restarting the io err stat checking when
the path is down, so hit_io_err_recheck_time() should only be run when
the path is up. Downed marginal paths can be treated just like any other
downed path.

Finally, there is no reason to set reset pp->io_err_dis_reinstate_time
when we decide to enqueue a path. Either th enqueue will fail and the
path will get recovered, or it will succeed, and we won't check the
reinstate time again until poll_io_err_stat() marks the path as needing
a requeue.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/io_err_stat.c | 8 --------
 multipathd/main.c          | 3 ++-
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 1cb3ffe..416e13a 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -400,13 +400,6 @@ int hit_io_err_recheck_time(struct path *pp)
 		io_err_stat_log(4, "%s: reschedule checking after %d seconds",
 				pp->dev,
 				pp->mpp->marginal_path_err_recheck_gap_time);
-		/*
-		 * to reschedule io error checking again
-		 * if the path is good enough, we claim it is good
-		 * and can be reinsated as soon as possible in the
-		 * check_path routine.
-		 */
-		pp->io_err_dis_reinstate_time = curr_time.tv_sec;
 		r = enqueue_io_err_stat_by_path(pp);
 		/*
 		 * Enqueue fails because of internal error.
@@ -426,7 +419,6 @@ int hit_io_err_recheck_time(struct path *pp)
 recover:
 	pp->io_err_pathfail_cnt = 0;
 	pp->io_err_disable_reinstate = 0;
-	pp->tick = 1;
 	return 0;
 }
 
diff --git a/multipathd/main.c b/multipathd/main.c
index fb520b6..fe6d8ef 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2079,7 +2079,8 @@ check_path (struct vectors * vecs, struct path * pp, int ticks)
 		return 1;
 	}
 
-	if (pp->io_err_disable_reinstate && hit_io_err_recheck_time(pp)) {
+	if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
+	    pp->io_err_disable_reinstate && hit_io_err_recheck_time(pp)) {
 		pp->state = PATH_SHAKY;
 		/*
 		 * to reschedule as soon as possible,so that this path can
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 04/17] libmultipath: fix marginal paths queueing errors
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 03/17] multipathd: cleanup marginal paths checking timers Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 05/17] libmultipath: fix marginal_paths nr_active check Benjamin Marzinski
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The current marginal paths code tries to enqueue paths for io error
checking when multipathd receives a uevent on path failure. This can run
into a couple of problems. First, this uevent could happen before or
after multipathd actually fails the path, so simply checking nr_active
doesn't tell us if this is the last active path. Also, The code to fail
the path in enqueue_io_err_stat_by_path() doesn't ever update the path
state. This can cause the path to get failed twice, temporarily leading
to incorrect nr_active counts. Further, The point when multipathd should
care if this is the last active path is when the path has come up again,
not when it goes down. Lastly, if the path is down, it is often
impossible to open the path device, causing setup_directio_ctx() to
fail, which causes multipathd to skip io error checking and mark the
path as not marginal.

Instead, multipathd should just make sure that if the path is marginal,
it gets failed in the uevent, so as not to race with the checkerloop
thread. Then, when the path comes back up, check_path() can enqueue it,
just like it does for paths that need to get rechecked. To make it
obvious that the state PATH_IO_ERR_IN_POLLING_RECHECK and the function
hit_io_err_recheck_time() now apply to paths waiting to be enqueued for
the first time as well, I've also changed their names to
PATH_IO_ERR_WAITING_TO_CHECK and need_io_err_check(), respectively.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/io_err_stat.c | 55 +++++++++++++++++---------------------
 libmultipath/io_err_stat.h |  2 +-
 multipathd/main.c          |  2 +-
 3 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 416e13a..72aacf3 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -41,7 +41,7 @@
 #define CONCUR_NR_EVENT			32
 
 #define PATH_IO_ERR_IN_CHECKING		-1
-#define PATH_IO_ERR_IN_POLLING_RECHECK	-2
+#define PATH_IO_ERR_WAITING_TO_CHECK	-2
 
 #define io_err_stat_log(prio, fmt, args...) \
 	condlog(prio, "io error statistic: " fmt, ##args)
@@ -283,24 +283,6 @@ static int enqueue_io_err_stat_by_path(struct path *path)
 	vector_set_slot(paths->pathvec, p);
 	pthread_mutex_unlock(&paths->mutex);
 
-	if (!path->io_err_disable_reinstate) {
-		/*
-		 *fail the path in the kernel for the time of the to make
-		 *the test more reliable
-		 */
-		io_err_stat_log(3, "%s: fail dm path %s before checking",
-				path->mpp->alias, path->dev);
-		path->io_err_disable_reinstate = 1;
-		dm_fail_path(path->mpp->alias, path->dev_t);
-		update_queue_mode_del_path(path->mpp);
-
-		/*
-		 * schedule path check as soon as possible to
-		 * update path state to delayed state
-		 */
-		path->tick = 1;
-
-	}
 	io_err_stat_log(2, "%s: enqueue path %s to check",
 			path->mpp->alias, path->dev);
 	return 0;
@@ -317,7 +299,6 @@ free_ioerr_path:
 int io_err_stat_handle_pathfail(struct path *path)
 {
 	struct timespec curr_time;
-	int res;
 
 	if (uatomic_read(&io_err_thread_running) == 0)
 		return 1;
@@ -332,8 +313,6 @@ int io_err_stat_handle_pathfail(struct path *path)
 
 	if (!path->mpp)
 		return 1;
-	if (path->mpp->nr_active <= 1)
-		return 1;
 	if (path->mpp->marginal_path_double_failed_time <= 0 ||
 		path->mpp->marginal_path_err_sample_time <= 0 ||
 		path->mpp->marginal_path_err_recheck_gap_time <= 0 ||
@@ -371,17 +350,33 @@ int io_err_stat_handle_pathfail(struct path *path)
 	}
 	path->io_err_pathfail_cnt++;
 	if (path->io_err_pathfail_cnt >= FLAKY_PATHFAIL_THRESHOLD) {
-		res = enqueue_io_err_stat_by_path(path);
-		if (!res)
-			path->io_err_pathfail_cnt = PATH_IO_ERR_IN_CHECKING;
-		else
-			path->io_err_pathfail_cnt = 0;
+		path->io_err_disable_reinstate = 1;
+		path->io_err_pathfail_cnt = PATH_IO_ERR_WAITING_TO_CHECK;
+		/* enqueue path as soon as it comes up */
+		path->io_err_dis_reinstate_time = 0;
+		if (path->state != PATH_DOWN) {
+			struct config *conf;
+			int oldstate = path->state;
+			int checkint;
+
+			conf = get_multipath_config();
+			checkint = conf->checkint;
+			put_multipath_config(conf);
+			io_err_stat_log(2, "%s: mark as failed", path->dev);
+			path->mpp->stat_path_failures++;
+			path->state = PATH_DOWN;
+			path->dmstate = PSTATE_FAILED;
+			if (oldstate == PATH_UP || oldstate == PATH_GHOST)
+				update_queue_mode_del_path(path->mpp);
+			if (path->tick > checkint)
+				path->tick = checkint;
+		}
 	}
 
 	return 0;
 }
 
-int hit_io_err_recheck_time(struct path *pp)
+int need_io_err_check(struct path *pp)
 {
 	struct timespec curr_time;
 	int r;
@@ -392,7 +387,7 @@ int hit_io_err_recheck_time(struct path *pp)
 		io_err_stat_log(2, "%s: recover path early", pp->dev);
 		goto recover;
 	}
-	if (pp->io_err_pathfail_cnt != PATH_IO_ERR_IN_POLLING_RECHECK)
+	if (pp->io_err_pathfail_cnt != PATH_IO_ERR_WAITING_TO_CHECK)
 		return 1;
 	if (clock_gettime(CLOCK_MONOTONIC, &curr_time) != 0 ||
 	    (curr_time.tv_sec - pp->io_err_dis_reinstate_time) >
@@ -489,7 +484,7 @@ static int poll_io_err_stat(struct vectors *vecs, struct io_err_stat_path *pp)
 	} else if (path->mpp && path->mpp->nr_active > 1) {
 		io_err_stat_log(3, "%s: keep failing the dm path %s",
 				path->mpp->alias, path->dev);
-		path->io_err_pathfail_cnt = PATH_IO_ERR_IN_POLLING_RECHECK;
+		path->io_err_pathfail_cnt = PATH_IO_ERR_WAITING_TO_CHECK;
 		path->io_err_disable_reinstate = 1;
 		path->io_err_dis_reinstate_time = currtime.tv_sec;
 		io_err_stat_log(3, "%s: disable reinstating of %s",
diff --git a/libmultipath/io_err_stat.h b/libmultipath/io_err_stat.h
index bbf31b4..53d6d7d 100644
--- a/libmultipath/io_err_stat.h
+++ b/libmultipath/io_err_stat.h
@@ -10,6 +10,6 @@ extern pthread_attr_t io_err_stat_attr;
 int start_io_err_stat_thread(void *data);
 void stop_io_err_stat_thread(void);
 int io_err_stat_handle_pathfail(struct path *path);
-int hit_io_err_recheck_time(struct path *pp);
+int need_io_err_check(struct path *pp);
 
 #endif /* _IO_ERR_STAT_H */
diff --git a/multipathd/main.c b/multipathd/main.c
index fe6d8ef..43830e8 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2080,7 +2080,7 @@ check_path (struct vectors * vecs, struct path * pp, int ticks)
 	}
 
 	if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
-	    pp->io_err_disable_reinstate && hit_io_err_recheck_time(pp)) {
+	    pp->io_err_disable_reinstate && need_io_err_check(pp)) {
 		pp->state = PATH_SHAKY;
 		/*
 		 * to reschedule as soon as possible,so that this path can
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 05/17] libmultipath: fix marginal_paths nr_active check
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 04/17] libmultipath: fix marginal paths queueing errors Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 06/17] multipathd: Fix miscounting active paths Benjamin Marzinski
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Marginal paths are SHAKY, so they don't count towards the number of
active paths. poll_io_err_stat() shouldn't automatically reinstate a
marginal path if there already is an active path.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/io_err_stat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 72aacf3..554b777 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -481,7 +481,7 @@ static int poll_io_err_stat(struct vectors *vecs, struct io_err_stat_path *pp)
 		 */
 		path->tick = 1;
 
-	} else if (path->mpp && path->mpp->nr_active > 1) {
+	} else if (path->mpp && path->mpp->nr_active > 0) {
 		io_err_stat_log(3, "%s: keep failing the dm path %s",
 				path->mpp->alias, path->dev);
 		path->io_err_pathfail_cnt = PATH_IO_ERR_WAITING_TO_CHECK;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 06/17] multipathd: Fix miscounting active paths
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (4 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 05/17] libmultipath: fix marginal_paths nr_active check Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 07/17] multipathd: ignore failed wwid recheck Benjamin Marzinski
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

When multipathd gets a change uevent, it calls pathinfo with DI_NOIO.
This sets the path state to the return value of path_offline(). If a
path is in the PATH_DOWN state but path_offline() returns PATH_UP, when
that path gets a change event, its state will get moved to PATH_UP
without either reinstating the path, or reloading the map.  The next
call to check_path() will move the path back to PATH_DOWN. Since
check_path() simply increments and decrements nr_active instead of
calculating it based on the actual number of active paths, nr_active
will get decremented a second time for this failed path, potentially
putting the multipath device into recovery mode.

This commit does two things to avoid this situation. It makes the
DI_NOIO flag only set pp->state in pathinfo() if DI_CHECKER is also set.
This isn't set in uev_update_path() to avoid changing the path state in
this case.  Also, to guard against pp->state getting changed in some
other code path without properly updating the map state, check_path()
now calls set_no_path_retry, which recalculates nr_active based on the
actual number of active paths, and makes sure that the queue_if_no_path
value in the features line is correct.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 11 ++++++-----
 multipath/main.c         |  2 +-
 multipathd/main.c        |  4 +++-
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 10bd8cd..729bcb9 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1914,11 +1914,12 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
 	if (path_state == PATH_REMOVED)
 		goto blank;
 	else if (mask & DI_NOIO) {
-		/*
-		 * Avoid any IO on the device itself.
-		 * Behave like DI_CHECKER in the "path unavailable" case.
-		 */
-		pp->chkrstate = pp->state = path_state;
+		if (mask & DI_CHECKER)
+			/*
+			 * Avoid any IO on the device itself.
+			 * simply use the path_offline() return as its state
+			 */
+			pp->chkrstate = pp->state = path_state;
 		return PATHINFO_OK;
 	}
 
diff --git a/multipath/main.c b/multipath/main.c
index 5abb118..69141db 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -356,7 +356,7 @@ static int check_usable_paths(struct config *conf,
 			pp->udev = get_udev_device(pp->dev_t, DEV_DEVT);
 			if (pp->udev == NULL)
 				continue;
-			if (pathinfo(pp, conf, DI_SYSFS|DI_NOIO) != PATHINFO_OK)
+			if (pathinfo(pp, conf, DI_SYSFS|DI_NOIO|DI_CHECKER) != PATHINFO_OK)
 				continue;
 
 			if (pp->state == PATH_UP &&
diff --git a/multipathd/main.c b/multipathd/main.c
index 43830e8..678ecf8 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -392,7 +392,8 @@ static void set_no_path_retry(struct multipath *mpp)
 	default:
 		if (mpp->nr_active > 0) {
 			mpp->retry_tick = 0;
-			dm_queue_if_no_path(mpp->alias, 1);
+			if (!is_queueing)
+				dm_queue_if_no_path(mpp->alias, 1);
 		} else if (is_queueing && mpp->retry_tick == 0)
 			enter_recovery_mode(mpp);
 		break;
@@ -2072,6 +2073,7 @@ check_path (struct vectors * vecs, struct path * pp, int ticks)
 	/* if update_multipath_strings orphaned the path, quit early */
 	if (!pp->mpp)
 		return 0;
+	set_no_path_retry(pp->mpp);
 
 	if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
 			check_path_reinstate_state(pp)) {
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 07/17] multipathd: ignore failed wwid recheck
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (5 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 06/17] multipathd: Fix miscounting active paths Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-04-01 14:32   ` Martin Wilck
  2019-03-30  6:05 ` [PATCH v3 08/17] libmutipath: continue to use old state on PATH_PENDING Benjamin Marzinski
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If disable_changed_wwids is set, when multipathd gets a change event on
a path, it verifies that the wwid hasn't changed in uev_update_path().
If get_uid() failed, uev_update_path treated this as a wwid change to 0.
This could cause paths to suddenly be dropped due to an issue with
getting the wwid.  Even if get_uid() failed because the path was down,
it no change uevent happend when it later became active, multipathd
would continue to ignore the path. Also, scsi_uid_fallback() clears the
failure return if it doesn't attempt to fallback, causing get_uid()
to return success, when it actually failed.

Multipathd should neither set nor clear wwid_changed if get_uid()
returned failure. Also, scsi_uid_fallback() should retain the old return
value if it doesn't attempt to fallback.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 6 +++---
 multipathd/main.c        | 6 ++++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 729bcb9..b08cb2d 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1755,9 +1755,9 @@ get_vpd_uid(struct path * pp)
 }
 
 static ssize_t scsi_uid_fallback(struct path *pp, int path_state,
-			     const char **origin)
+			     const char **origin, ssize_t old_len)
 {
-	ssize_t len = 0;
+	ssize_t len = old_len;
 	int retrigger;
 	struct config *conf;
 
@@ -1828,7 +1828,7 @@ get_uid (struct path * pp, int path_state, struct udev_device *udev)
 			origin = "sysfs";
 		}
 		if (len <= 0 && pp->bus == SYSFS_BUS_SCSI)
-			len = scsi_uid_fallback(pp, path_state, &origin);
+			len = scsi_uid_fallback(pp, path_state, &origin, len);
 	}
 	if ( len < 0 ) {
 		condlog(1, "%s: failed to get %s uid: %s",
diff --git a/multipathd/main.c b/multipathd/main.c
index 678ecf8..fd83a6a 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1234,9 +1234,11 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 			goto out;
 
 		strcpy(wwid, pp->wwid);
-		get_uid(pp, pp->state, uev->udev);
+		rc = get_uid(pp, pp->state, uev->udev);
 
-		if (strncmp(wwid, pp->wwid, WWID_SIZE) != 0) {
+		if (rc != 0)
+			strcpy(pp->wwid, wwid);
+		else if (strncmp(wwid, pp->wwid, WWID_SIZE) != 0) {
 			condlog(0, "%s: path wwid changed from '%s' to '%s'. %s",
 				uev->kernel, wwid, pp->wwid,
 				(disable_changed_wwids ? "disallowing" :
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 08/17] libmutipath: continue to use old state on PATH_PENDING
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (6 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 07/17] multipathd: ignore failed wwid recheck Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 09/17] multipathd: use update_path_groups instead of reload_map Benjamin Marzinski
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

When pathinfo() sets pp->state to PATH_PENDING, it can cause problems
with path checking.  It should act more like check_path(). When
check_path() sees a new state of PATH_PENDING, it doesn't update the
path state at all, so a path's old state is normally never PATH_PENDING.

As and example of the problems of setting a path to PATH_PENDING, If
check_path() sets a path's state to PATH_UP, then a call to pathinfo()
sets the state to PATH_PENDING, and then another call the check_path()
sets the state to PATH_DOWN, multipathd won't fail the path in the
kernel. Also, if a path's state is PATH_PENDING, and nr_active is
recalculated, that path will count as down, even if the state was
previously PATH_UP. If a path already has a state of PATH_WILD or
PATH_UNCHECKED, changing it to PATH_PENDING won't hurt anything, and it
will help anyone who sees it know what's actually happening. But
otherwise, pathinfo() should leave the previous state alone.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index b08cb2d..28c00e5 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1946,8 +1946,11 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
 
 	if (mask & DI_CHECKER) {
 		if (path_state == PATH_UP) {
-			pp->chkrstate = pp->state = get_state(pp, conf, 0,
-							      path_state);
+			int newstate = get_state(pp, conf, 0, path_state);
+			if (newstate != PATH_PENDING ||
+			    pp->state == PATH_UNCHECKED ||
+			    pp->state == PATH_WILD)
+				pp->chkrstate = pp->state = newstate;
 			if (pp->state == PATH_TIMEOUT)
 				pp->state = PATH_DOWN;
 			if (pp->state == PATH_UP && !pp->size) {
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 09/17] multipathd: use update_path_groups instead of reload_map
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (7 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 08/17] libmutipath: continue to use old state on PATH_PENDING Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-03-30  6:05 ` [PATCH v3 10/17] multipath.conf: add missing options to man page Benjamin Marzinski
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

reload_map() doesn't do the work to sync the state after reloading the
map.  Instead of calling it directly, cli_reload() and uev_update_path()
should call update_path_groups(), which calls reload_map() with all the
necessary syncing.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/cli_handlers.c |  2 +-
 multipathd/main.c         | 13 ++++++++-----
 multipathd/main.h         |  2 ++
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
index f95813e..60e17d6 100644
--- a/multipathd/cli_handlers.c
+++ b/multipathd/cli_handlers.c
@@ -877,7 +877,7 @@ cli_reload(void *v, char **reply, int *len, void *data)
 		return 1;
 	}
 
-	return reload_map(vecs, mpp, 0, 1);
+	return update_path_groups(mpp, vecs, 0);
 }
 
 int resize_map(struct multipath *mpp, unsigned long long size,
diff --git a/multipathd/main.c b/multipathd/main.c
index fd83a6a..7a317d9 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1273,10 +1273,13 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 			else {
 				if (ro == 1)
 					pp->mpp->force_readonly = 1;
-				retval = reload_map(vecs, mpp, 0, 1);
-				pp->mpp->force_readonly = 0;
-				condlog(2, "%s: map %s reloaded (retval %d)",
-					uev->kernel, mpp->alias, retval);
+				retval = update_path_groups(mpp, vecs, 0);
+				if (retval == 2)
+					condlog(2, "%s: map removed during reload", pp->dev);
+				else {
+					pp->mpp->force_readonly = 0;
+					condlog(2, "%s: map %s reloaded (retval %d)", uev->kernel, mpp->alias, retval);
+				}
 			}
 		}
 	}
@@ -1832,7 +1835,7 @@ int update_path_groups(struct multipath *mpp, struct vectors *vecs, int refresh)
 
 	dm_lib_release();
 	if (setup_multipath(vecs, mpp) != 0)
-		return 1;
+		return 2;
 	sync_map_state(mpp);
 
 	return 0;
diff --git a/multipathd/main.h b/multipathd/main.h
index 8fd426b..e5c1398 100644
--- a/multipathd/main.h
+++ b/multipathd/main.h
@@ -43,5 +43,7 @@ int __setup_multipath (struct vectors * vecs, struct multipath * mpp,
 		       int reset);
 #define setup_multipath(vecs, mpp) __setup_multipath(vecs, mpp, 1)
 int update_multipath (struct vectors *vecs, char *mapname, int reset);
+int update_path_groups(struct multipath *mpp, struct vectors *vecs,
+		       int refresh);
 
 #endif /* MAIN_H */
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 10/17] multipath.conf: add missing options to man page
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (8 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 09/17] multipathd: use update_path_groups instead of reload_map Benjamin Marzinski
@ 2019-03-30  6:05 ` Benjamin Marzinski
  2019-04-01 13:11   ` Martin Wilck
  2019-03-30  6:06 ` [PATCH v3 11/17] libmultipath: add get_uid fallback code for NVMe devices Benjamin Marzinski
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:05 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipath/multipath.conf.5 | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index 0fe8461..864d7eb 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -1468,6 +1468,8 @@ section:
 .TP
 .B uid_attribute
 .TP
+.B getuid_callout
+.TP
 .B path_selector
 .TP
 .B path_checker
@@ -1494,6 +1496,8 @@ section:
 .TP
 .B flush_on_last_del
 .TP
+.B user_friendly_names
+.TP
 .B retain_attached_hw_handler
 .TP
 .B detect_prio
@@ -1525,6 +1529,8 @@ section:
 .B max_sectors_kb
 .TP
 .B ghost_delay
+.TP
+.B all_tg_pt
 .RE
 .PD
 .LP
@@ -1604,7 +1610,11 @@ the values are taken from the \fIdevices\fR or \fIdefaults\fR sections:
 .TP
 .B skip_kpartx
 .TP
+.B max_sectors_kb
+.TP
 .B ghost_delay
+.TP
+.B all_tg_pt
 .RE
 .PD
 .LP
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 11/17] libmultipath: add get_uid fallback code for NVMe devices
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (9 preceding siblings ...)
  2019-03-30  6:05 ` [PATCH v3 10/17] multipath.conf: add missing options to man page Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-03-30  6:06 ` [PATCH v3 12/17] libmulitpath: cleanup uid_fallback code Benjamin Marzinski
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If multipath can't get the uid for NVMe devices from udev, it can get it
directly from sysfs.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 49 ++++++++++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 28c00e5..bece67c 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1754,8 +1754,8 @@ get_vpd_uid(struct path * pp)
 	return get_vpd_sysfs(parent, 0x83, pp->wwid, WWID_SIZE);
 }
 
-static ssize_t scsi_uid_fallback(struct path *pp, int path_state,
-			     const char **origin, ssize_t old_len)
+static ssize_t uid_fallback(struct path *pp, int path_state,
+			    const char **origin, ssize_t old_len)
 {
 	ssize_t len = old_len;
 	int retrigger;
@@ -1764,17 +1764,36 @@ static ssize_t scsi_uid_fallback(struct path *pp, int path_state,
 	conf = get_multipath_config();
 	retrigger = conf->retrigger_tries;
 	put_multipath_config(conf);
-	if (pp->retriggers >= retrigger &&
-	    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {
-		len = get_vpd_uid(pp);
-		*origin = "sysfs";
-		pp->uid_attribute = NULL;
-		if (len < 0 && path_state == PATH_UP) {
-			condlog(1, "%s: failed to get sysfs uid: %s",
-				pp->dev, strerror(-len));
-			len = get_vpd_sgio(pp->fd, 0x83, pp->wwid,
-					   WWID_SIZE);
-			*origin = "sgio";
+	if (pp->retriggers >= retrigger) {
+		if (pp->bus == SYSFS_BUS_SCSI &&
+		    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {
+			len = get_vpd_uid(pp);
+			*origin = "sysfs";
+			pp->uid_attribute = NULL;
+			if (len < 0 && path_state == PATH_UP) {
+				condlog(1, "%s: failed to get sysfs uid: %s",
+					pp->dev, strerror(-len));
+				len = get_vpd_sgio(pp->fd, 0x83, pp->wwid,
+						   WWID_SIZE);
+				*origin = "sgio";
+			}
+		} else if (pp->bus == SYSFS_BUS_NVME) {
+			char value[256];
+			len = sysfs_attr_get_value(pp->udev, "wwid", value,
+						   sizeof(value));
+			if (len <= 0)
+				return -1;
+			len = strlcpy(pp->wwid, value, WWID_SIZE);
+			if (len >= WWID_SIZE) {
+				len = fix_broken_nvme_wwid(pp, value,
+							   WWID_SIZE);
+				if (len > 0)
+					return len;
+				condlog(0, "%s: wwid overflow", pp->dev);
+				len = WWID_SIZE;
+			}
+			*origin = "sysfs";
+			pp->uid_attribute = NULL;
 		}
 	}
 	return len;
@@ -1827,8 +1846,8 @@ get_uid (struct path * pp, int path_state, struct udev_device *udev)
 			len = get_vpd_uid(pp);
 			origin = "sysfs";
 		}
-		if (len <= 0 && pp->bus == SYSFS_BUS_SCSI)
-			len = scsi_uid_fallback(pp, path_state, &origin, len);
+		if (len <= 0)
+			len = uid_fallback(pp, path_state, &origin, len);
 	}
 	if ( len < 0 ) {
 		condlog(1, "%s: failed to get %s uid: %s",
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 12/17] libmulitpath: cleanup uid_fallback code
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (10 preceding siblings ...)
  2019-03-30  6:06 ` [PATCH v3 11/17] libmultipath: add get_uid fallback code for NVMe devices Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-04-01 13:26   ` Martin Wilck
  2019-03-30  6:06 ` [PATCH v3 13/17] multipathd: handle changed wwids by removal and addition Benjamin Marzinski
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Instead of always calling uid_fallback() if the configured method to get
the uid failed, get_uid now checks if the path supports fallbacks and if
all the retriggers have occurred. If so, it calls uid_fallback(), which
just attempts to get the uid using the appropriate fallback method. None
of these changes should make the code function any differently.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 85 ++++++++++++++++++++++------------------
 1 file changed, 46 insertions(+), 39 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index bece67c..3ec60d6 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1755,50 +1755,50 @@ get_vpd_uid(struct path * pp)
 }
 
 static ssize_t uid_fallback(struct path *pp, int path_state,
-			    const char **origin, ssize_t old_len)
+			    const char **origin)
 {
-	ssize_t len = old_len;
-	int retrigger;
-	struct config *conf;
-
-	conf = get_multipath_config();
-	retrigger = conf->retrigger_tries;
-	put_multipath_config(conf);
-	if (pp->retriggers >= retrigger) {
-		if (pp->bus == SYSFS_BUS_SCSI &&
-		    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {
-			len = get_vpd_uid(pp);
-			*origin = "sysfs";
-			pp->uid_attribute = NULL;
-			if (len < 0 && path_state == PATH_UP) {
-				condlog(1, "%s: failed to get sysfs uid: %s",
-					pp->dev, strerror(-len));
-				len = get_vpd_sgio(pp->fd, 0x83, pp->wwid,
+	ssize_t len = -1;
+
+	if (pp->bus == SYSFS_BUS_SCSI &&
+	    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {
+		len = get_vpd_uid(pp);
+		*origin = "sysfs";
+		pp->uid_attribute = NULL;
+		if (len < 0 && path_state == PATH_UP) {
+			condlog(1, "%s: failed to get sysfs uid: %s",
+				pp->dev, strerror(-len));
+			len = get_vpd_sgio(pp->fd, 0x83, pp->wwid,
+					   WWID_SIZE);
+			*origin = "sgio";
+		}
+	} else if (pp->bus == SYSFS_BUS_NVME) {
+		char value[256];
+		len = sysfs_attr_get_value(pp->udev, "wwid", value,
+					   sizeof(value));
+		if (len <= 0)
+			return -1;
+		len = strlcpy(pp->wwid, value, WWID_SIZE);
+		if (len >= WWID_SIZE) {
+			len = fix_broken_nvme_wwid(pp, value,
 						   WWID_SIZE);
-				*origin = "sgio";
-			}
-		} else if (pp->bus == SYSFS_BUS_NVME) {
-			char value[256];
-			len = sysfs_attr_get_value(pp->udev, "wwid", value,
-						   sizeof(value));
-			if (len <= 0)
-				return -1;
-			len = strlcpy(pp->wwid, value, WWID_SIZE);
-			if (len >= WWID_SIZE) {
-				len = fix_broken_nvme_wwid(pp, value,
-							   WWID_SIZE);
-				if (len > 0)
-					return len;
-				condlog(0, "%s: wwid overflow", pp->dev);
-				len = WWID_SIZE;
-			}
-			*origin = "sysfs";
-			pp->uid_attribute = NULL;
+			if (len > 0)
+				return len;
+			condlog(0, "%s: wwid overflow", pp->dev);
+			len = WWID_SIZE;
 		}
+		*origin = "sysfs";
+		pp->uid_attribute = NULL;
 	}
 	return len;
 }
 
+static int has_uid_fallback(struct path *pp)
+{
+	return ((pp->bus == SYSFS_BUS_SCSI &&
+		 !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) ||
+		pp->bus == SYSFS_BUS_NVME);
+}
+
 int
 get_uid (struct path * pp, int path_state, struct udev_device *udev)
 {
@@ -1846,8 +1846,15 @@ get_uid (struct path * pp, int path_state, struct udev_device *udev)
 			len = get_vpd_uid(pp);
 			origin = "sysfs";
 		}
-		if (len <= 0)
-			len = uid_fallback(pp, path_state, &origin, len);
+		if (len <= 0 && has_uid_fallback(pp)) {
+			int retrigger_tries;
+
+			conf = get_multipath_config();
+			retrigger_tries = conf->retrigger_tries;
+			put_multipath_config(conf);
+			if (pp->retriggers >= retrigger_tries)
+				len = uid_fallback(pp, path_state, &origin);
+		}
 	}
 	if ( len < 0 ) {
 		condlog(1, "%s: failed to get %s uid: %s",
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 13/17] multipathd: handle changed wwids by removal and addition
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (11 preceding siblings ...)
  2019-03-30  6:06 ` [PATCH v3 12/17] libmulitpath: cleanup uid_fallback code Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-04-01 13:31   ` Martin Wilck
  2019-03-30  6:06 ` [PATCH v3 14/17] multipathd: remove "wwid_changed" path attribute Benjamin Marzinski
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If a path's WWID changes, it's not necessarily failed. But it certainly
has to be removed from an existing map, otherwise data corruption is
imminent. Instead of keeping the path in the map, failing it, and
remembering the "changed WWID" state, this patch simply removes and
re-adds the path.

This is patch is heavily based on the previous patch of the same name
by Martin Wilck.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 28 ++++++----------------------
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 7a317d9..b3571d9 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1191,7 +1191,6 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 	int ro, retval = 0, rc;
 	struct path * pp;
 	struct config *conf;
-	int disable_changed_wwids;
 	int needs_reinit = 0;
 
 	switch ((rc = change_foreign(uev->udev))) {
@@ -1209,12 +1208,6 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 		break;
 	}
 
-	conf = get_multipath_config();
-	disable_changed_wwids = conf->disable_changed_wwids;
-	put_multipath_config(conf);
-
-	ro = uevent_get_disk_ro(uev);
-
 	pthread_cleanup_push(cleanup_lock, &vecs->lock);
 	lock(&vecs->lock);
 	pthread_testcancel();
@@ -1239,22 +1232,12 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 		if (rc != 0)
 			strcpy(pp->wwid, wwid);
 		else if (strncmp(wwid, pp->wwid, WWID_SIZE) != 0) {
-			condlog(0, "%s: path wwid changed from '%s' to '%s'. %s",
-				uev->kernel, wwid, pp->wwid,
-				(disable_changed_wwids ? "disallowing" :
-				 "continuing"));
-			strcpy(pp->wwid, wwid);
-			if (disable_changed_wwids) {
-				if (!pp->wwid_changed) {
-					pp->wwid_changed = 1;
-					pp->tick = 1;
-					if (pp->mpp)
-						dm_fail_path(pp->mpp->alias, pp->dev_t);
-				}
-				goto out;
-			}
+			condlog(0, "%s: path wwid changed from '%s' to '%s'",
+				uev->kernel, wwid, pp->wwid);
+			ev_remove_path(pp, vecs, 1);
+			needs_reinit = 1;
+			goto out;
 		} else {
-			pp->wwid_changed = 0;
 			udev_device_unref(pp->udev);
 			pp->udev = udev_device_ref(uev->udev);
 			conf = get_multipath_config();
@@ -1265,6 +1248,7 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 			pthread_cleanup_pop(1);
 		}
 
+		ro = uevent_get_disk_ro(uev);
 		if (mpp && ro >= 0) {
 			condlog(2, "%s: update path write_protect to '%d' (uevent)", uev->kernel, ro);
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 14/17] multipathd: remove "wwid_changed" path attribute
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (12 preceding siblings ...)
  2019-03-30  6:06 ` [PATCH v3 13/17] multipathd: handle changed wwids by removal and addition Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-03-30  6:06 ` [PATCH v3 15/17] multipathd: ignore "disable_changed_wwids" Benjamin Marzinski
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This is now not needed any more.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/structs.h | 1 -
 multipathd/main.c      | 6 ------
 2 files changed, 7 deletions(-)

diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index b794b0d..7879d76 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -280,7 +280,6 @@ struct path {
 	int fd;
 	int initialized;
 	int retriggers;
-	int wwid_changed;
 	unsigned int path_failures;
 	time_t dis_reinstate_time;
 	int disable_reinstate;
diff --git a/multipathd/main.c b/multipathd/main.c
index b3571d9..e4f95a0 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2001,12 +2001,6 @@ check_path (struct vectors * vecs, struct path * pp, int ticks)
 	if (newstate == PATH_REMOVED)
 		newstate = PATH_DOWN;
 
-	if (pp->wwid_changed) {
-		condlog(2, "%s: path wwid has changed. Refusing to use",
-			pp->dev);
-		newstate = PATH_DOWN;
-	}
-
 	if (newstate == PATH_WILD || newstate == PATH_UNCHECKED) {
 		condlog(2, "%s: unusable path (%s) - checker failed",
 			pp->dev, checker_state_name(newstate));
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 15/17] multipathd: ignore "disable_changed_wwids"
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (13 preceding siblings ...)
  2019-03-30  6:06 ` [PATCH v3 14/17] multipathd: remove "wwid_changed" path attribute Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-03-30  6:06 ` [PATCH v3 16/17] multipathd: Don't use fallback code after getting wwid Benjamin Marzinski
  2019-03-30  6:06 ` [PATCH v3 17/17] libmultipath: silence dm_is_mpath error messages Benjamin Marzinski
  16 siblings, 0 replies; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This option has no effect any more.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c      |  1 -
 libmultipath/config.h      |  1 -
 libmultipath/dict.c        | 18 +++++++++++++++---
 multipath/multipath.conf.5 |  8 ++------
 4 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 24d71ae..141f092 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -715,7 +715,6 @@ load_config (char * file)
 	conf->retrigger_tries = DEFAULT_RETRIGGER_TRIES;
 	conf->retrigger_delay = DEFAULT_RETRIGGER_DELAY;
 	conf->uev_wait_timeout = DEFAULT_UEV_WAIT_TIMEOUT;
-	conf->disable_changed_wwids = DEFAULT_DISABLE_CHANGED_WWIDS;
 	conf->remove_retries = 0;
 	conf->ghost_delay = DEFAULT_GHOST_DELAY;
 	conf->all_tg_pt = DEFAULT_ALL_TG_PT;
diff --git a/libmultipath/config.h b/libmultipath/config.h
index b938c26..f5bf5b1 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -182,7 +182,6 @@ struct config {
 	int delayed_reconfig;
 	int uev_wait_timeout;
 	int skip_kpartx;
-	int disable_changed_wwids;
 	int remove_retries;
 	int max_sectors_kb;
 	int ghost_delay;
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index eaad4f1..96815f8 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -156,6 +156,12 @@ out:
 	return len;
 }
 
+static int
+print_ignored (char *buff, int len)
+{
+	return snprintf(buff, len, "ignored");
+}
+
 static int
 print_yes_no (char *buff, int len, long v)
 {
@@ -548,9 +554,15 @@ declare_hw_handler(skip_kpartx, set_yes_no_undef)
 declare_hw_snprint(skip_kpartx, print_yes_no_undef)
 declare_mp_handler(skip_kpartx, set_yes_no_undef)
 declare_mp_snprint(skip_kpartx, print_yes_no_undef)
-
-declare_def_handler(disable_changed_wwids, set_yes_no)
-declare_def_snprint(disable_changed_wwids, print_yes_no)
+static int def_disable_changed_wwids_handler(struct config *conf, vector strvec)
+{
+	return 0;
+}
+static int snprint_def_disable_changed_wwids(struct config *conf, char *buff,
+					     int len, const void *data)
+{
+	return print_ignored(buff, len);
+}
 
 declare_def_handler(remove_retries, set_int)
 declare_def_snprint(remove_retries, print_int)
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index 864d7eb..646c156 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -1148,12 +1148,8 @@ The default is: \fBno\fR
 .
 .TP
 .B disable_changed_wwids
-If set to \fIyes\fR, multipathd will check the path wwid on change events, and
-if it has changed from the wwid of the multipath device, multipathd will
-disable access to the path until the wwid changes back.
-.RS
-.TP
-The default is: \fBno\fR
+This option is deprecated and ignored. If the WWID of a path suddenly changes,
+multipathd handles it as if it was removed and then added again.
 .RE
 .
 .
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 16/17] multipathd: Don't use fallback code after getting wwid
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (14 preceding siblings ...)
  2019-03-30  6:06 ` [PATCH v3 15/17] multipathd: ignore "disable_changed_wwids" Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-04-01 14:21   ` Martin Wilck
  2019-03-30  6:06 ` [PATCH v3 17/17] libmultipath: silence dm_is_mpath error messages Benjamin Marzinski
  16 siblings, 1 reply; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The fallback code is necessary to set up mutipath devices, if multipath
temporarily can't get the information from udev.  However, once the
devices are set up, udev is the definitive source of this information.

The wwid gotten from the fallback code and the udev code should always
be the same, in which case it doesn't matter where we get the wwid
from. But if they are different, it's important to try to do the
right thing.

Working under the assumption that udev will either never give us this
information, or that it usually will. multipath should assume that if
there are multiple paths to a device, either they will all never get
a wwid from udev, or some of them will likely already have gotten the
correct wwid from udev.  In this case, we should fix this as soon as
possible.

This does mean that devices where udev will never give out the uuid
will not notice if the wwid changes, but that's a small price to pay
for doing the right thing most of the time.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 22 +++++++++-------------
 libmultipath/discovery.h |  3 ++-
 multipathd/main.c        |  2 +-
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 3ec60d6..744cf2c 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1763,7 +1763,6 @@ static ssize_t uid_fallback(struct path *pp, int path_state,
 	    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {
 		len = get_vpd_uid(pp);
 		*origin = "sysfs";
-		pp->uid_attribute = NULL;
 		if (len < 0 && path_state == PATH_UP) {
 			condlog(1, "%s: failed to get sysfs uid: %s",
 				pp->dev, strerror(-len));
@@ -1787,7 +1786,6 @@ static ssize_t uid_fallback(struct path *pp, int path_state,
 			len = WWID_SIZE;
 		}
 		*origin = "sysfs";
-		pp->uid_attribute = NULL;
 	}
 	return len;
 }
@@ -1800,12 +1798,14 @@ static int has_uid_fallback(struct path *pp)
 }
 
 int
-get_uid (struct path * pp, int path_state, struct udev_device *udev)
+get_uid (struct path * pp, int path_state, struct udev_device *udev,
+	 int allow_fallback)
 {
 	char *c;
 	const char *origin = "unknown";
 	ssize_t len = 0;
 	struct config *conf;
+	int used_fallback = 0;
 
 	if (!pp->uid_attribute && !pp->getuid) {
 		conf = get_multipath_config();
@@ -1846,14 +1846,9 @@ get_uid (struct path * pp, int path_state, struct udev_device *udev)
 			len = get_vpd_uid(pp);
 			origin = "sysfs";
 		}
-		if (len <= 0 && has_uid_fallback(pp)) {
-			int retrigger_tries;
-
-			conf = get_multipath_config();
-			retrigger_tries = conf->retrigger_tries;
-			put_multipath_config(conf);
-			if (pp->retriggers >= retrigger_tries)
-				len = uid_fallback(pp, path_state, &origin);
+		if (len <= 0 && allow_fallback && has_uid_fallback(pp)) {
+			used_fallback = 1;
+			len = uid_fallback(pp, path_state, &origin);
 		}
 	}
 	if ( len < 0 ) {
@@ -1870,7 +1865,7 @@ get_uid (struct path * pp, int path_state, struct udev_device *udev)
 			c--;
 		}
 	}
-	condlog(3, "%s: uid = %s (%s)", pp->dev,
+	condlog((used_fallback)? 1 : 3, "%s: uid = %s (%s)", pp->dev,
 		*pp->wwid == '\0' ? "<empty>" : pp->wwid, origin);
 	return 0;
 }
@@ -1994,7 +1989,8 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
 	}
 
 	if ((mask & DI_WWID) && !strlen(pp->wwid)) {
-		get_uid(pp, path_state, pp->udev);
+		get_uid(pp, path_state, pp->udev,
+			(pp->retriggers >= conf->retrigger_tries));
 		if (!strlen(pp->wwid)) {
 			if (pp->bus == SYSFS_BUS_UNDEF)
 				return PATHINFO_SKIPPED;
diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h
index 9aacf75..8fd126b 100644
--- a/libmultipath/discovery.h
+++ b/libmultipath/discovery.h
@@ -52,7 +52,8 @@ ssize_t sysfs_get_vpd (struct udev_device * udev, int pg, unsigned char * buff,
 		       size_t len);
 int sysfs_get_asymmetric_access_state(struct path *pp,
 				      char *buff, int buflen);
-int get_uid(struct path * pp, int path_state, struct udev_device *udev);
+int get_uid(struct path * pp, int path_state, struct udev_device *udev,
+	    int allow_fallback);
 
 /*
  * discovery bitmask
diff --git a/multipathd/main.c b/multipathd/main.c
index e4f95a0..1413c6d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1227,7 +1227,7 @@ uev_update_path (struct uevent *uev, struct vectors * vecs)
 			goto out;
 
 		strcpy(wwid, pp->wwid);
-		rc = get_uid(pp, pp->state, uev->udev);
+		rc = get_uid(pp, pp->state, uev->udev, 0);
 
 		if (rc != 0)
 			strcpy(pp->wwid, wwid);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 17/17] libmultipath: silence dm_is_mpath error messages
  2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
                   ` (15 preceding siblings ...)
  2019-03-30  6:06 ` [PATCH v3 16/17] multipathd: Don't use fallback code after getting wwid Benjamin Marzinski
@ 2019-03-30  6:06 ` Benjamin Marzinski
  2019-04-01 14:27   ` Martin Wilck
  16 siblings, 1 reply; 24+ messages in thread
From: Benjamin Marzinski @ 2019-03-30  6:06 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

When "multipath -F" is run, dm_is_mpath was printing error messages
about partition devices, because they had already been removed, when
it checked.  Lower the error logging level so this doesn't happen on
the default verbosity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/devmapper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index 3294bd4..2e79667 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -746,7 +746,7 @@ out_task:
 	dm_task_destroy(dmt);
 out:
 	if (r < 0)
-		condlog(2, "%s: dm command failed in %s", name, __FUNCTION__);
+		condlog(3, "%s: dm command failed in %s: %s", name, __FUNCTION__, strerror(errno));
 	return r;
 }
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 10/17] multipath.conf: add missing options to man page
  2019-03-30  6:05 ` [PATCH v3 10/17] multipath.conf: add missing options to man page Benjamin Marzinski
@ 2019-04-01 13:11   ` Martin Wilck
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Wilck @ 2019-04-01 13:11 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Sat, 2019-03-30 at 01:05 -0500, Benjamin Marzinski wrote:
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 12/17] libmulitpath: cleanup uid_fallback code
  2019-03-30  6:06 ` [PATCH v3 12/17] libmulitpath: cleanup uid_fallback code Benjamin Marzinski
@ 2019-04-01 13:26   ` Martin Wilck
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Wilck @ 2019-04-01 13:26 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Sat, 2019-03-30 at 01:06 -0500, Benjamin Marzinski wrote:
> Instead of always calling uid_fallback() if the configured method to
> get
> the uid failed, get_uid now checks if the path supports fallbacks and
> if
> all the retriggers have occurred. If so, it calls uid_fallback(),
> which
> just attempts to get the uid using the appropriate fallback method.
> None
> of these changes should make the code function any differently.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c | 85 ++++++++++++++++++++++--------------
> ----
>  1 file changed, 46 insertions(+), 39 deletions(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index bece67c..3ec60d6 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -1755,50 +1755,50 @@ get_vpd_uid(struct path * pp)
>  }
>  
>  static ssize_t uid_fallback(struct path *pp, int path_state,
> -			    const char **origin, ssize_t old_len)
> +			    const char **origin)
>  {
> -	ssize_t len = old_len;
> -	int retrigger;
> -	struct config *conf;
> -
> -	conf = get_multipath_config();
> -	retrigger = conf->retrigger_tries;
> -	put_multipath_config(conf);
> -	if (pp->retriggers >= retrigger) {
> -		if (pp->bus == SYSFS_BUS_SCSI &&
> -		    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE))
> {
> -			len = get_vpd_uid(pp);
> -			*origin = "sysfs";
> -			pp->uid_attribute = NULL;
> -			if (len < 0 && path_state == PATH_UP) {
> -				condlog(1, "%s: failed to get sysfs
> uid: %s",
> -					pp->dev, strerror(-len));
> -				len = get_vpd_sgio(pp->fd, 0x83, pp-
> >wwid,
> +	ssize_t len = -1;
> +
> +	if (pp->bus == SYSFS_BUS_SCSI &&
> +	    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {

Even the check for DEFAULT_UID_ATTRIBUTE might be moved to get_uid().
But we can do that later.

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 13/17] multipathd: handle changed wwids by removal and addition
  2019-03-30  6:06 ` [PATCH v3 13/17] multipathd: handle changed wwids by removal and addition Benjamin Marzinski
@ 2019-04-01 13:31   ` Martin Wilck
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Wilck @ 2019-04-01 13:31 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Sat, 2019-03-30 at 01:06 -0500, Benjamin Marzinski wrote:
> If a path's WWID changes, it's not necessarily failed. But it
> certainly
> has to be removed from an existing map, otherwise data corruption is
> imminent. Instead of keeping the path in the map, failing it, and
> remembering the "changed WWID" state, this patch simply removes and
> re-adds the path.
> 
> This is patch is heavily based on the previous patch of the same name
> by Martin Wilck.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 16/17] multipathd: Don't use fallback code after getting wwid
  2019-03-30  6:06 ` [PATCH v3 16/17] multipathd: Don't use fallback code after getting wwid Benjamin Marzinski
@ 2019-04-01 14:21   ` Martin Wilck
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Wilck @ 2019-04-01 14:21 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Sat, 2019-03-30 at 01:06 -0500, Benjamin Marzinski wrote:
> The fallback code is necessary to set up mutipath devices, if
> multipath
> temporarily can't get the information from udev.  However, once the
> devices are set up, udev is the definitive source of this
> information.
> 
> The wwid gotten from the fallback code and the udev code should
> always
> be the same, in which case it doesn't matter where we get the wwid
> from. But if they are different, it's important to try to do the
> right thing.
> 
> Working under the assumption that udev will either never give us this
> information, or that it usually will. multipath should assume that if
> there are multiple paths to a device, either they will all never get
> a wwid from udev, or some of them will likely already have gotten the
> correct wwid from udev.  In this case, we should fix this as soon as
> possible.
> 
> This does mean that devices where udev will never give out the uuid
> will not notice if the wwid changes, but that's a small price to pay
> for doing the right thing most of the time.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c | 22 +++++++++-------------
>  libmultipath/discovery.h |  3 ++-
>  multipathd/main.c        |  2 +-
>  3 files changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index 3ec60d6..744cf2c 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -1763,7 +1763,6 @@ static ssize_t uid_fallback(struct path *pp,
> int path_state,
>  	    !strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE)) {
>  		len = get_vpd_uid(pp);
>  		*origin = "sysfs";
> -		pp->uid_attribute = NULL;

Hm, that makes me realize that the previous code was broken, as we
used strcmp(pp->uid_attribute, DEFAULT_UID_ATTRIBUTE) elsewhere without
checking if uid_attribute was NULL ...

@@ -1846,14 +1846,9 @@ get_uid (struct path * pp, int path_state,
> struct udev_device *udev)
>  			len = get_vpd_uid(pp);
>  			origin = "sysfs";

On the premise that "udev rules", maybe we should also remove the above
code, which is nothing but yet another fallback?

But that can be done on top of your set.

Reviewed-by: Martin Wilck <mwilck@suse.com>
-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 17/17] libmultipath: silence dm_is_mpath error messages
  2019-03-30  6:06 ` [PATCH v3 17/17] libmultipath: silence dm_is_mpath error messages Benjamin Marzinski
@ 2019-04-01 14:27   ` Martin Wilck
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Wilck @ 2019-04-01 14:27 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Sat, 2019-03-30 at 01:06 -0500, Benjamin Marzinski wrote:
> When "multipath -F" is run, dm_is_mpath was printing error messages
> about partition devices, because they had already been removed, when
> it checked.  Lower the error logging level so this doesn't happen on
> the default verbosity.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 07/17] multipathd: ignore failed wwid recheck
  2019-03-30  6:05 ` [PATCH v3 07/17] multipathd: ignore failed wwid recheck Benjamin Marzinski
@ 2019-04-01 14:32   ` Martin Wilck
  0 siblings, 0 replies; 24+ messages in thread
From: Martin Wilck @ 2019-04-01 14:32 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Sat, 2019-03-30 at 01:05 -0500, Benjamin Marzinski wrote:
> If disable_changed_wwids is set, when multipathd gets a change event
> on
> a path, it verifies that the wwid hasn't changed in
> uev_update_path().
> If get_uid() failed, uev_update_path treated this as a wwid change to
> 0.
> This could cause paths to suddenly be dropped due to an issue with
> getting the wwid.  Even if get_uid() failed because the path was
> down,
> it no change uevent happend when it later became active, multipathd
> would continue to ignore the path. Also, scsi_uid_fallback() clears
> the
> failure return if it doesn't attempt to fallback, causing get_uid()
> to return success, when it actually failed.
> 
> Multipathd should neither set nor clear wwid_changed if get_uid()
> returned failure. Also, scsi_uid_fallback() should retain the old
> return
> value if it doesn't attempt to fallback.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>


-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-04-01 14:32 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-30  6:05 [PATCH v3 00/17] Misc Multipath patches Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 01/17] BZ 1668693: disable user_friendly_names for NetApp Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 02/17] libmultipath: handle existing paths in marginal_path enqueue Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 03/17] multipathd: cleanup marginal paths checking timers Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 04/17] libmultipath: fix marginal paths queueing errors Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 05/17] libmultipath: fix marginal_paths nr_active check Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 06/17] multipathd: Fix miscounting active paths Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 07/17] multipathd: ignore failed wwid recheck Benjamin Marzinski
2019-04-01 14:32   ` Martin Wilck
2019-03-30  6:05 ` [PATCH v3 08/17] libmutipath: continue to use old state on PATH_PENDING Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 09/17] multipathd: use update_path_groups instead of reload_map Benjamin Marzinski
2019-03-30  6:05 ` [PATCH v3 10/17] multipath.conf: add missing options to man page Benjamin Marzinski
2019-04-01 13:11   ` Martin Wilck
2019-03-30  6:06 ` [PATCH v3 11/17] libmultipath: add get_uid fallback code for NVMe devices Benjamin Marzinski
2019-03-30  6:06 ` [PATCH v3 12/17] libmulitpath: cleanup uid_fallback code Benjamin Marzinski
2019-04-01 13:26   ` Martin Wilck
2019-03-30  6:06 ` [PATCH v3 13/17] multipathd: handle changed wwids by removal and addition Benjamin Marzinski
2019-04-01 13:31   ` Martin Wilck
2019-03-30  6:06 ` [PATCH v3 14/17] multipathd: remove "wwid_changed" path attribute Benjamin Marzinski
2019-03-30  6:06 ` [PATCH v3 15/17] multipathd: ignore "disable_changed_wwids" Benjamin Marzinski
2019-03-30  6:06 ` [PATCH v3 16/17] multipathd: Don't use fallback code after getting wwid Benjamin Marzinski
2019-04-01 14:21   ` Martin Wilck
2019-03-30  6:06 ` [PATCH v3 17/17] libmultipath: silence dm_is_mpath error messages Benjamin Marzinski
2019-04-01 14:27   ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.