dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
* [dm-devel] [PATCH 0/2] Handle remapped LUNs better
@ 2021-02-09  5:19 Benjamin Marzinski
  2021-02-09  5:19 ` [dm-devel] [PATCH 1/2] libmultipath: fix use-after-free in uev_add_path Benjamin Marzinski
  2021-02-09  5:19 ` [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid Benjamin Marzinski
  0 siblings, 2 replies; 13+ messages in thread
From: Benjamin Marzinski @ 2021-02-09  5:19 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Chongyun Wu, Martin Wilck

This patchset adds a new config option, recheck_wwid_time, to help deal
with devices getting remapped. It's based on Chongyun's patch, but
instead of always checking if the LUN is remapped, users can set how
many seconds the LUN must be down before it gets rechecked, or disable
this checking entirely, since it is simply there to keep users from
shooting themselves in the foot.  Setting the value to 0 makes this
always recheck when a path is restored.

Unlike Chongyun's patch, it doesn't issue a remove uevent. I'm not
actually sure what the purpose of the remove uevent was, since it
doesn't cause the path to be removed, and multipath already removed it.
Instead, it works like Martin's uev_update_path code, and re-adds the
path after it removes it.  There is one issue I noticed here.  While
udev will update the WWID in the database when it gets a new uevent,
sysfs will still record the original WWID, so if multipathd ever fails
back to using sysfs for the WWID, it will see the wrong value.

Another possible check that I didn't include is this patchset is to have
multipath look at the LUN number whenever it adds a new path to a
multipath device. If the LUN number doesn't match the existing LUN
numbers, all the old paths should have their wwids checked as soon as
possible (immediately, if the path is up), since mismatching LUN numbers
is a red flag that something has gone wrong.

Benjamin Marzinski (2):
  libmultipath: fix use-after-free in uev_add_path
  multipathd: add recheck_wwid_time option to verify the path wwid

 libmultipath/config.c             |   1 +
 libmultipath/config.h             |   1 +
 libmultipath/configure.c          |   4 +-
 libmultipath/configure.h          |   2 +
 libmultipath/defaults.h           |   1 +
 libmultipath/dict.c               |  36 ++++++++++
 libmultipath/libmultipath.version |   6 ++
 libmultipath/structs.h            |  10 +++
 multipath/multipath.conf.5        |  18 +++++
 multipathd/cli_handlers.c         |   9 +++
 multipathd/main.c                 | 107 +++++++++++++++++++++++++++---
 multipathd/main.h                 |   2 +
 12 files changed, 187 insertions(+), 10 deletions(-)

-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [dm-devel] [PATCH 1/2] libmultipath: fix use-after-free in uev_add_path
  2021-02-09  5:19 [dm-devel] [PATCH 0/2] Handle remapped LUNs better Benjamin Marzinski
@ 2021-02-09  5:19 ` Benjamin Marzinski
  2021-02-09 20:57   ` Martin Wilck
  2021-02-09  5:19 ` [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid Benjamin Marzinski
  1 sibling, 1 reply; 13+ messages in thread
From: Benjamin Marzinski @ 2021-02-09  5:19 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Chongyun Wu, Martin Wilck

if ev_remove_path() returns success the path has very likely been
deleted. However, if pathinfo() returned something besides PATHINFO_OK,
but ev_remove_path() succeeded, uev_add_path() was still accessing the
the path afterwards, which would likely cause a use-after-free error.
Insted, uev_add_path() should only continue to access the path if
ev_remove_path() didn't succeed.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 425492a9..19679848 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -890,13 +890,7 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 				 */
 				pp->mpp = prev_mpp;
 				ret = ev_remove_path(pp, vecs, true);
-				if (r == PATHINFO_OK && !ret)
-					/*
-					 * Path successfully freed, move on to
-					 * "new path" code path below
-					 */
-					pp = NULL;
-				else {
+				if (ret != 0) {
 					/*
 					 * Failure in ev_remove_path will keep
 					 * path in pathvec in INIT_REMOVED state
@@ -907,7 +901,12 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 					dm_fail_path(pp->mpp->alias, pp->dev_t);
 					condlog(1, "%s: failed to re-add path still mapped in %s",
 						pp->dev, pp->mpp->alias);
-				}
+				} else if (r == PATHINFO_OK)
+					/*
+					 * Path successfully freed, move on to
+					 * "new path" code path below
+					 */
+					pp = NULL;
 			} else if (r == PATHINFO_SKIPPED) {
 				condlog(3, "%s: remove blacklisted path",
 					uev->kernel);
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-09  5:19 [dm-devel] [PATCH 0/2] Handle remapped LUNs better Benjamin Marzinski
  2021-02-09  5:19 ` [dm-devel] [PATCH 1/2] libmultipath: fix use-after-free in uev_add_path Benjamin Marzinski
@ 2021-02-09  5:19 ` Benjamin Marzinski
  2021-02-09 22:19   ` Martin Wilck
  1 sibling, 1 reply; 13+ messages in thread
From: Benjamin Marzinski @ 2021-02-09  5:19 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Chongyun Wu, Martin Wilck

There are cases where the wwid of a path changes due to LUN remapping
without triggering uevent for the changed path. Multipathd has no method
for trying to catch these cases, and corruption has resulted because of
it.

In order to have a better chance at catching these cases, multipath now
has a recheck_wwid_time option, which can either be set to "off" or a
number of seconds. If a path is failed for equal to or greater than the
configured number of seconds, multipathd will recheck its wwid before
restoring it, when the path checker sees that it has come back up. If
multipathd notices that a path's wwid has changed it will remove and
re-add the path, just like the existing wwid checking code for change
events does.  In cases where the no uevent occurs, both the udev
database entry and sysfs will have the old wwid, so the only way to get
a current wwid is to ask the device directly. Currently multipath only
has code to directly get the wwid for scsi devices, so this option only
effects scsi devices. Also, since it's possible the the udev wwid won't
match the wwid from get_vpd_sgio(), if multipathd doesn't initially see
the two values matching for a device, it will disable this option for
that device.

If recheck_wwid_time is not turned off, multipathd will also
automatically recheck the wwid whenever an existing path gets a add
event, or is manually re-added with cli_add_path().

Co-developed-by: Chongyun Wu <wucy11@chinatelecom.cn>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c             |  1 +
 libmultipath/config.h             |  1 +
 libmultipath/configure.c          |  4 +-
 libmultipath/configure.h          |  2 +
 libmultipath/defaults.h           |  1 +
 libmultipath/dict.c               | 36 ++++++++++++
 libmultipath/libmultipath.version |  6 ++
 libmultipath/structs.h            | 10 ++++
 multipath/multipath.conf.5        | 18 ++++++
 multipathd/cli_handlers.c         |  9 +++
 multipathd/main.c                 | 92 +++++++++++++++++++++++++++++++
 multipathd/main.h                 |  2 +
 12 files changed, 180 insertions(+), 2 deletions(-)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index be310159..c79ebbeb 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -867,6 +867,7 @@ int _init_config (const char *file, struct config *conf)
 	conf->remove_retries = 0;
 	conf->ghost_delay = DEFAULT_GHOST_DELAY;
 	conf->all_tg_pt = DEFAULT_ALL_TG_PT;
+	conf->recheck_wwid_time = DEFAULT_RECHECK_WWID;
 	/*
 	 * preload default hwtable
 	 */
diff --git a/libmultipath/config.h b/libmultipath/config.h
index 9ce37f16..02ae2407 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -187,6 +187,7 @@ struct config {
 	int marginal_pathgroups;
 	int skip_delegate;
 	unsigned int sequence_nr;
+	int recheck_wwid_time;
 
 	char * multipath_dir;
 	char * selector;
diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index 598efe05..6ca1f4bb 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -527,8 +527,8 @@ get_udev_for_mpp(const struct multipath *mpp)
 	return udd;
 }
 
-static void trigger_partitions_udev_change(struct udev_device *dev,
-					   const char *action, int len)
+void trigger_partitions_udev_change(struct udev_device *dev,
+				    const char *action, int len)
 {
 	struct udev_enumerate *part_enum;
 	struct udev_list_entry *item;
diff --git a/libmultipath/configure.h b/libmultipath/configure.h
index 6b23ccbb..70cf77a3 100644
--- a/libmultipath/configure.h
+++ b/libmultipath/configure.h
@@ -58,3 +58,5 @@ int get_refwwid (enum mpath_cmds cmd, const char *dev, enum devtypes dev_type,
 		 vector pathvec, char **wwid);
 struct udev_device *get_udev_device(const char *dev, enum devtypes dev_type);
 void trigger_paths_udev_change(struct multipath *mpp, bool is_mpath);
+void trigger_partitions_udev_change(struct udev_device *dev, const char *action,
+				    int len);
diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h
index 947ba467..d24b69a0 100644
--- a/libmultipath/defaults.h
+++ b/libmultipath/defaults.h
@@ -55,6 +55,7 @@
 #define DEFAULT_ALL_TG_PT ALL_TG_PT_OFF
 /* Enable no foreign libraries by default */
 #define DEFAULT_ENABLE_FOREIGN "NONE"
+#define DEFAULT_RECHECK_WWID RECHECK_WWID_OFF
 
 #define CHECKINT_UNDEF		UINT_MAX
 #define DEFAULT_CHECKINT	5
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index bab96146..85782ea3 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -1401,6 +1401,41 @@ declare_hw_snprint(all_tg_pt, print_yes_no_undef)
 declare_def_handler(marginal_pathgroups, set_yes_no)
 declare_def_snprint(marginal_pathgroups, print_yes_no)
 
+static int
+set_recheck_wwid_time(vector strvec, void *ptr)
+{
+	int *int_ptr = (int *)ptr;
+	char * buff;
+
+	buff = set_value(strvec);
+	if (!buff)
+		return 1;
+
+	if (!strcmp(buff, "no") || !strcmp(buff, "off"))
+		*int_ptr = RECHECK_WWID_OFF;
+	else if (sscanf(buff, "%d", int_ptr) != 1 || *int_ptr < 0)
+		*int_ptr = DEFAULT_RECHECK_WWID;
+
+	FREE(buff);
+	return 0;
+}
+
+int
+print_recheck_wwid_time(char * buff, int len, long v)
+{
+	switch(v) {
+	case RECHECK_WWID_OFF:
+		return snprintf(buff, len, "\"off\"");
+	default:
+		return snprintf(buff, len, "%li", v);
+	}
+}
+
+
+
+declare_def_handler(recheck_wwid_time, set_recheck_wwid_time)
+declare_def_snprint(recheck_wwid_time, print_recheck_wwid_time)
+
 static int
 def_uxsock_timeout_handler(struct config *conf, vector strvec)
 {
@@ -1819,6 +1854,7 @@ init_keywords(vector keywords)
 	install_keyword("enable_foreign", &def_enable_foreign_handler,
 			&snprint_def_enable_foreign);
 	install_keyword("marginal_pathgroups", &def_marginal_pathgroups_handler, &snprint_def_marginal_pathgroups);
+	install_keyword("recheck_wwid_time", &def_recheck_wwid_time_handler, &snprint_def_recheck_wwid_time);
 	__deprecated install_keyword("default_selector", &def_selector_handler, NULL);
 	__deprecated install_keyword("default_path_grouping_policy", &def_pgpolicy_handler, NULL);
 	__deprecated install_keyword("default_uid_attribute", &def_uid_attribute_handler, NULL);
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 2228f4ec..e9b4608f 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -280,3 +280,9 @@ LIBMULTIPATH_4.4.0 {
 global:
 	get_next_string;
 } LIBMULTIPATH_4.3.0;
+
+LIBMULITIPATH_4.5.0 {
+global:
+	get_vpd_sgio;
+	trigger_partitions_udev_change;
+} LIBMULTIPATH_4.4.0;
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index d6ff6762..357fdc24 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -38,6 +38,7 @@
 #define NO_PATH_RETRY_FAIL	-1
 #define NO_PATH_RETRY_QUEUE	-2
 
+#define RECHECK_WWID_OFF	-1
 
 enum free_path_mode {
 	KEEP_PATHS,
@@ -242,6 +243,13 @@ enum eh_deadline_states {
 	EH_DEADLINE_ZERO = UOZ_ZERO,
 };
 
+
+enum allow_wwid_recheck_states {
+	ALLOW_WWID_RECHECK_UNSET,
+	ALLOW_WWID_RECHECK_OFF,
+	ALLOW_WWID_RECHECK_ON,
+};
+
 struct vpd_vendor_page {
 	int pg;
 	const char *name;
@@ -316,6 +324,8 @@ struct path {
 	int find_multipaths_timeout;
 	int marginal;
 	int vpd_vendor_id;
+	int failed_time;
+	int allow_wwid_recheck;
 	/* configlet pointers */
 	vector hwe;
 	struct gen_path generic_path;
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index 8ef3a747..50409d57 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -1273,6 +1273,24 @@ The default is: \fB\(dqNONE\(dq\fR
 .RE
 .
 .
+.TP
+.B recheck_wwid_time
+Sets the number of seconds that a path device can be failed for, before
+multipathd rechecks its wwid when it is restored. If the wwid has changed,
+the path is removed from the current multipath device, and re-added as a new
+path. If set to \fIoff\fR these rechecks are disabled. If set to \fI0\fR,
+multipath will always recheck the path's wwid whenever it is restored. If
+rechecks are enabled, they will also happen if an \fIadd\fR uevent occurs
+for an existing path, or an existing path is manually re-added. These rechecks
+will only happen for scsi devices. \fBNote:\fR When the path is originally
+added, if the path's configured wwid doesn't match the wwid retrieved directly
+from the scsi device, rechecks will be disabled for the device.
+.RS
+.TP
+The default is \fBoff\fR
+.RE
+.
+.
 
 .
 .\" ----------------------------------------------------------------------------
diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
index 54635738..970d5e21 100644
--- a/multipathd/cli_handlers.c
+++ b/multipathd/cli_handlers.c
@@ -715,6 +715,15 @@ cli_add_path (void * v, char ** reply, int * len, void * data)
 	pp = find_path_by_dev(vecs->pathvec, param);
 	if (pp && pp->initialized != INIT_REMOVED) {
 		condlog(2, "%s: path already in pathvec", param);
+
+		if (pp->allow_wwid_recheck == ALLOW_WWID_RECHECK_ON &&
+		    check_path_wwid_change(pp)) {
+			condlog(0, "%s: wwid changed. Removing device",
+				pp->dev);
+			handle_path_wwid_change(pp, vecs);
+			return 1;
+		}
+
 		if (pp->mpp)
 			return 0;
 	} else if (pp) {
diff --git a/multipathd/main.c b/multipathd/main.c
index 19679848..17eef3a4 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -823,6 +823,59 @@ ev_remove_map (char * devname, char * alias, int minor, struct vectors * vecs)
 	return flush_map(mpp, vecs, 0);
 }
 
+void
+handle_path_wwid_change(struct path *pp, struct vectors *vecs)
+{
+	const char *action = "add";
+	if (!pp || !pp->udev)
+		return;
+
+	sysfs_attr_set_value(pp->udev, "uevent", action, strlen(action));
+	trigger_partitions_udev_change(pp->udev, action, strlen(action));
+	if (ev_remove_path(pp, vecs, 1) != 0 && pp->mpp) {
+		pp->dmstate = PSTATE_FAILED;
+		dm_fail_path(pp->mpp->alias, pp->dev_t);
+	}
+}
+
+bool
+check_path_wwid_change(struct path *pp)
+{
+	char wwid[WWID_SIZE];
+	int len = 0;
+	char *c;
+
+	if (!strlen(pp->wwid) || pp->bus != SYSFS_BUS_SCSI)
+		return false;
+
+	/* Get the real fresh device wwid by sgio. sysfs still has old
+	 * data, so only get_vpd_sgio will work to get the new wwid */
+	len = get_vpd_sgio(pp->fd, 0x83, 0, wwid, WWID_SIZE);
+
+	if (len <= 0) {
+		condlog(2, "%s: failed to check wwid by sgio: len = %d",
+			pp->dev, len);
+		return false;
+	}
+
+	/*Strip any trailing blanks */
+	c = strchr(wwid, '\0');
+	c--;
+	while (c && c >= wwid && *c == ' ') {
+		*c = '\0';
+		c--;
+	}
+	condlog(4, "%s: Got wwid %s by sgio", pp->dev, wwid);
+
+	if (strncmp(wwid, pp->wwid, WWID_SIZE)) {
+		condlog(0, "%s: wwid '%s' doesn't match wwid '%s' from device",
+			pp->dev, pp->wwid, wwid);
+		return true;
+	}
+
+	return false;
+}
+
 static int
 uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 {
@@ -919,6 +972,21 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 					uev->kernel);
 				ret = 1;
 			}
+		} else if (pp->allow_wwid_recheck == ALLOW_WWID_RECHECK_ON &&
+			   check_path_wwid_change(pp)) {
+			condlog(2, "%s wwid change detected when processing add uevent. removing path", pp->dev);
+			/*
+			 * don't use handle_path_wwid_change here,
+			 * since that would just trigger another add
+			 * uevent
+			 */
+			ret = ev_remove_path(pp, vecs, true);
+			if (ret == 0)
+				pp = NULL;
+			else if (pp->mpp) {
+				pp->dmstate = PSTATE_FAILED;
+				dm_fail_path(pp->mpp->alias, pp->dev_t);
+			}
 		}
 	}
 	if (pp)
@@ -2049,6 +2117,7 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 	unsigned int checkint, max_checkint;
 	struct config *conf;
 	int marginal_pathgroups, marginal_changed = 0;
+	int recheck_wwid_time;
 	int ret;
 
 	if (((pp->initialized == INIT_OK ||
@@ -2066,6 +2135,7 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 	checkint = conf->checkint;
 	max_checkint = conf->max_checkint;
 	marginal_pathgroups = conf->marginal_pathgroups;
+	recheck_wwid_time = conf->recheck_wwid_time;
 	put_multipath_config(conf);
 
 	if (pp->checkint == CHECKINT_UNDEF) {
@@ -2197,6 +2267,26 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 		return 0;
 	set_no_path_retry(pp->mpp);
 
+	if (recheck_wwid_time != RECHECK_WWID_OFF &&
+	    (newstate == PATH_UP || newstate == PATH_GHOST)) {
+		if (pp->allow_wwid_recheck == ALLOW_WWID_RECHECK_UNSET) {
+			if (check_path_wwid_change(pp))
+				pp->allow_wwid_recheck = ALLOW_WWID_RECHECK_OFF;
+			else
+				pp->allow_wwid_recheck = ALLOW_WWID_RECHECK_ON;
+		} else if (((pp->state != PATH_UP && pp->state != PATH_GHOST) ||
+			    pp->dmstate == PSTATE_FAILED) &&
+			   pp->failed_time >= recheck_wwid_time &&
+			   pp->allow_wwid_recheck == ALLOW_WWID_RECHECK_ON &&
+			   check_path_wwid_change(pp)) {
+			condlog(0, "%s: path wwid change detected. Removing",
+				pp->dev);
+			handle_path_wwid_change(pp, vecs);
+			return 0;
+		}
+		pp->failed_time = 0;
+	}
+
 	if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
 	    (san_path_check_enabled(pp->mpp) ||
 	     marginal_path_check_enabled(pp->mpp))) {
@@ -2330,6 +2420,8 @@ check_path (struct vectors * vecs, struct path * pp, unsigned int ticks)
 		if (newstate == PATH_DOWN) {
 			int log_checker_err;
 
+			if (recheck_wwid_time != RECHECK_WWID_OFF)
+				pp->failed_time += pp->checkint;
 			conf = get_multipath_config();
 			log_checker_err = conf->log_checker_err;
 			put_multipath_config(conf);
diff --git a/multipathd/main.h b/multipathd/main.h
index 5abbe97b..ddd953f9 100644
--- a/multipathd/main.h
+++ b/multipathd/main.h
@@ -50,4 +50,6 @@ int update_multipath (struct vectors *vecs, char *mapname, int reset);
 int reload_and_sync_map(struct multipath *mpp, struct vectors *vecs,
 			int refresh);
 
+void handle_path_wwid_change(struct path *pp, struct vectors *vecs);
+bool check_path_wwid_change(struct path *pp);
 #endif /* MAIN_H */
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 1/2] libmultipath: fix use-after-free in uev_add_path
  2021-02-09  5:19 ` [dm-devel] [PATCH 1/2] libmultipath: fix use-after-free in uev_add_path Benjamin Marzinski
@ 2021-02-09 20:57   ` Martin Wilck
  0 siblings, 0 replies; 13+ messages in thread
From: Martin Wilck @ 2021-02-09 20:57 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel, wucy11

On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> if ev_remove_path() returns success the path has very likely been
> deleted. However, if pathinfo() returned something besides
> PATHINFO_OK,
> but ev_remove_path() succeeded, uev_add_path() was still accessing
> the
> the path afterwards, which would likely cause a use-after-free error.
> Insted, uev_add_path() should only continue to access the path if
> ev_remove_path() didn't succeed.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>


-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-09  5:19 ` [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid Benjamin Marzinski
@ 2021-02-09 22:19   ` Martin Wilck
  2021-02-10 18:09     ` Benjamin Block
                       ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Martin Wilck @ 2021-02-09 22:19 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel, wucy11

On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> There are cases where the wwid of a path changes due to LUN remapping
> without triggering uevent for the changed path. Multipathd has no
> method
> for trying to catch these cases, and corruption has resulted because
> of
> it.
> 
> In order to have a better chance at catching these cases, multipath
> now
> has a recheck_wwid_time option, which can either be set to "off" or a
> number of seconds. If a path is failed for equal to or greater than
> the
> configured number of seconds, multipathd will recheck its wwid before
> restoring it, when the path checker sees that it has come back up.

Can't the WWID change also happen without the path going offline, or
at least without being offline long enough that multipathd would
notice?

>  If
> multipathd notices that a path's wwid has changed it will remove and
> re-add the path, just like the existing wwid checking code for change
> events does.  In cases where the no uevent occurs, both the udev
> database entry and sysfs will have the old wwid, so the only way to
> get
> a current wwid is to ask the device directly. 

sysfs is wrong too, really? In that case I fear triggering an uevent
won't fix the situation. You need to force the kernel to rescan the
device, otherwise udev will fetch the WWID from sysfs again, which
still has the wrong ID... or what am I missing here?

> > Currently multipath only
> has code to directly get the wwid for scsi devices, so this option
> only
> effects scsi devices. Also, since it's possible the the udev wwid
> won't
> match the wwid from get_vpd_sgio(), if multipathd doesn't initially
> see
> the two values matching for a device, it will disable this option for
> that device.
> 
> If recheck_wwid_time is not turned off, multipathd will also
> automatically recheck the wwid whenever an existing path gets a add
> event, or is manually re-added with cli_add_path().
> 
> Co-developed-by: Chongyun Wu <wucy11@chinatelecom.cn>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

I am uncertain about this.

We get one more configuration option that defaults to off and that only
the truly inaugurated will understand and use. And even those will not
know how to set the recheck time. Should it be 1s, 10, or 100? We
already have too many of these options in multipath-tools. We shy away
from giving users reasonable defaults, with the result that most people
won't bother.

I generally don't understand what the UP/DOWN state has to do with
this. If the WWID can change without any events seen by either the
kernel or user space, why would the path go down and up again? And even
if so, why would it matter how long the device remained down?

But foremost, do we really have to try to deal with configuration
mistakes as blatant as this? What if a user sets the same WWID for
different devices, or re-uses the same WWID on different storage
servers? I already hesitated about the code I added myself for catching
user errors in the WWIDs file, but this seems even more far-fetched.

Please convince me.

This said, I'd like to understand why there are no events in these
cases. I'd have thought we'd at least get a UNIT ATTENTION (REPORTED
LUNS DATA HAS CHANGED), which would have caused a uevent. If there was
no UNIT ATTENTION, I'd blame the storage side. 

Maybe we need to monitor scsi_device uevents.

Technical remarks below.


> ---
>  libmultipath/config.c             |  1 +
>  libmultipath/config.h             |  1 +
>  libmultipath/configure.c          |  4 +-
>  libmultipath/configure.h          |  2 +
>  libmultipath/defaults.h           |  1 +
>  libmultipath/dict.c               | 36 ++++++++++++
>  libmultipath/libmultipath.version |  6 ++
>  libmultipath/structs.h            | 10 ++++
>  multipath/multipath.conf.5        | 18 ++++++
>  multipathd/cli_handlers.c         |  9 +++
>  multipathd/main.c                 | 92
> +++++++++++++++++++++++++++++++
>  multipathd/main.h                 |  2 +
>  12 files changed, 180 insertions(+), 2 deletions(-)
>  .
>  .\" ----------------------------------------------------------------
> ------------
> diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
> index 54635738..970d5e21 100644
> --- a/multipathd/cli_handlers.c
> +++ b/multipathd/cli_handlers.c
> @@ -715,6 +715,15 @@ cli_add_path (void * v, char ** reply, int *
> len, void * data)
>         pp = find_path_by_dev(vecs->pathvec, param);
>         if (pp && pp->initialized != INIT_REMOVED) {
>                 condlog(2, "%s: path already in pathvec", param);
> +
> +               if (pp->allow_wwid_recheck == ALLOW_WWID_RECHECK_ON
> &&
> +                   check_path_wwid_change(pp)) {
> +                       condlog(0, "%s: wwid changed. Removing
> device",
> +                               pp->dev);
> +                       handle_path_wwid_change(pp, vecs);
> +                       return 1;
> +               }
> +
>                 if (pp->mpp)
>                         return 0;
>         } else if (pp) {
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 19679848..17eef3a4 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -823,6 +823,59 @@ ev_remove_map (char * devname, char * alias, int
> minor, struct vectors * vecs)
>         return flush_map(mpp, vecs, 0);
>  }
>  
> +void
> +handle_path_wwid_change(struct path *pp, struct vectors *vecs)
> +{
> +       const char *action = "add";
> +       if (!pp || !pp->udev)
> +               return;
> +
> +       sysfs_attr_set_value(pp->udev, "uevent", action,
> strlen(action));
> +       trigger_partitions_udev_change(pp->udev, action,
> strlen(action));

Nit: it looks a bit weird to use a const char * variable for "action"
and a constant for "uevent".

> +       if (ev_remove_path(pp, vecs, 1) != 0 && pp->mpp) {
> +               pp->dmstate = PSTATE_FAILED;
> +               dm_fail_path(pp->mpp->alias, pp->dev_t);
> +       }

I suggest taking a ref on pp->udev, calling ev_remove_path(), and
triggering the uevent after that.

> +}
> +
> +bool
> +check_path_wwid_change(struct path *pp)
> +{
> +       char wwid[WWID_SIZE];
> +       int len = 0;
> +       char *c;
> +
> +       if (!strlen(pp->wwid) || pp->bus != SYSFS_BUS_SCSI)
> +               return false;

Maybe you should look at uid_attribute here, to be consistent with
has_uid_fallback()?

> +
> +       /* Get the real fresh device wwid by sgio. sysfs still has
> old
> +        * data, so only get_vpd_sgio will work to get the new wwid
> */
> +       len = get_vpd_sgio(pp->fd, 0x83, 0, wwid, WWID_SIZE);
> +
> +       if (len <= 0) {
> +               condlog(2, "%s: failed to check wwid by sgio: len =
> %d",
> +                       pp->dev, len);
> +               return false;
> +       }
> +
> +       /*Strip any trailing blanks */
> +       c = strchr(wwid, '\0');

Quite unusual, why not use strlen() or strnlen()?

> +       c--;
> +       while (c && c >= wwid && *c == ' ') {
> +               *c = '\0';
> +               c--;
> +       }

Nit: You don't have to nullify every space. Just the first one.


> +       condlog(4, "%s: Got wwid %s by sgio", pp->dev, wwid);
> +
> +       if (strncmp(wwid, pp->wwid, WWID_SIZE)) {
> +               condlog(0, "%s: wwid '%s' doesn't match wwid '%s'
> from device",
> +                       pp->dev, pp->wwid, wwid);
> +               return true;
> +       }
> +
> +       return false;
> +}
> +
>  static int
>  uev_add_path (struct uevent *uev, struct vectors * vecs, int
> need_do_map)
>  {
> @@ -919,6 +972,21 @@ uev_add_path (struct uevent *uev, struct vectors
> * vecs, int need_do_map)
>                                         uev->kernel);
>                                 ret = 1;
>                         }
> +               } else if (pp->allow_wwid_recheck ==
> ALLOW_WWID_RECHECK_ON &&
> +                          check_path_wwid_change(pp)) {
> +                       condlog(2, "%s wwid change detected when
> processing add uevent. removing path", pp->dev);
> +                       /*
> +                        * don't use handle_path_wwid_change here,
> +                        * since that would just trigger another add
> +                        * uevent
> +                        */
> +                       ret = ev_remove_path(pp, vecs, true);
> +                       if (ret == 0)
> +                               pp = NULL;
> +                       else if (pp->mpp) {
> +                               pp->dmstate = PSTATE_FAILED;
> +                               dm_fail_path(pp->mpp->alias, pp-
> >dev_t);
> +                       }

What's the purpose of this code? Are you handling your own artificial
"add" event here, which you triggered before in
handle_path_wwid_change()? Or are there real cases where the kernel
would send an "add" event without sending a "remove" event first?

>                 }
>         }
>         if (pp)
> @@ -2049,6 +2117,7 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>         unsigned int checkint, max_checkint;
>         struct config *conf;
>         int marginal_pathgroups, marginal_changed = 0;
> +       int recheck_wwid_time;
>         int ret;
>  
>         if (((pp->initialized == INIT_OK ||
> @@ -2066,6 +2135,7 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>         checkint = conf->checkint;
>         max_checkint = conf->max_checkint;
>         marginal_pathgroups = conf->marginal_pathgroups;
> +       recheck_wwid_time = conf->recheck_wwid_time;
>         put_multipath_config(conf);
>  
>         if (pp->checkint == CHECKINT_UNDEF) {
> @@ -2197,6 +2267,26 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>                 return 0;
>         set_no_path_retry(pp->mpp);
>  
> +       if (recheck_wwid_time != RECHECK_WWID_OFF &&
> +           (newstate == PATH_UP || newstate == PATH_GHOST)) {
> +               if (pp->allow_wwid_recheck ==
> ALLOW_WWID_RECHECK_UNSET) {
> +                       if (check_path_wwid_change(pp))
> +                               pp->allow_wwid_recheck =
> ALLOW_WWID_RECHECK_OFF;
> +                       else
> +                               pp->allow_wwid_recheck =
> ALLOW_WWID_RECHECK_ON;

This is confusing. I pulled my hair over this code before I read your
man page hunk: "When the path is originally added, if the path's
configured wwid doesn't match the wwid retrieved directly
from the scsi device, rechecks will be disabled for the device."

So, I gather this is an alternative to the has_uid_fallback() logic
mentioned above. It deserves at least a comment here. Please consider
if using the same logic as has_uid_falback() isn't just as good as
this.

Btw, as we're already pretty much on corner-case ground here, if the
path fails quickly after being detected, a WWID change could have
occured already when it comes UP first, and this code is run for the
first time.


> +               } else if (((pp->state != PATH_UP && pp->state !=
> PATH_GHOST) ||
> +                           pp->dmstate == PSTATE_FAILED) &&
> +                          pp->failed_time >= recheck_wwid_time &&
> +                          pp->allow_wwid_recheck ==
> ALLOW_WWID_RECHECK_ON &&
> +                          check_path_wwid_change(pp)) {
> +                       condlog(0, "%s: path wwid change detected.
> Removing",
> +                               pp->dev);
> +                       handle_path_wwid_change(pp, vecs);
> +                       return 0;
> +               }
> +               pp->failed_time = 0;
> +       }
> +
>         if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
>             (san_path_check_enabled(pp->mpp) ||
>              marginal_path_check_enabled(pp->mpp))) {
> @@ -2330,6 +2420,8 @@ check_path (struct vectors * vecs, struct path
> * pp, unsigned int ticks)
>                 if (newstate == PATH_DOWN) {
>                         int log_checker_err;
>  
> +                       if (recheck_wwid_time != RECHECK_WWID_OFF)
> +                               pp->failed_time += pp->checkint;
>                         conf = get_multipath_config();
>                         log_checker_err = conf->log_checker_err;
>                         put_multipath_config(conf);
> diff --git a/multipathd/main.h b/multipathd/main.h
> index 5abbe97b..ddd953f9 100644
> --- a/multipathd/main.h
> +++ b/multipathd/main.h
> @@ -50,4 +50,6 @@ int update_multipath (struct vectors *vecs, char
> *mapname, int reset);
>  int reload_and_sync_map(struct multipath *mpp, struct vectors *vecs,
>                         int refresh);
>  
> +void handle_path_wwid_change(struct path *pp, struct vectors *vecs);
> +bool check_path_wwid_change(struct path *pp);
>  #endif /* MAIN_H */

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-09 22:19   ` Martin Wilck
@ 2021-02-10 18:09     ` Benjamin Block
  2021-02-10 19:57       ` Martin Wilck
  2021-02-11 11:25       ` Benjamin Block
  2021-02-11  4:48     ` Benjamin Marzinski
  2021-02-27  6:02     ` Benjamin Marzinski
  2 siblings, 2 replies; 13+ messages in thread
From: Benjamin Block @ 2021-02-10 18:09 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, wucy11

On Tue, Feb 09, 2021 at 10:19:45PM +0000, Martin Wilck wrote:
> On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> > There are cases where the wwid of a path changes due to LUN remapping
> > without triggering uevent for the changed path. Multipathd has no
> > method
> > for trying to catch these cases, and corruption has resulted because
> > of
> > it.
> > 
> > In order to have a better chance at catching these cases, multipath
> > now
> > has a recheck_wwid_time option, which can either be set to "off" or a
> > number of seconds. If a path is failed for equal to or greater than
> > the
> > configured number of seconds, multipathd will recheck its wwid before
> > restoring it, when the path checker sees that it has come back up.
> 
> Can't the WWID change also happen without the path going offline, or
> at least without being offline long enough that multipathd would
> notice?
> 
> >  If
> > multipathd notices that a path's wwid has changed it will remove and
> > re-add the path, just like the existing wwid checking code for change
> > events does.  In cases where the no uevent occurs, both the udev
> > database entry and sysfs will have the old wwid, so the only way to
> > get
> > a current wwid is to ask the device directly. 
> 
> sysfs is wrong too, really? In that case I fear triggering an uevent
> won't fix the situation. You need to force the kernel to rescan the
> device, otherwise udev will fetch the WWID from sysfs again, which
> still has the wrong ID... or what am I missing here?
> 
> > > Currently multipath only
> > has code to directly get the wwid for scsi devices, so this option
> > only
> > effects scsi devices. Also, since it's possible the the udev wwid
> > won't
> > match the wwid from get_vpd_sgio(), if multipathd doesn't initially
> > see
> > the two values matching for a device, it will disable this option for
> > that device.
> > 
> > If recheck_wwid_time is not turned off, multipathd will also
> > automatically recheck the wwid whenever an existing path gets a add
> > event, or is manually re-added with cli_add_path().
> > 
> > Co-developed-by: Chongyun Wu <wucy11@chinatelecom.cn>
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> 
> I am uncertain about this.
> 
> We get one more configuration option that defaults to off and that only
> the truly inaugurated will understand and use. And even those will not
> know how to set the recheck time. Should it be 1s, 10, or 100? We
> already have too many of these options in multipath-tools. We shy away
> from giving users reasonable defaults, with the result that most people
> won't bother.
> 
> I generally don't understand what the UP/DOWN state has to do with
> this. If the WWID can change without any events seen by either the
> kernel or user space, why would the path go down and up again? And even
> if so, why would it matter how long the device remained down?
> 
> But foremost, do we really have to try to deal with configuration
> mistakes as blatant as this? What if a user sets the same WWID for
> different devices, or re-uses the same WWID on different storage
> servers? I already hesitated about the code I added myself for catching
> user errors in the WWIDs file, but this seems even more far-fetched.
> 
> Please convince me.
> 
> This said, I'd like to understand why there are no events in these
> cases. I'd have thought we'd at least get a UNIT ATTENTION (REPORTED
> LUNS DATA HAS CHANGED), which would have caused a uevent. If there was
> no UNIT ATTENTION, I'd blame the storage side. 

Yeah, just for reference, I saw this happening in practice when
something with the LU mapping changed on IBM storage - IIRC I saw it
with capacity changes. You end up in this code in the kernel:
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_error.c?id=92bf22614b21a2706f4993b278017e437f7785b3#n416

And from there you ought to get an uevent for the sdev.

The WWID in sysfs might still be wrong though AFAIK. The kernel seems to
ignore the UA after it delivered the uevent.

> 
> Maybe we need to monitor scsi_device uevents.
> 
> Technical remarks below.
> 
> 

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-10 18:09     ` Benjamin Block
@ 2021-02-10 19:57       ` Martin Wilck
  2021-02-11 11:25       ` Benjamin Block
  1 sibling, 0 replies; 13+ messages in thread
From: Martin Wilck @ 2021-02-10 19:57 UTC (permalink / raw)
  To: bblock; +Cc: dm-devel, wucy11

On Wed, 2021-02-10 at 19:09 +0100, Benjamin Block wrote:
> 
> Yeah, just for reference, I saw this happening in practice when
> something with the LU mapping changed on IBM storage - IIRC I saw it
> with capacity changes. You end up in this code in the kernel:
>     
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_error.c?id=92bf22614b21a2706f4993b278017e437f7785b3#n416
> 
> And from there you ought to get an uevent for the sdev.
> 
> The WWID in sysfs might still be wrong though AFAIK. The kernel seems
> to
> ignore the UA after it delivered the uevent.

Right.

We could trigger a device rescan if such an event was received. Could
be done from udev rules, or even from multipathd itself.

Thanks,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-09 22:19   ` Martin Wilck
  2021-02-10 18:09     ` Benjamin Block
@ 2021-02-11  4:48     ` Benjamin Marzinski
  2021-02-11 12:14       ` Martin Wilck
  2021-02-27  6:02     ` Benjamin Marzinski
  2 siblings, 1 reply; 13+ messages in thread
From: Benjamin Marzinski @ 2021-02-11  4:48 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, wucy11

On Tue, Feb 09, 2021 at 10:19:45PM +0000, Martin Wilck wrote:
> On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> > There are cases where the wwid of a path changes due to LUN remapping
> > without triggering uevent for the changed path. Multipathd has no
> > method
> > for trying to catch these cases, and corruption has resulted because
> > of
> > it.
> > 
> > In order to have a better chance at catching these cases, multipath
> > now
> > has a recheck_wwid_time option, which can either be set to "off" or a
> > number of seconds. If a path is failed for equal to or greater than
> > the
> > configured number of seconds, multipathd will recheck its wwid before
> > restoring it, when the path checker sees that it has come back up.
> 
> Can't the WWID change also happen without the path going offline, or
> at least without being offline long enough that multipathd would
> notice?

Yes. There is no way to guarantee that you won't hit this issue. That's
why this is configurable and disableable. People get to choose how
likely they are to shoot themselves in the foot.

> 
> >  If
> > multipathd notices that a path's wwid has changed it will remove and
> > re-add the path, just like the existing wwid checking code for change
> > events does.  In cases where the no uevent occurs, both the udev
> > database entry and sysfs will have the old wwid, so the only way to
> > get
> > a current wwid is to ask the device directly. 
> 
> sysfs is wrong too, really? In that case I fear triggering an uevent
> won't fix the situation. You need to force the kernel to rescan the
> device, otherwise udev will fetch the WWID from sysfs again, which
> still has the wrong ID... or what am I missing here?

In the reproducer I posted using targetcli and FC devices, sysfs is
wrong.  That does make me a little leary about simply re-adding these
devices as new.  udev will run scsi_id to grab the new WWID and store
that in the udev database, but if we ever fail back to looking at sysfs,
we will still see the old data.  In general, having devices on the
system with bad sysfs data seems dangerous.

> > > Currently multipath only
> > has code to directly get the wwid for scsi devices, so this option
> > only
> > effects scsi devices. Also, since it's possible the the udev wwid
> > won't
> > match the wwid from get_vpd_sgio(), if multipathd doesn't initially
> > see
> > the two values matching for a device, it will disable this option for
> > that device.
> > 
> > If recheck_wwid_time is not turned off, multipathd will also
> > automatically recheck the wwid whenever an existing path gets a add
> > event, or is manually re-added with cli_add_path().
> > 
> > Co-developed-by: Chongyun Wu <wucy11@chinatelecom.cn>
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> 
> I am uncertain about this.
> 
> We get one more configuration option that defaults to off and that only
> the truly inaugurated will understand and use. And even those will not
> know how to set the recheck time. Should it be 1s, 10, or 100? We
> already have too many of these options in multipath-tools. We shy away
> from giving users reasonable defaults, with the result that most people
> won't bother.
> 
> I generally don't understand what the UP/DOWN state has to do with
> this. If the WWID can change without any events seen by either the
> kernel or user space, why would the path go down and up again? And even
> if so, why would it matter how long the device remained down?

A new LUN can't get remapped to a still-being-used LUN id. The LUN that
was previously mapped to that id must get unmapped first.  That will
cause IO and and checker commands to fail. So unless a LUN gets unmapped
from a LUN id, and a new one remapped to that id quick enough for no IO
and or checker commands to go to it, the path should go down. In the
case that spurred this development, the path was down for hours before a
new LUN was remapped to that ID.

> But foremost, do we really have to try to deal with configuration
> mistakes as blatant as this? What if a user sets the same WWID for
> different devices, or re-uses the same WWID on different storage
> servers? I already hesitated about the code I added myself for catching
> user errors in the WWIDs file, but this seems even more far-fetched.
> 
> Please convince me.

Err.. An important customer corrupted their data and while they admit
that they were at fault, it's hard for them to guarantee that something
like this won't happen in the future, and they asked if multipath could
do a better job of catching these sorts of mistakes. Obviously this is
more convincing when it's coming from your customer. But the fact still
stands that this has happened to multiple users even with our existing
code to catch this.

Since this isn't a problem that can always be fixed, the best we can do
is try to catch it before something bad happens.  If the path is
remapped before it goes down, then corruption can happen before there
is any possiblity of doing anything. That case is unsolvable. But if
the path does go down when the LUN is unmapped, then there is a chance
to catch this before damage is done.

Now, obviously if you don't set this to 0, then it's possible that the
path gets unmapped and goes down, but comes up before your timeout, and
you don't catch it in time.  Really, this is a dial that nobody that
hasn't got bitten by this issue cares about, but everyone who has had
this happen really wants to be there.

> This said, I'd like to understand why there are no events in these
> cases. I'd have thought we'd at least get a UNIT ATTENTION (REPORTED
> LUNS DATA HAS CHANGED), which would have caused a uevent. If there was
> no UNIT ATTENTION, I'd blame the storage side. 

Are you able to try the reproducer I targetcli FC reproducer I posted?

> Maybe we need to monitor scsi_device uevents.
> 
> Technical remarks below.
> 
> 
> > ---
> >  libmultipath/config.c             |  1 +
> >  libmultipath/config.h             |  1 +
> >  libmultipath/configure.c          |  4 +-
> >  libmultipath/configure.h          |  2 +
> >  libmultipath/defaults.h           |  1 +
> >  libmultipath/dict.c               | 36 ++++++++++++
> >  libmultipath/libmultipath.version |  6 ++
> >  libmultipath/structs.h            | 10 ++++
> >  multipath/multipath.conf.5        | 18 ++++++
> >  multipathd/cli_handlers.c         |  9 +++
> >  multipathd/main.c                 | 92
> > +++++++++++++++++++++++++++++++
> >  multipathd/main.h                 |  2 +
> >  12 files changed, 180 insertions(+), 2 deletions(-)
> >  .
> >  .\" ----------------------------------------------------------------
> > ------------
> > diff --git a/multipathd/cli_handlers.c b/multipathd/cli_handlers.c
> > index 54635738..970d5e21 100644
> > --- a/multipathd/cli_handlers.c
> > +++ b/multipathd/cli_handlers.c
> > @@ -715,6 +715,15 @@ cli_add_path (void * v, char ** reply, int *
> > len, void * data)
> >         pp = find_path_by_dev(vecs->pathvec, param);
> >         if (pp && pp->initialized != INIT_REMOVED) {
> >                 condlog(2, "%s: path already in pathvec", param);
> > +
> > +               if (pp->allow_wwid_recheck == ALLOW_WWID_RECHECK_ON
> > &&
> > +                   check_path_wwid_change(pp)) {
> > +                       condlog(0, "%s: wwid changed. Removing
> > device",
> > +                               pp->dev);
> > +                       handle_path_wwid_change(pp, vecs);
> > +                       return 1;
> > +               }
> > +
> >                 if (pp->mpp)
> >                         return 0;
> >         } else if (pp) {
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 19679848..17eef3a4 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -823,6 +823,59 @@ ev_remove_map (char * devname, char * alias, int
> > minor, struct vectors * vecs)
> >         return flush_map(mpp, vecs, 0);
> >  }
> >  
> > +void
> > +handle_path_wwid_change(struct path *pp, struct vectors *vecs)
> > +{
> > +       const char *action = "add";
> > +       if (!pp || !pp->udev)
> > +               return;
> > +
> > +       sysfs_attr_set_value(pp->udev, "uevent", action,
> > strlen(action));
> > +       trigger_partitions_udev_change(pp->udev, action,
> > strlen(action));
> 
> Nit: it looks a bit weird to use a const char * variable for "action"
> and a constant for "uevent".

It does mean that there's no chance of typo'ing it one of the four times
it's used and not having it be caught, but I'm fine with changing it.

> > +       if (ev_remove_path(pp, vecs, 1) != 0 && pp->mpp) {
> > +               pp->dmstate = PSTATE_FAILED;
> > +               dm_fail_path(pp->mpp->alias, pp->dev_t);
> > +       }
> 
> I suggest taking a ref on pp->udev, calling ev_remove_path(), and
> triggering the uevent after that.
>

vecs locking will save us from handling the uevent before we remove the
path, but your ordering does make things look more obviously correct.
I'm fine with changing it.

> > +}
> > +
> > +bool
> > +check_path_wwid_change(struct path *pp)
> > +{
> > +       char wwid[WWID_SIZE];
> > +       int len = 0;
> > +       char *c;
> > +
> > +       if (!strlen(pp->wwid) || pp->bus != SYSFS_BUS_SCSI)
> > +               return false;
> 
> Maybe you should look at uid_attribute here, to be consistent with
> has_uid_fallback()?

Possibly, be this code will never be run to see if wwid has changed
without first having been run and verifying that the wwids match. If
the wwids don't match the first time, then it's disabled.
 
> > +
> > +       /* Get the real fresh device wwid by sgio. sysfs still has
> > old
> > +        * data, so only get_vpd_sgio will work to get the new wwid
> > */
> > +       len = get_vpd_sgio(pp->fd, 0x83, 0, wwid, WWID_SIZE);
> > +
> > +       if (len <= 0) {
> > +               condlog(2, "%s: failed to check wwid by sgio: len =
> > %d",
> > +                       pp->dev, len);
> > +               return false;
> > +       }
> > +
> > +       /*Strip any trailing blanks */
> > +       c = strchr(wwid, '\0');
> 
> Quite unusual, why not use strlen() or strnlen()?

This was pulled directly from get_uid(). But I agree it could be cleaned
up in both places

> 
> > +       c--;
> > +       while (c && c >= wwid && *c == ' ') {
> > +               *c = '\0';
> > +               c--;
> > +       }
> 
> Nit: You don't have to nullify every space. Just the first one.
Again, this is just a copy of get_uid().

> 
> > +       condlog(4, "%s: Got wwid %s by sgio", pp->dev, wwid);
> > +
> > +       if (strncmp(wwid, pp->wwid, WWID_SIZE)) {
> > +               condlog(0, "%s: wwid '%s' doesn't match wwid '%s'
> > from device",
> > +                       pp->dev, pp->wwid, wwid);
> > +               return true;
> > +       }
> > +
> > +       return false;
> > +}
> > +
> >  static int
> >  uev_add_path (struct uevent *uev, struct vectors * vecs, int
> > need_do_map)
> >  {
> > @@ -919,6 +972,21 @@ uev_add_path (struct uevent *uev, struct vectors
> > * vecs, int need_do_map)
> >                                         uev->kernel);
> >                                 ret = 1;
> >                         }
> > +               } else if (pp->allow_wwid_recheck ==
> > ALLOW_WWID_RECHECK_ON &&
> > +                          check_path_wwid_change(pp)) {
> > +                       condlog(2, "%s wwid change detected when
> > processing add uevent. removing path", pp->dev);
> > +                       /*
> > +                        * don't use handle_path_wwid_change here,
> > +                        * since that would just trigger another add
> > +                        * uevent
> > +                        */
> > +                       ret = ev_remove_path(pp, vecs, true);
> > +                       if (ret == 0)
> > +                               pp = NULL;
> > +                       else if (pp->mpp) {
> > +                               pp->dmstate = PSTATE_FAILED;
> > +                               dm_fail_path(pp->mpp->alias, pp-
> > >dev_t);
> > +                       }
> 
> What's the purpose of this code? Are you handling your own artificial
> "add" event here, which you triggered before in
> handle_path_wwid_change()? Or are there real cases where the kernel
> would send an "add" event without sending a "remove" event first?
>

This shouldn't be for handling our own add event. Either the
ev_remove_path() in handle_path_wwid_changed() succeeded, and there is
no path in pathvec, or it failed, and pp->initialized should be set to
INIT_REMOVED. Either way, we miss this code path. This is simply if we
get an "add" event without the "remove" event, which is a real thing
that has happened when LUNs get remapped.

> >                 }
> >         }
> >         if (pp)
> > @@ -2049,6 +2117,7 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >         unsigned int checkint, max_checkint;
> >         struct config *conf;
> >         int marginal_pathgroups, marginal_changed = 0;
> > +       int recheck_wwid_time;
> >         int ret;
> >  
> >         if (((pp->initialized == INIT_OK ||
> > @@ -2066,6 +2135,7 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >         checkint = conf->checkint;
> >         max_checkint = conf->max_checkint;
> >         marginal_pathgroups = conf->marginal_pathgroups;
> > +       recheck_wwid_time = conf->recheck_wwid_time;
> >         put_multipath_config(conf);
> >  
> >         if (pp->checkint == CHECKINT_UNDEF) {
> > @@ -2197,6 +2267,26 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >                 return 0;
> >         set_no_path_retry(pp->mpp);
> >  
> > +       if (recheck_wwid_time != RECHECK_WWID_OFF &&
> > +           (newstate == PATH_UP || newstate == PATH_GHOST)) {
> > +               if (pp->allow_wwid_recheck ==
> > ALLOW_WWID_RECHECK_UNSET) {
> > +                       if (check_path_wwid_change(pp))
> > +                               pp->allow_wwid_recheck =
> > ALLOW_WWID_RECHECK_OFF;
> > +                       else
> > +                               pp->allow_wwid_recheck =
> > ALLOW_WWID_RECHECK_ON;
> 
> This is confusing. I pulled my hair over this code before I read your
> man page hunk: "When the path is originally added, if the path's
> configured wwid doesn't match the wwid retrieved directly
> from the scsi device, rechecks will be disabled for the device."
> 
> So, I gather this is an alternative to the has_uid_fallback() logic
> mentioned above. It deserves at least a comment here. Please consider
> if using the same logic as has_uid_falback() isn't just as good as
> this.

I'm fine with using the same logic as has_uid_fallback() to determine if
if we can use get_vpd_sgio().

-Ben

> Btw, as we're already pretty much on corner-case ground here, if the
> path fails quickly after being detected, a WWID change could have
> occured already when it comes UP first, and this code is run for the
> first time.
> 
> 
> > +               } else if (((pp->state != PATH_UP && pp->state !=
> > PATH_GHOST) ||
> > +                           pp->dmstate == PSTATE_FAILED) &&
> > +                          pp->failed_time >= recheck_wwid_time &&
> > +                          pp->allow_wwid_recheck ==
> > ALLOW_WWID_RECHECK_ON &&
> > +                          check_path_wwid_change(pp)) {
> > +                       condlog(0, "%s: path wwid change detected.
> > Removing",
> > +                               pp->dev);
> > +                       handle_path_wwid_change(pp, vecs);
> > +                       return 0;
> > +               }
> > +               pp->failed_time = 0;
> > +       }
> > +
> >         if ((newstate == PATH_UP || newstate == PATH_GHOST) &&
> >             (san_path_check_enabled(pp->mpp) ||
> >              marginal_path_check_enabled(pp->mpp))) {
> > @@ -2330,6 +2420,8 @@ check_path (struct vectors * vecs, struct path
> > * pp, unsigned int ticks)
> >                 if (newstate == PATH_DOWN) {
> >                         int log_checker_err;
> >  
> > +                       if (recheck_wwid_time != RECHECK_WWID_OFF)
> > +                               pp->failed_time += pp->checkint;
> >                         conf = get_multipath_config();
> >                         log_checker_err = conf->log_checker_err;
> >                         put_multipath_config(conf);
> > diff --git a/multipathd/main.h b/multipathd/main.h
> > index 5abbe97b..ddd953f9 100644
> > --- a/multipathd/main.h
> > +++ b/multipathd/main.h
> > @@ -50,4 +50,6 @@ int update_multipath (struct vectors *vecs, char
> > *mapname, int reset);
> >  int reload_and_sync_map(struct multipath *mpp, struct vectors *vecs,
> >                         int refresh);
> >  
> > +void handle_path_wwid_change(struct path *pp, struct vectors *vecs);
> > +bool check_path_wwid_change(struct path *pp);
> >  #endif /* MAIN_H */
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-10 18:09     ` Benjamin Block
  2021-02-10 19:57       ` Martin Wilck
@ 2021-02-11 11:25       ` Benjamin Block
  1 sibling, 0 replies; 13+ messages in thread
From: Benjamin Block @ 2021-02-11 11:25 UTC (permalink / raw)
  To: dm-devel

On Wed, Feb 10, 2021 at 07:09:31PM +0100, Benjamin Block wrote:
> On Tue, Feb 09, 2021 at 10:19:45PM +0000, Martin Wilck wrote:
> > On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> > > There are cases where the wwid of a path changes due to LUN remapping
> > > without triggering uevent for the changed path. Multipathd has no
> > > method
> > > for trying to catch these cases, and corruption has resulted because
> > > of
> > > it.
> > > 
> > > In order to have a better chance at catching these cases, multipath
> > > now
> > > has a recheck_wwid_time option, which can either be set to "off" or a
> > > number of seconds. If a path is failed for equal to or greater than
> > > the
> > > configured number of seconds, multipathd will recheck its wwid before
> > > restoring it, when the path checker sees that it has come back up.
> > 
> > Can't the WWID change also happen without the path going offline, or
> > at least without being offline long enough that multipathd would
> > notice?
> > 
> > >  If
> > > multipathd notices that a path's wwid has changed it will remove and
> > > re-add the path, just like the existing wwid checking code for change
> > > events does.  In cases where the no uevent occurs, both the udev
> > > database entry and sysfs will have the old wwid, so the only way to
> > > get
> > > a current wwid is to ask the device directly. 
> > 
> > sysfs is wrong too, really? In that case I fear triggering an uevent
> > won't fix the situation. You need to force the kernel to rescan the
> > device, otherwise udev will fetch the WWID from sysfs again, which
> > still has the wrong ID... or what am I missing here?
> > 
> > > > Currently multipath only
> > > has code to directly get the wwid for scsi devices, so this option
> > > only
> > > effects scsi devices. Also, since it's possible the the udev wwid
> > > won't
> > > match the wwid from get_vpd_sgio(), if multipathd doesn't initially
> > > see
> > > the two values matching for a device, it will disable this option for
> > > that device.
> > > 
> > > If recheck_wwid_time is not turned off, multipathd will also
> > > automatically recheck the wwid whenever an existing path gets a add
> > > event, or is manually re-added with cli_add_path().
> > > 
> > > Co-developed-by: Chongyun Wu <wucy11@chinatelecom.cn>
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > 
> > I am uncertain about this.
> > 
> > We get one more configuration option that defaults to off and that only
> > the truly inaugurated will understand and use. And even those will not
> > know how to set the recheck time. Should it be 1s, 10, or 100? We
> > already have too many of these options in multipath-tools. We shy away
> > from giving users reasonable defaults, with the result that most people
> > won't bother.
> > 
> > I generally don't understand what the UP/DOWN state has to do with
> > this. If the WWID can change without any events seen by either the
> > kernel or user space, why would the path go down and up again? And even
> > if so, why would it matter how long the device remained down?
> > 
> > But foremost, do we really have to try to deal with configuration
> > mistakes as blatant as this? What if a user sets the same WWID for
> > different devices, or re-uses the same WWID on different storage
> > servers? I already hesitated about the code I added myself for catching
> > user errors in the WWIDs file, but this seems even more far-fetched.
> > 
> > Please convince me.
> > 
> > This said, I'd like to understand why there are no events in these
> > cases. I'd have thought we'd at least get a UNIT ATTENTION (REPORTED
> > LUNS DATA HAS CHANGED), which would have caused a uevent. If there was
> > no UNIT ATTENTION, I'd blame the storage side. 
> 
> Yeah, just for reference, I saw this happening in practice when
> something with the LU mapping changed on IBM storage - IIRC I saw it
> with capacity changes. You end up in this code in the kernel:
>     https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/scsi_error.c?id=92bf22614b21a2706f4993b278017e437f7785b3#n416
> 
> And from there you ought to get an uevent for the sdev.
> 
> The WWID in sysfs might still be wrong though AFAIK. The kernel seems to
> ignore the UA after it delivered the uevent.
> 

Sorry, I replied with the wrong mail address.

-- 
Best Regards, Benjamin Block  / Linux on IBM Z Kernel Development / IBM Systems
IBM Deutschland Research & Development GmbH    /    https://www.ibm.com/privacy
Vorsitz. AufsR.: Gregor Pillen         /        Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-11  4:48     ` Benjamin Marzinski
@ 2021-02-11 12:14       ` Martin Wilck
  2021-02-18  3:22         ` Chongyun Wu
  0 siblings, 1 reply; 13+ messages in thread
From: Martin Wilck @ 2021-02-11 12:14 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel, wucy11

On Wed, 2021-02-10 at 22:48 -0600, Benjamin Marzinski wrote:
> On Tue, Feb 09, 2021 at 10:19:45PM +0000, Martin Wilck wrote:
> > On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> > > There are cases where the wwid of a path changes due to LUN
> > > remapping
> > > without triggering uevent for the changed path. Multipathd has no
> > > method
> > > for trying to catch these cases, and corruption has resulted
> > > because
> > > of
> > > it.
> > > 
> > > In order to have a better chance at catching these cases,
> > > multipath
> > > now
> > > has a recheck_wwid_time option, which can either be set to "off"
> > > or a
> > > number of seconds. If a path is failed for equal to or greater
> > > than
> > > the
> > > configured number of seconds, multipathd will recheck its wwid
> > > before
> > > restoring it, when the path checker sees that it has come back
> > > up.
> > 
> > Can't the WWID change also happen without the path going offline,
> > or
> > at least without being offline long enough that multipathd would
> > notice?
> 
> Yes. There is no way to guarantee that you won't hit this issue.
> That's
> why this is configurable and disableable. People get to choose how
> likely they are to shoot themselves in the foot.

We have seen quite a few cases lately where TUR checker was "wrong"
about path state. TUR would succeed, but the regular IO on the path
would either fail or, worse, hang. So by adding an SG_IO right in the
middle of check_path(), with vecs lock held, we increase the risk of
making multipathd hang (it can happen elsewhere, too, like in
get_prio(); I'm still pondering how to avoid it in general).

Thus, it's definitely reasonable to let users switch this on and off.
(it should perhaps be a hwtable option. Are these WWID configuration
mistakes more likely to happen on specific storage arrays than on
others?).

But I fail to see the reason of making the likelihood configurable.

Every sane user would strive for a minimum likelihood for this kind of
failure. As you pointed out, we can't guarantee a zero likelihood, but
*if* we do this, and *if* we do it when paths come back alive as you
implemented it, then it makes sense to do it always as quickly as
possible.

IOW: I think it should be a boolean option, and if "on", it should test
after every down time.

> 
> > 
> > >  If
> > > multipathd notices that a path's wwid has changed it will remove
> > > and
> > > re-add the path, just like the existing wwid checking code for
> > > change
> > > events does.  In cases where the no uevent occurs, both the udev
> > > database entry and sysfs will have the old wwid, so the only way
> > > to
> > > get
> > > a current wwid is to ask the device directly. 
> > 
> > sysfs is wrong too, really? In that case I fear triggering an
> > uevent
> > won't fix the situation. You need to force the kernel to rescan the
> > device, otherwise udev will fetch the WWID from sysfs again, which
> > still has the wrong ID... or what am I missing here?
> 
> In the reproducer I posted using targetcli and FC devices, sysfs is
> wrong.  That does make me a little leary about simply re-adding these
> devices as new.  udev will run scsi_id to grab the new WWID and store
> that in the udev database, but if we ever fail back to looking at
> sysfs,
> we will still see the old data.  In general, having devices on the
> system with bad sysfs data seems dangerous.

That's right, we shouldn't do this. Either we force a SCSI-level rescan
and verify the WWID is correct in sysfs therafter, or we mark this path
as unusable/bad and spit out BIG FAT ERROR MESSAGES.

> 
> > > > Currently multipath only
> > > has code to directly get the wwid for scsi devices, so this
> > > option
> > > only
> > > effects scsi devices. Also, since it's possible the the udev wwid
> > > won't
> > > match the wwid from get_vpd_sgio(), if multipathd doesn't
> > > initially
> > > see
> > > the two values matching for a device, it will disable this option
> > > for
> > > that device.
> > > 
> > > If recheck_wwid_time is not turned off, multipathd will also
> > > automatically recheck the wwid whenever an existing path gets a
> > > add
> > > event, or is manually re-added with cli_add_path().
> > > 
> > > Co-developed-by: Chongyun Wu <wucy11@chinatelecom.cn>
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > 
> > I am uncertain about this.
> > 
> > We get one more configuration option that defaults to off and that
> > only
> > the truly inaugurated will understand and use. And even those will
> > not
> > know how to set the recheck time. Should it be 1s, 10, or 100? We
> > already have too many of these options in multipath-tools. We shy
> > away
> > from giving users reasonable defaults, with the result that most
> > people
> > won't bother.
> > 
> > I generally don't understand what the UP/DOWN state has to do with
> > this. If the WWID can change without any events seen by either the
> > kernel or user space, why would the path go down and up again? And
> > even
> > if so, why would it matter how long the device remained down?
> 
> A new LUN can't get remapped to a still-being-used LUN id. The LUN
> that
> was previously mapped to that id must get unmapped first.  That will
> cause IO and and checker commands to fail. So unless a LUN gets
> unmapped
> from a LUN id, and a new one remapped to that id quick enough for no
> IO
> and or checker commands to go to it, the path should go down. In the
> case that spurred this development, the path was down for hours
> before a
> new LUN was remapped to that ID.

OK. My thinking was that storage configuration changes are made by
scripts or software more often than not, and in this case down times
may be quite short. But if they say they need it - fair enough.

> 
> > But foremost, do we really have to try to deal with configuration
> > mistakes as blatant as this? What if a user sets the same WWID for
> > different devices, or re-uses the same WWID on different storage
> > servers? I already hesitated about the code I added myself for
> > catching
> > user errors in the WWIDs file, but this seems even more far-
> > fetched.
> > 
> > Please convince me.
> 
> Err.. An important customer corrupted their data and while they admit
> that they were at fault, it's hard for them to guarantee that
> something
> like this won't happen in the future, and they asked if multipath
> could
> do a better job of catching these sorts of mistakes. Obviously this
> is
> more convincing when it's coming from your customer. But the fact
> still
> stands that this has happened to multiple users even with our
> existing
> code to catch this.

I wasn't aware of multiple affected users. I saw Chongyun's post and it
looked to me as if this had happend once, likely in his organization.
It wasn't even clear to me whether production data were affected.

I still believe that this is primarily a problem on the storage vendor
side. They should make it harder for users to shoot themselves in the
foot (low-level LIO doesn't count - of course you can do all kinds of
ugly stuff with it).

> Since this isn't a problem that can always be fixed, the best we can
> do
> is try to catch it before something bad happens.  If the path is
> remapped before it goes down, then corruption can happen before there
> is any possiblity of doing anything. That case is unsolvable. But if
> the path does go down when the LUN is unmapped, then there is a
> chance
> to catch this before damage is done.
> 
> Now, obviously if you don't set this to 0, then it's possible that
> the
> path gets unmapped and goes down, but comes up before your timeout,
> and
> you don't catch it in time.  Really, this is a dial that nobody that
> hasn't got bitten by this issue cares about, but everyone who has had
> this happen really wants to be there.

OK, then. Please consider making it boolean though, as argued above.
If the time must _really_ be configurable, maybe those affected parties
can come up with reasonable suggestions for a default?

> 
> > This said, I'd like to understand why there are no events in these
> > cases. I'd have thought we'd at least get a UNIT ATTENTION
> > (REPORTED
> > LUNS DATA HAS CHANGED), which would have caused a uevent. If there
> > was
> > no UNIT ATTENTION, I'd blame the storage side. 
> 
> Are you able to try the reproducer I targetcli FC reproducer I
> posted?

I wanted to, but I had no time to set this up yet.


> 
> > Maybe we need to monitor scsi_device uevents.
> > 
> > Technical remarks below.
> > 
> > 
> > > ---
> > >  libmultipath/config.c             |  1 +
> > >  libmultipath/config.h             |  1 +
> > >  libmultipath/configure.c          |  4 +-
> > >  libmultipath/configure.h          |  2 +
> > >  libmultipath/defaults.h           |  1 +
> > >  libmultipath/dict.c               | 36 ++++++++++++
> > >  libmultipath/libmultipath.version |  6 ++
> > >  libmultipath/structs.h            | 10 ++++
> > >  multipath/multipath.conf.5        | 18 ++++++
> > >  multipathd/cli_handlers.c         |  9 +++
> > >  multipathd/main.c                 | 92
> > > +++++++++++++++++++++++++++++++
> > >  multipathd/main.h                 |  2 +
> > >  12 files changed, 180 insertions(+), 2 deletions(-)
> > >  .
> > >  .\" ------------------------------------------------------------
> > > ----
> > > ------------
> > > diff --git a/multipathd/cli_handlers.c
> > > b/multipathd/cli_handlers.c
> > > index 54635738..970d5e21 100644
> > > --- a/multipathd/cli_handlers.c
> > > +++ b/multipathd/cli_handlers.c
> > > @@ -715,6 +715,15 @@ cli_add_path (void * v, char ** reply, int *
> > > len, void * data)
> > >         pp = find_path_by_dev(vecs->pathvec, param);
> > >         if (pp && pp->initialized != INIT_REMOVED) {
> > >                 condlog(2, "%s: path already in pathvec", param);
> > > +
> > > +               if (pp->allow_wwid_recheck ==
> > > ALLOW_WWID_RECHECK_ON
> > > &&
> > > +                   check_path_wwid_change(pp)) {
> > > +                       condlog(0, "%s: wwid changed. Removing
> > > device",
> > > +                               pp->dev);
> > > +                       handle_path_wwid_change(pp, vecs);
> > > +                       return 1;
> > > +               }
> > > +
> > >                 if (pp->mpp)
> > >                         return 0;
> > >         } else if (pp) {
> > > diff --git a/multipathd/main.c b/multipathd/main.c
> > > index 19679848..17eef3a4 100644
> > > --- a/multipathd/main.c
> > > +++ b/multipathd/main.c
> > > @@ -823,6 +823,59 @@ ev_remove_map (char * devname, char * alias,
> > > int
> > > minor, struct vectors * vecs)
> > >         return flush_map(mpp, vecs, 0);
> > >  }
> > >  
> > > +void
> > > +handle_path_wwid_change(struct path *pp, struct vectors *vecs)
> > > +{
> > > +       const char *action = "add";
> > > +       if (!pp || !pp->udev)
> > > +               return;
> > > +
> > > +       sysfs_attr_set_value(pp->udev, "uevent", action,
> > > strlen(action));
> > > +       trigger_partitions_udev_change(pp->udev, action,
> > > strlen(action));
> > 
> > Nit: it looks a bit weird to use a const char * variable for
> > "action"
> > and a constant for "uevent".
> 
> It does mean that there's no chance of typo'ing it one of the four
> times
> it's used and not having it be caught, but I'm fine with changing it.
> 
> > > +       if (ev_remove_path(pp, vecs, 1) != 0 && pp->mpp) {
> > > +               pp->dmstate = PSTATE_FAILED;
> > > +               dm_fail_path(pp->mpp->alias, pp->dev_t);
> > > +       }
> > 
> > I suggest taking a ref on pp->udev, calling ev_remove_path(), and
> > triggering the uevent after that.
> > 
> 
> vecs locking will save us from handling the uevent before we remove
> the
> path, but your ordering does make things look more obviously correct.
> I'm fine with changing it.
> 
> > > +}
> > > +
> > > +bool
> > > +check_path_wwid_change(struct path *pp)
> > > +{
> > > +       char wwid[WWID_SIZE];
> > > +       int len = 0;
> > > +       char *c;
> > > +
> > > +       if (!strlen(pp->wwid) || pp->bus != SYSFS_BUS_SCSI)
> > > +               return false;
> > 
> > Maybe you should look at uid_attribute here, to be consistent with
> > has_uid_fallback()?
> 
> Possibly, be this code will never be run to see if wwid has changed
> without first having been run and verifying that the wwids match. If
> the wwids don't match the first time, then it's disabled.
>  
> > > +
> > > +       /* Get the real fresh device wwid by sgio. sysfs still
> > > has
> > > old
> > > +        * data, so only get_vpd_sgio will work to get the new
> > > wwid
> > > */
> > > +       len = get_vpd_sgio(pp->fd, 0x83, 0, wwid, WWID_SIZE);
> > > +
> > > +       if (len <= 0) {
> > > +               condlog(2, "%s: failed to check wwid by sgio: len
> > > =
> > > %d",
> > > +                       pp->dev, len);
> > > +               return false;
> > > +       }
> > > +
> > > +       /*Strip any trailing blanks */
> > > +       c = strchr(wwid, '\0');
> > 
> > Quite unusual, why not use strlen() or strnlen()?
> 
> This was pulled directly from get_uid(). But I agree it could be
> cleaned
> up in both places

Strange, it never caught my eye. Maybe we should do a start-from-zero
review of our code :-/

> 
> > 
> > > +       c--;
> > > +       while (c && c >= wwid && *c == ' ') {
> > > +               *c = '\0';
> > > +               c--;
> > > +       }
> > 
> > Nit: You don't have to nullify every space. Just the first one.
> Again, this is just a copy of get_uid().
> 
> > 
> > > +       condlog(4, "%s: Got wwid %s by sgio", pp->dev, wwid);
> > > +
> > > +       if (strncmp(wwid, pp->wwid, WWID_SIZE)) {
> > > +               condlog(0, "%s: wwid '%s' doesn't match wwid '%s'
> > > from device",
> > > +                       pp->dev, pp->wwid, wwid);
> > > +               return true;
> > > +       }
> > > +
> > > +       return false;
> > > +}
> > > +
> > >  static int
> > >  uev_add_path (struct uevent *uev, struct vectors * vecs, int
> > > need_do_map)
> > >  {
> > > @@ -919,6 +972,21 @@ uev_add_path (struct uevent *uev, struct
> > > vectors
> > > * vecs, int need_do_map)
> > >                                         uev->kernel);
> > >                                 ret = 1;
> > >                         }
> > > +               } else if (pp->allow_wwid_recheck ==
> > > ALLOW_WWID_RECHECK_ON &&
> > > +                          check_path_wwid_change(pp)) {
> > > +                       condlog(2, "%s wwid change detected when
> > > processing add uevent. removing path", pp->dev);
> > > +                       /*
> > > +                        * don't use handle_path_wwid_change
> > > here,
> > > +                        * since that would just trigger another
> > > add
> > > +                        * uevent
> > > +                        */
> > > +                       ret = ev_remove_path(pp, vecs, true);
> > > +                       if (ret == 0)
> > > +                               pp = NULL;
> > > +                       else if (pp->mpp) {
> > > +                               pp->dmstate = PSTATE_FAILED;
> > > +                               dm_fail_path(pp->mpp->alias, pp-
> > > > dev_t);
> > > +                       }
> > 
> > What's the purpose of this code? Are you handling your own
> > artificial
> > "add" event here, which you triggered before in
> > handle_path_wwid_change()? Or are there real cases where the kernel
> > would send an "add" event without sending a "remove" event first?
> > 
> 
> This shouldn't be for handling our own add event. Either the
> ev_remove_path() in handle_path_wwid_changed() succeeded, and there
> is
> no path in pathvec, or it failed, and pp->initialized should be set
> to
> INIT_REMOVED. Either way, we miss this code path. This is simply if
> we
> get an "add" event without the "remove" event, which is a real thing
> that has happened when LUNs get remapped.

That sounds like a kernel bug, or something else really awkward and bad
which we should try to understand and fix.

Thanks,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-11 12:14       ` Martin Wilck
@ 2021-02-18  3:22         ` Chongyun Wu
  2021-02-19 10:46           ` Martin Wilck
  0 siblings, 1 reply; 13+ messages in thread
From: Chongyun Wu @ 2021-02-18  3:22 UTC (permalink / raw)
  To: Martin Wilck, bmarzins; +Cc: dm-devel



On 2021/2/11 20:14, Martin Wilck wrote:

>>> But foremost, do we really have to try to deal with configuration
>>> mistakes as blatant as this? What if a user sets the same WWID for
>>> different devices, or re-uses the same WWID on different storage
>>> servers? I already hesitated about the code I added myself for
>>> catching
>>> user errors in the WWIDs file, but this seems even more far-
>>> fetched.
>>>
>>> Please convince me.
>>
>> Err.. An important customer corrupted their data and while they admit
>> that they were at fault, it's hard for them to guarantee that
>> something
>> like this won't happen in the future, and they asked if multipath
>> could
>> do a better job of catching these sorts of mistakes. Obviously this
>> is
>> more convincing when it's coming from your customer. But the fact
>> still
>> stands that this has happened to multiple users even with our
>> existing
>> code to catch this.
> 
> I wasn't aware of multiple affected users. I saw Chongyun's post and it
> looked to me as if this had happend once, likely in his organization.
> It wasn't even clear to me whether production data were affected.
> 
Had customers have made incorrect storage back-end operations during the 
expansion process, which has caused the destruction of customer data, which 
cannot be repaired using any tools. And I reproduce it in our lab too.
Customers definitely hope that multipath-tools can probide protection in
this case, becuase if it is important custmer data storage in the LUN, the
loss will be great.

-- 
Best Regard,
Chongyun Wu

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-18  3:22         ` Chongyun Wu
@ 2021-02-19 10:46           ` Martin Wilck
  0 siblings, 0 replies; 13+ messages in thread
From: Martin Wilck @ 2021-02-19 10:46 UTC (permalink / raw)
  To: wucy11, bmarzins; +Cc: dm-devel

On Thu, 2021-02-18 at 11:22 +0800, Chongyun Wu wrote:
> > 
> > I wasn't aware of multiple affected users. I saw Chongyun's post
> > and it
> > looked to me as if this had happend once, likely in his
> > organization.
> > It wasn't even clear to me whether production data were affected.
> > 
> Had customers have made incorrect storage back-end operations during
> the 
> expansion process, which has caused the destruction of customer data,
> which 
> cannot be repaired using any tools. And I reproduce it in our lab
> too.
> Customers definitely hope that multipath-tools can probide protection
> in
> this case, becuase if it is important custmer data storage in the
> LUN, the
> loss will be great.

I understand that. My concern was that it's still just quite a weak
protection, because it depends on the check time interval being short
enough to catch the change. So in a way, this pretends to offer safety
that doesn't truly exist. If the user assigns the wrong WWID 30s after
taking the LUN offline, all fine; if she is quicker (or has the process
automated / scripted), data corruption still occurs.

And of course, there are general limits to which extent software can
avoid user mistakes.

However, I trust your and Ben's judegement that by checking this in a
suitably short time span, the issues your customers have had would have
been avoided, and I'm not going to oppose the patch on these grounds.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid
  2021-02-09 22:19   ` Martin Wilck
  2021-02-10 18:09     ` Benjamin Block
  2021-02-11  4:48     ` Benjamin Marzinski
@ 2021-02-27  6:02     ` Benjamin Marzinski
  2 siblings, 0 replies; 13+ messages in thread
From: Benjamin Marzinski @ 2021-02-27  6:02 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, wucy11

On Tue, Feb 09, 2021 at 10:19:45PM +0000, Martin Wilck wrote:
> On Mon, 2021-02-08 at 23:19 -0600, Benjamin Marzinski wrote:
> >  If
> > multipathd notices that a path's wwid has changed it will remove and
> > re-add the path, just like the existing wwid checking code for change
> > events does.  In cases where the no uevent occurs, both the udev
> > database entry and sysfs will have the old wwid, so the only way to
> > get
> > a current wwid is to ask the device directly. 
> 
> sysfs is wrong too, really? In that case I fear triggering an uevent
> won't fix the situation. You need to force the kernel to rescan the
> device, otherwise udev will fetch the WWID from sysfs again, which
> still has the wrong ID... or what am I missing here?

I just figured out the confusion here.  Redhat has not always included
55-scsi-sg3_id.rules in sg3_utils. It does now, but installs it as
61-scsi-sg3_id.rules instead (This is why we install multipath.rules as
62-multipath.rules). The result of this is that ID_SERIAL is set by
"scsi_id --export --whitelisted -d $devnode" from
60-persistent-storage.rules. This means that when a uevent comes in,
ID_SERIAL is repopulated by ioctling the device, unlike the case where
55-scsi-sg3_id.rules sets ID_SERIAL, which gets the information from
sysfs if possible.

Clearly, for distributions that set ID_SERIAL from 55-scsi-sg3_id.rules,
multipathd needs to take more drastic measures if the sysfs vpd_page83
file is wrong.

Looking at this made me notice another issue. scsi_id and the rules in
55-scsi-sg3_id.rules don't agree on the priority of the different id
types for choosing the wwid from the various ids in vpd page 83. And
mulitpath's parse_vpd_pg83() doesn't agree with either of them.

Here are the different priority orderings for the ID types

		sg_inq +
scsi_id		55-scsi-sg3_id.rules	parse_vpd_pg83
-------------------------------------------------------
36		36			36
35		35			35
3[^56]		32			32
2		2			8
		8			2
1		1			1
		3[^652]			3[^652]
0		0			0

Since Redhat is changing the default ordering of the
55-scsi-sg3_id.rules, it seems reasonable that we make parse_vpd_pg83()
match the 55-scsi-sg3_id.rules ordering, and I'll add a redhat specific
patch to match it to the scsi_id ordering. However, the fact that
differences which udev rules are installed or their ordering can changed
the WWID does lend some weight to the code that I originally had for
recheck_wwid, where we would only enable it after verifying that at one
point, the parse_vpd_pg83() wwid agreed with the uid_attribute wwid. On
the other hand, assuming that distributions make sure that
parse_vpd_pg83() matches their udev setup, then not requiring them to be
equal at some point allows recheck_wwid to catch problems that happened
before multipath started using the device, for whatever that's worth.

-Ben

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-02-27  6:03 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-09  5:19 [dm-devel] [PATCH 0/2] Handle remapped LUNs better Benjamin Marzinski
2021-02-09  5:19 ` [dm-devel] [PATCH 1/2] libmultipath: fix use-after-free in uev_add_path Benjamin Marzinski
2021-02-09 20:57   ` Martin Wilck
2021-02-09  5:19 ` [dm-devel] [PATCH 2/2] multipathd: add recheck_wwid_time option to verify the path wwid Benjamin Marzinski
2021-02-09 22:19   ` Martin Wilck
2021-02-10 18:09     ` Benjamin Block
2021-02-10 19:57       ` Martin Wilck
2021-02-11 11:25       ` Benjamin Block
2021-02-11  4:48     ` Benjamin Marzinski
2021-02-11 12:14       ` Martin Wilck
2021-02-18  3:22         ` Chongyun Wu
2021-02-19 10:46           ` Martin Wilck
2021-02-27  6:02     ` Benjamin Marzinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).