All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 00/16] multipath path classification
@ 2018-01-19  0:29 Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 01/16] Revert "multipath: ignore -i if find_multipaths is set" Martin Wilck
                   ` (16 more replies)
  0 siblings, 17 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

This patch series implements the recommendation in my recent posting
"Multipath path classification revisited". My testing has been surprisingly
successful so far; more testing is of course needed.
Anyway, I think it's in a shape that I can ask for review.

I've seen Ben's detailed reply to my posting, but his comments haven't been
taken into account in this series yet.

Benjamin Marzinski (1):
  libmultipath: trigger change uevent on new device creation

Martin Wilck (15):
  Revert "multipath: ignore -i if find_multipaths is set"
  Revert "multipathd: imply -n if find_multipaths is set"
  libmultipath: add mpvec param to should_multipath()
  libmultipath: should_multipath: keep existing maps
  multipath -u -i: change logic for find_multipaths
  libmultipath: let ignore_wwids be set in config file
  multipathd: replace -n with !ignore_wwids
  multipath.conf.5: document "ignore_wwids"
  multipath.8: adapt documentation of '-i'
  multipathd.8: document that '-n' is now ignored
  multipath: common code path for CMD_VALID_PATH
  multipath -u/-c: change output to environment/key format
  multipath -u/-c: add "$DEV is maybe a valid path"
  multipath.rules: find_multipaths+ignore_wwids logic
  libmultipath: trigger path uevent only when necessary

 libmultipath/config.c      |  1 +
 libmultipath/config.h      |  1 -
 libmultipath/configure.c   | 48 ++++++++++++++++++++++++++++++++---
 libmultipath/configure.h   |  1 +
 libmultipath/defaults.h    |  1 +
 libmultipath/dict.c        |  4 +++
 libmultipath/wwids.c       | 13 +++++++---
 libmultipath/wwids.h       |  2 +-
 multipath/main.c           | 58 ++++++++++++++++++++++--------------------
 multipath/multipath.8      |  3 ++-
 multipath/multipath.conf.5 | 31 +++++++++++++++++++++++
 multipath/multipath.rules  | 63 +++++++++++++++++++++++++++++++++++++++++++---
 multipathd/main.c          | 16 +++---------
 multipathd/multipathd.8    |  5 ++--
 14 files changed, 192 insertions(+), 55 deletions(-)

-- 
2.15.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC PATCH 01/16] Revert "multipath: ignore -i if find_multipaths is set"
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 02/16] Revert "multipathd: imply -n " Martin Wilck
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

This reverts commit ffbb886a8a16cb063d669cd76a1e656fd3ec8c4b.
---
 multipath/main.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 52bf1658bbca..21cc45145dbb 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -761,16 +761,6 @@ main (int argc, char *argv[])
 		}
 	}
 
-	/*
-	 * FIXME: new device detection with find_multipaths currently
-	 * doesn't work reliably.
-	 */
-	if (cmd ==  CMD_VALID_PATH &&
-	    conf->find_multipaths && conf->ignore_wwids) {
-		condlog(2, "ignoring -i flag because find_multipath is set in multipath.conf");
-		conf->ignore_wwids = 0;
-	}
-
 	if (getuid() != 0) {
 		fprintf(stderr, "need to be root\n");
 		exit(1);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 02/16] Revert "multipathd: imply -n if find_multipaths is set"
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 01/16] Revert "multipath: ignore -i if find_multipaths is set" Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 03/16] libmultipath: add mpvec param to should_multipath() Martin Wilck
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

This reverts commit 64e27ec066a001012f44550f095c93443e91d845.
---
 multipathd/main.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 26632291657f..255d1d860b2a 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2131,10 +2131,6 @@ reconfigure (struct vectors * vecs)
 		conf->verbosity = verbosity;
 	if (bindings_read_only)
 		conf->bindings_read_only = bindings_read_only;
-	if (conf->find_multipaths) {
-		condlog(2, "find_multipaths is set: -n is implied");
-		ignore_new_devs = 1;
-	}
 	if (ignore_new_devs)
 		conf->ignore_new_devs = ignore_new_devs;
 	uxsock_timeout = conf->uxsock_timeout;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 03/16] libmultipath: add mpvec param to should_multipath()
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 01/16] Revert "multipath: ignore -i if find_multipaths is set" Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 02/16] Revert "multipathd: imply -n " Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 04/16] libmultipath: should_multipath: keep existing maps Martin Wilck
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

This will be used in a follow-up patch.
---
 libmultipath/configure.c | 2 +-
 libmultipath/wwids.c     | 2 +-
 libmultipath/wwids.h     | 2 +-
 multipathd/main.c        | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index 13e14cc25fff..325018a78b95 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -970,7 +970,7 @@ int coalesce_paths (struct vectors * vecs, vector newmp, char * refwwid,
 			continue;
 
 		/* If find_multipaths was selected check if the path is valid */
-		if (!refwwid && !should_multipath(pp1, pathvec)) {
+		if (!refwwid && !should_multipath(pp1, pathvec, curmp)) {
 			orphan_path(pp1, "only one path");
 			continue;
 		}
diff --git a/libmultipath/wwids.c b/libmultipath/wwids.c
index bc70a27409d3..fcbf5281b491 100644
--- a/libmultipath/wwids.c
+++ b/libmultipath/wwids.c
@@ -271,7 +271,7 @@ out:
 }
 
 int
-should_multipath(struct path *pp1, vector pathvec)
+should_multipath(struct path *pp1, vector pathvec, vector mpvec)
 {
 	int i, ignore_new_devs;
 	struct path *pp2;
diff --git a/libmultipath/wwids.h b/libmultipath/wwids.h
index 95270129daa0..d9a78b38ccf8 100644
--- a/libmultipath/wwids.h
+++ b/libmultipath/wwids.h
@@ -12,7 +12,7 @@
 "#\n" \
 "# Valid WWIDs:\n"
 
-int should_multipath(struct path *pp, vector pathvec);
+int should_multipath(struct path *pp, vector pathvec, vector mpvec);
 int remember_wwid(char *wwid);
 int check_wwids_file(char *wwid, int write_wwid);
 int remove_wwid(char *wwid);
diff --git a/multipathd/main.c b/multipathd/main.c
index 255d1d860b2a..98ce16ee020b 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -699,7 +699,7 @@ rescan:
 		mpp->action = ACT_RELOAD;
 		extract_hwe_from_path(mpp);
 	} else {
-		if (!should_multipath(pp, vecs->pathvec)) {
+		if (!should_multipath(pp, vecs->pathvec, vecs->mpvec)) {
 			orphan_path(pp, "only one path");
 			return 0;
 		}
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 04/16] libmultipath: should_multipath: keep existing maps
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (2 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 03/16] libmultipath: add mpvec param to should_multipath() Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19 16:06   ` Benjamin Marzinski
  2018-01-19  0:29 ` [RFC PATCH 05/16] multipath -u -i: change logic for find_multipaths Martin Wilck
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

If with find_multipaths and !ignore_new_devs, if a path is already
multipathed, keep it. The same logic is applied in multipath -u
with ignore_wwids.
---
 libmultipath/wwids.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/libmultipath/wwids.c b/libmultipath/wwids.c
index fcbf5281b491..828a3de5b5cb 100644
--- a/libmultipath/wwids.c
+++ b/libmultipath/wwids.c
@@ -287,6 +287,13 @@ should_multipath(struct path *pp1, vector pathvec, vector mpvec)
 
 	condlog(4, "checking if %s should be multipathed", pp1->dev);
 	if (!ignore_new_devs) {
+		struct multipath *mp = find_mp_by_wwid(mpvec, pp1->wwid);
+
+		if (mp != NULL && first_path(mp) != NULL) {
+			condlog(3, "wwid %s is already multipathed, keeping it",
+				pp1->wwid);
+			return 1;
+		}
 		vector_foreach_slot(pathvec, pp2, i) {
 			if (pp1->dev == pp2->dev)
 				continue;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 05/16] multipath -u -i: change logic for find_multipaths
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (3 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 04/16] libmultipath: should_multipath: keep existing maps Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 06/16] libmultipath: let ignore_wwids be set in config file Martin Wilck
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

Previously, if find_multipaths was set, files listed in the wwids file
weren't set to be multipathed by "multipath -u -i" unless they also
met the "find_multpaths" criteria (at least two paths, or existing map with
this WWID). Now we classify all paths in the WWIDs file as multipath members.

The rationale for this patch is to match the logic that multipathd applies
when "-n" is not given.
---
 multipath/main.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 21cc45145dbb..b7e5cf46fe0f 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -423,16 +423,19 @@ configure (struct config *conf, enum mpath_cmds cmd,
 		 * set, you need to actually check if there are two available
 		 * paths to determine if this path should be multipathed. To
 		 * do this, we put off the check until after discovering all
-		 * the paths */
-		if (cmd == CMD_VALID_PATH &&
-		    (!conf->find_multipaths || !conf->ignore_wwids)) {
-			if (conf->ignore_wwids ||
+		 * the paths.
+		 * Paths listed in the wwids file are always considered valid.
+		 */
+		if (cmd == CMD_VALID_PATH) {
+			if ((!conf->find_multipaths && conf->ignore_wwids) ||
 			    check_wwids_file(refwwid, 0) == 0)
 				r = 0;
-
-			printf("%s %s a valid multipath device path\n",
-			       devpath, r == 0 ? "is" : "is not");
-			goto out;
+			if (r == 0 ||
+			    !conf->find_multipaths || !conf->ignore_wwids) {
+				printf("%s %s a valid multipath device path\n",
+				       devpath, r == 0 ? "is" : "is not");
+				goto out;
+			}
 		}
 	}
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 06/16] libmultipath: let ignore_wwids be set in config file
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (4 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 05/16] multipath -u -i: change logic for find_multipaths Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 07/16] multipathd: replace -n with !ignore_wwids Martin Wilck
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

This allows to generalize this option, use it for multipathd as well,
and make sure to use the same logic in both tools.
---
 libmultipath/config.c   | 1 +
 libmultipath/defaults.h | 1 +
 libmultipath/dict.c     | 4 ++++
 3 files changed, 6 insertions(+)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 3d6a24c94fab..436179a87494 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -615,6 +615,7 @@ load_config (char * file)
 	conf->partition_delim = DEFAULT_PARTITION_DELIM;
 	conf->processed_main_config = 0;
 	conf->find_multipaths = DEFAULT_FIND_MULTIPATHS;
+	conf->ignore_wwids = DEFAULT_IGNORE_WWIDS;
 	conf->uxsock_timeout = DEFAULT_REPLY_TIMEOUT;
 	conf->retrigger_tries = DEFAULT_RETRIGGER_TRIES;
 	conf->retrigger_delay = DEFAULT_RETRIGGER_DELAY;
diff --git a/libmultipath/defaults.h b/libmultipath/defaults.h
index c9e3411aa579..a6a526a10ac3 100644
--- a/libmultipath/defaults.h
+++ b/libmultipath/defaults.h
@@ -18,6 +18,7 @@
 #define DEFAULT_VERBOSITY	2
 #define DEFAULT_REASSIGN_MAPS	0
 #define DEFAULT_FIND_MULTIPATHS	0
+#define DEFAULT_IGNORE_WWIDS	1
 #define DEFAULT_FAST_IO_FAIL	5
 #define DEFAULT_DEV_LOSS_TMO	600
 #define DEFAULT_RETAIN_HWHANDLER RETAIN_HWHANDLER_ON
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index e52f1f798f7a..feab547ad9fc 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -241,6 +241,9 @@ declare_def_snprint(partition_delim, print_str)
 declare_def_handler(find_multipaths, set_yes_no)
 declare_def_snprint(find_multipaths, print_yes_no)
 
+declare_def_handler(ignore_wwids, set_yes_no)
+declare_def_snprint(ignore_wwids, print_yes_no)
+
 declare_def_handler(selector, set_str)
 declare_def_snprint_defstr(selector, print_str, DEFAULT_SELECTOR)
 declare_hw_handler(selector, set_str)
@@ -1469,6 +1472,7 @@ init_keywords(vector keywords)
 	install_keyword("marginal_path_double_failed_time", &def_marginal_path_double_failed_time_handler, &snprint_def_marginal_path_double_failed_time);
 
 	install_keyword("find_multipaths", &def_find_multipaths_handler, &snprint_def_find_multipaths);
+	install_keyword("ignore_wwids", &def_ignore_wwids_handler, &snprint_def_ignore_wwids);
 	install_keyword("uxsock_timeout", &def_uxsock_timeout_handler, &snprint_def_uxsock_timeout);
 	install_keyword("retrigger_tries", &def_retrigger_tries_handler, &snprint_def_retrigger_tries);
 	install_keyword("retrigger_delay", &def_retrigger_delay_handler, &snprint_def_retrigger_delay);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 07/16] multipathd: replace -n with !ignore_wwids
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (5 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 06/16] libmultipath: let ignore_wwids be set in config file Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 08/16] multipath.conf.5: document "ignore_wwids" Martin Wilck
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

Ignore the -n option (ignore_new_devs), at use !ignore_wwids instead.
'-n' mean that multipathd should only create maps from WWIDs that
are listed in the WWIDs file. This is exactly the meaning of
"ignore_wwids = off". The difference is that the setting is now
made in the config file and honoured by multipath, too (unless
it's called with '-i').
---
 libmultipath/config.h | 1 -
 libmultipath/wwids.c  | 2 +-
 multipathd/main.c     | 7 +------
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index a20ac2aac296..7366974644bf 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -168,7 +168,6 @@ struct config {
 	int strict_timing;
 	int retrigger_tries;
 	int retrigger_delay;
-	int ignore_new_devs;
 	int delayed_reconfig;
 	int uev_wait_timeout;
 	int skip_kpartx;
diff --git a/libmultipath/wwids.c b/libmultipath/wwids.c
index 828a3de5b5cb..655ccbe2c94a 100644
--- a/libmultipath/wwids.c
+++ b/libmultipath/wwids.c
@@ -278,7 +278,7 @@ should_multipath(struct path *pp1, vector pathvec, vector mpvec)
 	struct config *conf;
 
 	conf = get_multipath_config();
-	ignore_new_devs = conf->ignore_new_devs;
+	ignore_new_devs = !conf->ignore_wwids;
 	if (!conf->find_multipaths && !ignore_new_devs) {
 		put_multipath_config(conf);
 		return 1;
diff --git a/multipathd/main.c b/multipathd/main.c
index 98ce16ee020b..6f612f6d3d2f 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -107,7 +107,6 @@ int logsink;
 int uxsock_timeout;
 int verbosity;
 int bindings_read_only;
-int ignore_new_devs;
 enum daemon_status running_state = DAEMON_INIT;
 pid_t daemon_pid;
 pthread_mutex_t config_lock = PTHREAD_MUTEX_INITIALIZER;
@@ -2131,8 +2130,6 @@ reconfigure (struct vectors * vecs)
 		conf->verbosity = verbosity;
 	if (bindings_read_only)
 		conf->bindings_read_only = bindings_read_only;
-	if (ignore_new_devs)
-		conf->ignore_new_devs = ignore_new_devs;
 	uxsock_timeout = conf->uxsock_timeout;
 
 	old = rcu_dereference(multipath_conf);
@@ -2357,8 +2354,6 @@ child (void * param)
 		conf->verbosity = verbosity;
 	if (bindings_read_only)
 		conf->bindings_read_only = bindings_read_only;
-	if (ignore_new_devs)
-		conf->ignore_new_devs = ignore_new_devs;
 	uxsock_timeout = conf->uxsock_timeout;
 	rcu_assign_pointer(multipath_conf, conf);
 	if (init_checkers(conf->multipath_dir)) {
@@ -2691,7 +2686,7 @@ main (int argc, char *argv[])
 			bindings_read_only = 1;
 			break;
 		case 'n':
-			ignore_new_devs = 1;
+			condlog(0, "WARNING: ignoring deprecated option -n, use 'ignore_wwids = no' instead");
 			break;
 		default:
 			fprintf(stderr, "Invalid argument '-%c'\n",
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 08/16] multipath.conf.5: document "ignore_wwids"
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (6 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 07/16] multipathd: replace -n with !ignore_wwids Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 09/16] multipath.8: adapt documentation of '-i' Martin Wilck
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

---
 multipath/multipath.conf.5 | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index ab151e720d75..33ad775139c4 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -954,12 +954,43 @@ device again, as soon as it sees a path with that WWID. This should allow most
 users to have multipath automatically choose the correct paths to make into
 multipath devices, without having to edit the blacklist.
 .RS
+.PP
+If \fBignore_wwids\fR is set to
+.I no
+, only method
+.I 2
+above (manual creation by the user) will be used for creating new maps.
+.RE
+.RS
 .TP
 The default is: \fBno\fR
 .RE
 .
 .
 .TP
+.B ignore_wwids
+If set to
+.I no
+, multipath and multipathd only consider paths matching a WWID listed in
+the \fBwwids_file\fR as valid multipath devices. Automatic multipath map
+creation is disabled.
+.
+If set to
+.I yes
+, and \fBfind_multipaths\fR is
+.I no
+, every device which is not blacklisted
+will be treated as a potential multipath member. See above for
+the logic if  \fBfind_multipaths\fR is set to
+.I yes.
+.
+.RS
+.TP
+The default is: \fBoff\fR
+.RE
+.
+.
+.TP
 .B uxsock_timeout
 CLI receive timeout in milliseconds. For larger systems CLI commands
 might timeout before the multipathd lock is released and the CLI command
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 09/16] multipath.8: adapt documentation of '-i'
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (7 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 08/16] multipath.conf.5: document "ignore_wwids" Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 10/16] multipathd.8: document that '-n' is now ignored Martin Wilck
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

---
 multipath/multipath.8 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/multipath/multipath.8 b/multipath/multipath.8
index 56f870351ee2..ee5aecd06bf1 100644
--- a/multipath/multipath.8
+++ b/multipath/multipath.8
@@ -94,7 +94,8 @@ Force devmap reload.
 .
 .TP
 .B \-i
-Ignore WWIDs file when processing devices.
+Ignore WWIDs file when processing devices. This option overrides the config
+file setting \fBignore_wwids\fR.
 .
 .TP
 .B \-B
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 10/16] multipathd.8: document that '-n' is now ignored
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (8 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 09/16] multipath.8: adapt documentation of '-i' Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 11/16] multipath: common code path for CMD_VALID_PATH Martin Wilck
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

---
 multipathd/multipathd.8 | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/multipathd/multipathd.8 b/multipathd/multipathd.8
index 5c96680c0514..ccf23606c0ff 100644
--- a/multipathd/multipathd.8
+++ b/multipathd/multipathd.8
@@ -25,7 +25,6 @@ multipathd \- Multipath daemon.
 .RB [\| \-v\ \c
 .IR verbosity \|]
 .RB [\| \-B \|]
-.RB [\| \-n \|]
 .
 .
 .\" ----------------------------------------------------------------------------
@@ -73,8 +72,8 @@ be viewed by entering '\fIhelp\fR'. When you are finished entering commands, pre
 .
 .TP
 .B \-n
-Ignore new devices. multipathd will not create a multipath device unless the
-WWID for the device is already listed in the WWIDs file.
+\fBDEPRECATED\fR. This parameter is ignored. Use the config file parameter
+\fB"ignore_wwids = no"\fR to obtain the effect this option used to have.
 .
 .
 .\" ----------------------------------------------------------------------------
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 11/16] multipath: common code path for CMD_VALID_PATH
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (9 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 10/16] multipathd.8: document that '-n' is now ignored Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 12/16] multipath -u/-c: change output to environment/key format Martin Wilck
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

Print the result message in one place only. This simplifies
future changes.
---
 multipath/main.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index b7e5cf46fe0f..4b3d3a94b282 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -336,6 +336,17 @@ out:
 	return r;
 }
 
+static void print_cmd_valid(const char *devpath, int k)
+{
+	const char *msg[] = { "is", "is not" };
+
+	if (k < 0 || k >= sizeof(msg))
+		return;
+
+	printf("%s %s a valid multipath device path\n",
+	       devpath, msg[k]);
+}
+
 /*
  * Return value:
  *  -1: Retry
@@ -377,10 +388,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 	    cmd != CMD_REMOVE_WWID &&
 	    (filter_devnode(conf->blist_devnode,
 			    conf->elist_devnode, dev) > 0)) {
-		if (cmd == CMD_VALID_PATH)
-			printf("%s is not a valid multipath device path\n",
-			       devpath);
-		goto out;
+		goto print_valid;
 	}
 
 	/*
@@ -393,7 +401,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 		if (!refwwid) {
 			condlog(4, "%s: failed to get wwid", devpath);
 			if (failed == 2 && cmd == CMD_VALID_PATH)
-				printf("%s is not a valid multipath device path\n", devpath);
+				goto print_valid;
 			else
 				condlog(3, "scope is null");
 			goto out;
@@ -432,9 +440,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 				r = 0;
 			if (r == 0 ||
 			    !conf->find_multipaths || !conf->ignore_wwids) {
-				printf("%s %s a valid multipath device path\n",
-				       devpath, r == 0 ? "is" : "is not");
-				goto out;
+				goto print_valid;
 			}
 		}
 	}
@@ -477,9 +483,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 		 * the refwwid, then the path is valid */
 		if (VECTOR_SIZE(curmp) != 0 || VECTOR_SIZE(pathvec) > 1)
 			r = 0;
-		printf("%s %s a valid multipath device path\n",
-		       devpath, r == 0 ? "is" : "is not");
-		goto out;
+		goto print_valid;
 	}
 
 	if (cmd != CMD_CREATE && cmd != CMD_DRY_RUN) {
@@ -493,6 +497,10 @@ configure (struct config *conf, enum mpath_cmds cmd,
 	r = coalesce_paths(&vecs, NULL, refwwid,
 			   conf->force_reload, cmd);
 
+print_valid:
+	if (cmd == CMD_VALID_PATH)
+		print_cmd_valid(devpath, r);
+
 out:
 	if (refwwid)
 		FREE(refwwid);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 12/16] multipath -u/-c: change output to environment/key format
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (10 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 11/16] multipath: common code path for CMD_VALID_PATH Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 13/16] multipath -u/-c: add "$DEV is maybe a valid path" Martin Wilck
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

... instead of free format. This provides more flexibility
for udev rule processing for the future. Adapt code in multipath.rules.
The exit status remains as usual.
---
 multipath/main.c          | 11 +++++------
 multipath/multipath.rules |  6 +++---
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 4b3d3a94b282..2127daebcb8f 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -336,15 +336,14 @@ out:
 	return r;
 }
 
-static void print_cmd_valid(const char *devpath, int k)
+static void print_cmd_valid(int k)
 {
-	const char *msg[] = { "is", "is not" };
+	int vals[] = { 1, 0 };
 
-	if (k < 0 || k >= sizeof(msg))
+	if (k < 0 || k >= sizeof(vals))
 		return;
 
-	printf("%s %s a valid multipath device path\n",
-	       devpath, msg[k]);
+	printf("DM_MULTIPATH_DEVICE_PATH=\"%d\"\n", vals[k]);
 }
 
 /*
@@ -499,7 +498,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 
 print_valid:
 	if (cmd == CMD_VALID_PATH)
-		print_cmd_valid(devpath, r);
+		print_cmd_valid(r);
 
 out:
 	if (refwwid)
diff --git a/multipath/multipath.rules b/multipath/multipath.rules
index 6f8ee2be0a58..5b3c3c9c1135 100644
--- a/multipath/multipath.rules
+++ b/multipath/multipath.rules
@@ -19,9 +19,9 @@ LABEL="test_dev"
 ENV{MPATH_SBIN_PATH}="/sbin"
 TEST!="$env{MPATH_SBIN_PATH}/multipath", ENV{MPATH_SBIN_PATH}="/usr/sbin"
 
-ENV{DM_MULTIPATH_DEVICE_PATH}!="1", \
-	PROGRAM=="$env{MPATH_SBIN_PATH}/multipath -u %k", \
-	ENV{DM_MULTIPATH_DEVICE_PATH}="1", ENV{ID_FS_TYPE}="mpath_member", \
+# multipath -u sets DM_MULTIPATH_DEVICE_PATH
+ENV{DM_MULTIPATH_DEVICE_PATH}!="1", IMPORT="$env{MPATH_SBIN_PATH}/multipath -u %k"
+ENV{DM_MULTIPATH_DEVICE_PATH}=="1", ENV{ID_FS_TYPE}="mpath_member", \
 	ENV{SYSTEMD_READY}="0"
 
 LABEL="end_mpath"
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 13/16] multipath -u/-c: add "$DEV is maybe a valid path"
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (11 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 12/16] multipath -u/-c: change output to environment/key format Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic Martin Wilck
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

Use DM_MULTIPATH_DEVICE_PATH="2" to indicate that this might be a
valid path, but we aren't certain. This happens with find_multipaths
and ignore_wwids set, when the first path to a device (or a single-path
device) is encountered. In this case, multipath needs to return
with status 0 so that the udev IMPORT statement works.
---
 multipath/main.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 2127daebcb8f..c65793d5a9e9 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -336,14 +336,15 @@ out:
 	return r;
 }
 
-static void print_cmd_valid(int k)
+static int print_cmd_valid(int k)
 {
-	int vals[] = { 1, 0 };
+	int vals[] = { 1, 0, 2 };
 
 	if (k < 0 || k >= sizeof(vals))
-		return;
+		return 1;
 
 	printf("DM_MULTIPATH_DEVICE_PATH=\"%d\"\n", vals[k]);
+	return k == 1;
 }
 
 /*
@@ -482,6 +483,9 @@ configure (struct config *conf, enum mpath_cmds cmd,
 		 * the refwwid, then the path is valid */
 		if (VECTOR_SIZE(curmp) != 0 || VECTOR_SIZE(pathvec) > 1)
 			r = 0;
+		else
+			/* Use r=2 as an indication for "maybe" */
+			r = 2;
 		goto print_valid;
 	}
 
@@ -498,7 +502,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 
 print_valid:
 	if (cmd == CMD_VALID_PATH)
-		print_cmd_valid(r);
+		r = print_cmd_valid(r);
 
 out:
 	if (refwwid)
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (12 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 13/16] multipath -u/-c: add "$DEV is maybe a valid path" Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19 18:12   ` Benjamin Marzinski
  2018-01-20  0:27   ` [FIX for 14/16] multipath.rules: set job properties for systemd-run correctly Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 15/16] libmultipath: trigger change uevent on new device creation Martin Wilck
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

Solve the problem that, when the first path to a device appears,
we don't know if more paths are going to follow.

These rules apply only if both find_multipaths and ignore_wwids are
set in multipath.conf.

multipath -u sets DM_MULTIPATH_DEVICE_PATH=2 if a device is "maybe"
a multipath member (not blacklisted, only one path seen).
In that case, pretend to be a multipath member and disallow further
processing by systemd (allowing multipathd some time to grab the path),
and check again after some time. If the path is still not multipathed by then,
pass it on to systemd for further processing. Ensure that this happens only
once. The timeout values FIND_MULTIPATHS_BOOT_TMO (time to wait since system
boot) and FIND_MULTIPATHS_PATH_TMO (time to wait after detection of first
path) can be configured in udev rules that are run before "multipath.rules".
The earlier timeout wins, thus if the first path is detected after
FIND_MULTIPATHS_BOOT_TMO has expired, the timer will expire immediately.
---
 multipath/multipath.rules | 59 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

diff --git a/multipath/multipath.rules b/multipath/multipath.rules
index 5b3c3c9c1135..6b4a418d0009 100644
--- a/multipath/multipath.rules
+++ b/multipath/multipath.rules
@@ -19,9 +19,64 @@ LABEL="test_dev"
 ENV{MPATH_SBIN_PATH}="/sbin"
 TEST!="$env{MPATH_SBIN_PATH}/multipath", ENV{MPATH_SBIN_PATH}="/usr/sbin"
 
+# find_multipaths + ignore_wwids logic, part 1
+#
+# Recover environment on next uevent after waiting for the timer.
+# This happens only once because DM_MULTIPATH_SAVED_FS_TYPE will be empty
+# on subsequent events.
+
+IMPORT{db}="DM_MULTIPATH_WAIT_DONE"
+IMPORT{db}="DM_MULTIPATH_SAVED_FS_TYPE"
+ENV{DM_MULTIPATH_WAIT_DONE}!="1", GOTO="skip_restore"
+ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="", GOTO="skip_restore"
+
+# Reset if it's our dummy value, see below
+ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="dm_multipath_unkown", ENV{DM_MULTIPATH_SAVED_FS_TYPE}=""
+ENV{ID_FS_TYPE}="$env{DM_MULTIPATH_SAVED_FS_TYPE}"
+ENV{DM_MULTIPATH_SAVED_FS_TYPE}=""
+ENV{DM_MULTIPATH_DEVICE_PATH}=""
+ENV{SYSTEMD_READY}=""
+
+# Special case:
+# An "add" event happened while we were waiting for the timer.
+# This happens during "coldplug" after switching root FS, if a timer
+# started during initramfs processing was interrupted. We may not have
+# waited long enough: try again.
+ACTION=="add", ENV{DM_MULTIPATH_WAIT_DONE}=""
+
+LABEL="skip_restore"
+
 # multipath -u sets DM_MULTIPATH_DEVICE_PATH
-ENV{DM_MULTIPATH_DEVICE_PATH}!="1", IMPORT="$env{MPATH_SBIN_PATH}/multipath -u %k"
+ENV{DM_MULTIPATH_DEVICE_PATH}!="1", IMPORT{program}="$env{MPATH_SBIN_PATH}/multipath -u %k"
 ENV{DM_MULTIPATH_DEVICE_PATH}=="1", ENV{ID_FS_TYPE}="mpath_member", \
-	ENV{SYSTEMD_READY}="0"
+	ENV{SYSTEMD_READY}="0", GOTO="end_mpath"
+
+# find_multipaths + ignore_wwids logic, part 2
+#
+# multipath -u sets DM_MULTIPATH_DEVICE_PATH=2 if a device is "maybe"
+# a multipath member (not blacklisted, only one path seen).
+# In that case, pretend to be a multipath member (allowing multipathd
+# some time to grab the path), and check again after some time.
+# If the path is still not multipathed by then, pass it on to systemd
+# for further processing.
+# DM_MULTIPATH_WAIT_DONE ensures that this happens only once.
+# (But see "special case" exception above).
+
+ENV{DM_MULTIPATH_WAIT_DONE}=="1", GOTO="end_mpath"
+ENV{DM_MULTIPATH_DEVICE_PATH}!="2", GOTO="end_mpath"
+
+# Default timeout values for the timer. Use early udev rules files to customize.
+# Timeouts are in seconds after system boot, and seconds after first path
+# discovery, respectively. The earlier timeout wins.
+ENV{FIND_MULTIPATHS_BOOT_TMO}!="?*", ENV{FIND_MULTIPATHS_BOOT_TMO}="180"
+ENV{FIND_MULTIPATHS_PATH_TMO}!="?*", ENV{FIND_MULTIPATHS_PATH_TMO}="30"
+
+ENV{DM_MULTIPATH_WAIT_DONE}="1"
+ENV{DM_MULTIPATH_SAVED_FS_TYPE}="$env{ID_FS_TYPE}"
+ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="", ENV{DM_MULTIPATH_SAVED_FS_TYPE}="dm_multipath_unkown"
+ENV{ID_FS_TYPE}="maybe_mpath_member"
+ENV{DM_MULTIPATH_DEVICE_PATH}="1"
+ENV{SYSTEMD_READY}="0"
+RUN+="/usr/bin/systemd-run --action change --on-boot $env{FIND_MULTIPATHS_BOOT_TMO} --on-active $env{FIND_MULTIPATHS_PATH_TMO} /usr/bin/udevadm trigger $sys$devpath"
 
 LABEL="end_mpath"
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 15/16] libmultipath: trigger change uevent on new device creation
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (13 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-01-19  0:29 ` [RFC PATCH 16/16] libmultipath: trigger path uevent only when necessary Martin Wilck
  2018-03-07  8:53 ` [RFC PATCH 00/16] multipath path classification Christophe Varoqui
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

From: Benjamin Marzinski <bmarzins@redhat.com>

When multipath first sees a path device with find_multipaths
enabled, it can't know if the device should be multipathed. This means
that it will not claim the device in udev.  If the device is eventually
multipathed, multipath should trigger a change uevent to update the udev
database to claim the device.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/configure.c | 26 ++++++++++++++++++++++++--
 libmultipath/configure.h |  1 +
 libmultipath/wwids.c     |  2 +-
 multipath/main.c         |  2 +-
 multipathd/main.c        |  3 ++-
 5 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index 325018a78b95..c08ea5b8decc 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -433,6 +433,28 @@ trigger_udev_change(const struct multipath *mpp)
 	udev_device_unref(udd);
 }
 
+void
+trigger_paths_udev_change(const struct multipath *mpp)
+{
+	struct pathgroup * pgp;
+	struct path * pp;
+	int i, j;
+
+	if (!mpp || !mpp->pg)
+		return;
+
+	vector_foreach_slot (mpp->pg, pgp, i) {
+		if (!pgp->paths)
+			continue;
+		vector_foreach_slot(pgp->paths, pp, j) {
+			if (!pp->udev)
+				continue;
+			sysfs_attr_set_value(pp->udev, "uevent", "change",
+					     strlen("change"));
+		}
+	}
+}
+
 static int
 is_mpp_known_to_udev(const struct multipath *mpp)
 {
@@ -827,8 +849,8 @@ int domap(struct multipath *mpp, char *params, int is_daemon)
 		 * succeeded
 		 */
 		mpp->force_udev_reload = 0;
-		if (mpp->action == ACT_CREATE)
-			remember_wwid(mpp->wwid);
+		if (mpp->action == ACT_CREATE && remember_wwid(mpp->wwid) == 1)
+			trigger_paths_udev_change(mpp);
 		if (!is_daemon) {
 			/* multipath client mode */
 			dm_switchgroup(mpp->alias, mpp->bestpg);
diff --git a/libmultipath/configure.h b/libmultipath/configure.h
index 0ffc28efdaf7..09603ba81733 100644
--- a/libmultipath/configure.h
+++ b/libmultipath/configure.h
@@ -37,3 +37,4 @@ int get_refwwid (enum mpath_cmds cmd, char * dev, enum devtypes dev_type,
 int reload_map(struct vectors *vecs, struct multipath *mpp, int refresh, int is_daemon);
 int sysfs_get_host_adapter_name(struct path *pp, char *adapter_name);
 struct udev_device *get_udev_device(const char *dev, enum devtypes dev_type);
+void trigger_paths_udev_change(const struct multipath *mpp);
diff --git a/libmultipath/wwids.c b/libmultipath/wwids.c
index 655ccbe2c94a..098aa6f67b77 100644
--- a/libmultipath/wwids.c
+++ b/libmultipath/wwids.c
@@ -326,5 +326,5 @@ remember_wwid(char *wwid)
 		condlog(3, "wrote wwid %s to wwids file", wwid);
 	else
 		condlog(4, "wwid %s already in wwids file", wwid);
-	return 0;
+	return ret;
 }
diff --git a/multipath/main.c b/multipath/main.c
index c65793d5a9e9..528c3fe1f50f 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -419,7 +419,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 		}
 		if (cmd == CMD_ADD_WWID) {
 			r = remember_wwid(refwwid);
-			if (r == 0)
+			if (r >= 0)
 				printf("wwid '%s' added\n", refwwid);
 			else
 				printf("failed adding '%s' to wwids file\n",
diff --git a/multipathd/main.c b/multipathd/main.c
index 6f612f6d3d2f..5807b42c2b2c 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2048,7 +2048,8 @@ configure (struct vectors * vecs, int start_waiters)
 
 	sync_maps_state(mpvec);
 	vector_foreach_slot(mpvec, mpp, i){
-		remember_wwid(mpp->wwid);
+		if (remember_wwid(mpp->wwid) == 1)
+			trigger_paths_udev_change(mpp);
 		update_map_pr(mpp);
 	}
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC PATCH 16/16] libmultipath: trigger path uevent only when necessary
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (14 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 15/16] libmultipath: trigger change uevent on new device creation Martin Wilck
@ 2018-01-19  0:29 ` Martin Wilck
  2018-03-07  8:53 ` [RFC PATCH 00/16] multipath path classification Christophe Varoqui
  16 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-19  0:29 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Benjamin Marzinski
  Cc: Xose Vazquez Perez, Martin Wilck

Paths that are already classified as DM_MULTIPATH_DEVICE_PATH don't
need to be retriggered. As an exception, trigger events for paths in
"maybe" state that are waiting to be updated.
---
 libmultipath/configure.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index c08ea5b8decc..f8d1fa452014 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -447,8 +447,28 @@ trigger_paths_udev_change(const struct multipath *mpp)
 		if (!pgp->paths)
 			continue;
 		vector_foreach_slot(pgp->paths, pp, j) {
+			const char *env;
+
 			if (!pp->udev)
 				continue;
+			/*
+			 * Paths that are already classified as multipath
+			 * members don't need another uevent.
+			 */
+			env = udev_device_get_property_value(
+				pp->udev, "DM_MULTIPATH_DEVICE_PATH");
+			if (env != NULL && !strcmp(env, "1")) {
+				/*
+				 * Nonempty DM_MULTIPATH_SAVED_FS_TYPE means
+				 * path is in "maybe" state and timer is running
+				 * Send uevent now (see multipath.rules).
+				 */
+				env = udev_device_get_property_value(
+					pp->udev, "DM_MULTIPATH_SAVED_FS_TYPE");
+				if (env == NULL)
+					continue;
+			}
+			condlog(4, "triggering change uevent for %s", pp->dev);
 			sysfs_attr_set_value(pp->udev, "uevent", "change",
 					     strlen("change"));
 		}
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 04/16] libmultipath: should_multipath: keep existing maps
  2018-01-19  0:29 ` [RFC PATCH 04/16] libmultipath: should_multipath: keep existing maps Martin Wilck
@ 2018-01-19 16:06   ` Benjamin Marzinski
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Marzinski @ 2018-01-19 16:06 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, Xose Vazquez Perez

On Fri, Jan 19, 2018 at 01:29:04AM +0100, Martin Wilck wrote:
> If with find_multipaths and !ignore_new_devs, if a path is already
> multipathed, keep it. The same logic is applied in multipath -u
> with ignore_wwids.

One question, why the requirement that (first_path(mp) != NULL)?
It seems possible that we have a multipath device that is in use and
loses all of it's path devices.  When the first path is rediscovered,
we want to add it back immediately.  Obviously, the wwid check should
take care of this. But if we think we need this check as a backup, then
I don't see why we wouldn't want to accept the first path?

Also, I have one small worry, but I'm pretty convinced that I'm just
being paranoid. There is a case where this will be more lenient in
accepting devices than our existing code, and possibly too lenient.
Right now, we only add a wwid to the wwids file after a multipath device
has been successfully created.  But we add the multipath device to the
mpvec before we try to create it.  I can't think of a case where we
would add a device the the mpvec where we shouldn't be accepting paths
to that device as multipathable, but the ordering difference seems a
little worrying.

Both of these could be avoided, albeit a little slower, by actually
getting a listing multipath devices from dm, and checking against their
wwids. This is what the multipath command is doing, so we would be using
the exact same logic in both places.  I really wish you could get a list
of dm devices by wwid as easily as you can by name.

Another option is, as I mentioned before, to trust that any multipath
device will have it's wwid in the wwids file. Have you seen a case where
this doesn't happen?

-Ben

> ---
>  libmultipath/wwids.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/libmultipath/wwids.c b/libmultipath/wwids.c
> index fcbf5281b491..828a3de5b5cb 100644
> --- a/libmultipath/wwids.c
> +++ b/libmultipath/wwids.c
> @@ -287,6 +287,13 @@ should_multipath(struct path *pp1, vector pathvec, vector mpvec)
>  
>  	condlog(4, "checking if %s should be multipathed", pp1->dev);
>  	if (!ignore_new_devs) {
> +		struct multipath *mp = find_mp_by_wwid(mpvec, pp1->wwid);
> +
> +		if (mp != NULL && first_path(mp) != NULL) {
> +			condlog(3, "wwid %s is already multipathed, keeping it",
> +				pp1->wwid);
> +			return 1;
> +		}
>  		vector_foreach_slot(pathvec, pp2, i) {
>  			if (pp1->dev == pp2->dev)
>  				continue;
> -- 
> 2.15.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-19  0:29 ` [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic Martin Wilck
@ 2018-01-19 18:12   ` Benjamin Marzinski
  2018-01-20  1:20     ` Martin Wilck
  2018-01-20  0:27   ` [FIX for 14/16] multipath.rules: set job properties for systemd-run correctly Martin Wilck
  1 sibling, 1 reply; 30+ messages in thread
From: Benjamin Marzinski @ 2018-01-19 18:12 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, Xose Vazquez Perez

On Fri, Jan 19, 2018 at 01:29:14AM +0100, Martin Wilck wrote:
> Solve the problem that, when the first path to a device appears,
> we don't know if more paths are going to follow.
> 
> These rules apply only if both find_multipaths and ignore_wwids are
> set in multipath.conf.
> 
> multipath -u sets DM_MULTIPATH_DEVICE_PATH=2 if a device is "maybe"
> a multipath member (not blacklisted, only one path seen).
> In that case, pretend to be a multipath member and disallow further
> processing by systemd (allowing multipathd some time to grab the path),
> and check again after some time. If the path is still not multipathed by then,
> pass it on to systemd for further processing. Ensure that this happens only
> once. The timeout values FIND_MULTIPATHS_BOOT_TMO (time to wait since system
> boot) and FIND_MULTIPATHS_PATH_TMO (time to wait after detection of first
> path) can be configured in udev rules that are run before "multipath.rules".
> The earlier timeout wins, thus if the first path is detected after
> FIND_MULTIPATHS_BOOT_TMO has expired, the timer will expire immediately.

RedHat defaults to using find_multipaths. Most customers have some local
non-multipathed storage.  As far as I can see, this change is adding at
least 30 seconds to the boot time or every boot in our standard case, to
solve a problem that is either in your Nice-to-Have category, or is as
far as I know, only theoretical, and also which can be fixed by simply
mounting a filesystem, albeit from an emergency shell.  RedHat has been
defaulting to find_multipaths since not long after I first posted it
(which is long before it finally got accepted upstream), and like I
said, I've never receieved a bug report for that issue with new storage
causing boot to fail. I admit, new storage being not multipathed has
happened (when there is existing LVM/MD metadata on it), and cleaning
that up does involve adding the wwid, which isn't obvious, which is why
i made a multipath command to make it as simple as possible. But the
only bug I've ever received for this has been from the RedHat system
installer, where you are adding a lot of new storage all at once, and
it's very likely to have old LVM/MD metadata on it.

I would personally be much happier if you just added the "-i" to the
upstream rules, and RedHat just reverted some commits on our own. 30
seconds feels too long to wait on every boot in the (for us) standard
case, and making the timeout short enough to not be bothersome doesn't
(or didn't when I tried something similar, when I first wrote
find_multipaths) work when there were a large number of devices being
discovered at boot.  And if the timeout is too short, you end up in the
same situation that the current RedHat setup has, with one important
difference. Multipathd is now much more likely to win the race, and
multipath a device that it has not claimed, which is the only way you
can get into the more serious boot issue I described before.

IMHO a better way to avoid this would be to give multipathd a
configurable timeout to wait before creating a multipath device on new
storage if find_multipaths is set, it order to give you the same sort of
guarantee that the race will go one way. I realize that if there is
existing metadata on the device or a fstab entry for it, this would make
the new device always be incorrectly not multipathed, but it has the
benefit of not slowing boot and reducing the chance of the worst
outcome, regardless of how long the timeout is. It would essentially be
like keeping the guarantees of your original "imply -n" patch, but with
the benefit that if nothing else wanted to assemble on the device,
multipathd would eventually try and succeed, and then update the udev
database afterwards.

Thoughts?
-Ben

> ---
>  multipath/multipath.rules | 59 +++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 57 insertions(+), 2 deletions(-)
> 
> diff --git a/multipath/multipath.rules b/multipath/multipath.rules
> index 5b3c3c9c1135..6b4a418d0009 100644
> --- a/multipath/multipath.rules
> +++ b/multipath/multipath.rules
> @@ -19,9 +19,64 @@ LABEL="test_dev"
>  ENV{MPATH_SBIN_PATH}="/sbin"
>  TEST!="$env{MPATH_SBIN_PATH}/multipath", ENV{MPATH_SBIN_PATH}="/usr/sbin"
>  
> +# find_multipaths + ignore_wwids logic, part 1
> +#
> +# Recover environment on next uevent after waiting for the timer.
> +# This happens only once because DM_MULTIPATH_SAVED_FS_TYPE will be empty
> +# on subsequent events.
> +
> +IMPORT{db}="DM_MULTIPATH_WAIT_DONE"
> +IMPORT{db}="DM_MULTIPATH_SAVED_FS_TYPE"
> +ENV{DM_MULTIPATH_WAIT_DONE}!="1", GOTO="skip_restore"
> +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="", GOTO="skip_restore"
> +
> +# Reset if it's our dummy value, see below
> +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="dm_multipath_unkown", ENV{DM_MULTIPATH_SAVED_FS_TYPE}=""
> +ENV{ID_FS_TYPE}="$env{DM_MULTIPATH_SAVED_FS_TYPE}"
> +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=""
> +ENV{DM_MULTIPATH_DEVICE_PATH}=""
> +ENV{SYSTEMD_READY}=""
> +
> +# Special case:
> +# An "add" event happened while we were waiting for the timer.
> +# This happens during "coldplug" after switching root FS, if a timer
> +# started during initramfs processing was interrupted. We may not have
> +# waited long enough: try again.
> +ACTION=="add", ENV{DM_MULTIPATH_WAIT_DONE}=""
> +
> +LABEL="skip_restore"
> +
>  # multipath -u sets DM_MULTIPATH_DEVICE_PATH
> -ENV{DM_MULTIPATH_DEVICE_PATH}!="1", IMPORT="$env{MPATH_SBIN_PATH}/multipath -u %k"
> +ENV{DM_MULTIPATH_DEVICE_PATH}!="1", IMPORT{program}="$env{MPATH_SBIN_PATH}/multipath -u %k"
>  ENV{DM_MULTIPATH_DEVICE_PATH}=="1", ENV{ID_FS_TYPE}="mpath_member", \
> -	ENV{SYSTEMD_READY}="0"
> +	ENV{SYSTEMD_READY}="0", GOTO="end_mpath"
> +
> +# find_multipaths + ignore_wwids logic, part 2
> +#
> +# multipath -u sets DM_MULTIPATH_DEVICE_PATH=2 if a device is "maybe"
> +# a multipath member (not blacklisted, only one path seen).
> +# In that case, pretend to be a multipath member (allowing multipathd
> +# some time to grab the path), and check again after some time.
> +# If the path is still not multipathed by then, pass it on to systemd
> +# for further processing.
> +# DM_MULTIPATH_WAIT_DONE ensures that this happens only once.
> +# (But see "special case" exception above).
> +
> +ENV{DM_MULTIPATH_WAIT_DONE}=="1", GOTO="end_mpath"
> +ENV{DM_MULTIPATH_DEVICE_PATH}!="2", GOTO="end_mpath"
> +
> +# Default timeout values for the timer. Use early udev rules files to customize.
> +# Timeouts are in seconds after system boot, and seconds after first path
> +# discovery, respectively. The earlier timeout wins.
> +ENV{FIND_MULTIPATHS_BOOT_TMO}!="?*", ENV{FIND_MULTIPATHS_BOOT_TMO}="180"
> +ENV{FIND_MULTIPATHS_PATH_TMO}!="?*", ENV{FIND_MULTIPATHS_PATH_TMO}="30"
> +
> +ENV{DM_MULTIPATH_WAIT_DONE}="1"
> +ENV{DM_MULTIPATH_SAVED_FS_TYPE}="$env{ID_FS_TYPE}"
> +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="", ENV{DM_MULTIPATH_SAVED_FS_TYPE}="dm_multipath_unkown"
> +ENV{ID_FS_TYPE}="maybe_mpath_member"
> +ENV{DM_MULTIPATH_DEVICE_PATH}="1"
> +ENV{SYSTEMD_READY}="0"
> +RUN+="/usr/bin/systemd-run --action change --on-boot $env{FIND_MULTIPATHS_BOOT_TMO} --on-active $env{FIND_MULTIPATHS_PATH_TMO} /usr/bin/udevadm trigger $sys$devpath"
>  
>  LABEL="end_mpath"
> -- 
> 2.15.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [FIX for 14/16] multipath.rules: set job properties for systemd-run correctly
  2018-01-19  0:29 ` [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic Martin Wilck
  2018-01-19 18:12   ` Benjamin Marzinski
@ 2018-01-20  0:27   ` Martin Wilck
  1 sibling, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-01-20  0:27 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski, dm-devel
  Cc: Xose Vazquez Perez, Martin Wilck

The default dependencies must be turned off for both the timer and
the service unit, otherwise systemd-run will wait for sysinit.target.

Moreover, fix these bugs:
 - never run the logic if DM_MULTIPATH_DEVICE_PATH!=2
 - reset DM_MULTIPATH_DEVICE_PATH to "" on timer expiry

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipath/multipath.rules | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/multipath/multipath.rules b/multipath/multipath.rules
index 6b4a418d0009..659aa586d46e 100644
--- a/multipath/multipath.rules
+++ b/multipath/multipath.rules
@@ -62,8 +62,9 @@ ENV{DM_MULTIPATH_DEVICE_PATH}=="1", ENV{ID_FS_TYPE}="mpath_member", \
 # DM_MULTIPATH_WAIT_DONE ensures that this happens only once.
 # (But see "special case" exception above).
 
-ENV{DM_MULTIPATH_WAIT_DONE}=="1", GOTO="end_mpath"
 ENV{DM_MULTIPATH_DEVICE_PATH}!="2", GOTO="end_mpath"
+ENV{DM_MULTIPATH_WAIT_DONE}=="1", ENV{DM_MULTIPATH_DEVICE_PATH}="", \
+	GOTO="end_mpath"
 
 # Default timeout values for the timer. Use early udev rules files to customize.
 # Timeouts are in seconds after system boot, and seconds after first path
@@ -77,6 +78,6 @@ ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="", ENV{DM_MULTIPATH_SAVED_FS_TYPE}="dm_multipa
 ENV{ID_FS_TYPE}="maybe_mpath_member"
 ENV{DM_MULTIPATH_DEVICE_PATH}="1"
 ENV{SYSTEMD_READY}="0"
-RUN+="/usr/bin/systemd-run --action change --on-boot $env{FIND_MULTIPATHS_BOOT_TMO} --on-active $env{FIND_MULTIPATHS_PATH_TMO} /usr/bin/udevadm trigger $sys$devpath"
+RUN+="/usr/bin/systemd-run --no-block --timer-property DefaultDependencies=no --property DefaultDependencies=no --on-boot $env{FIND_MULTIPATHS_BOOT_TMO} --on-active $env{FIND_MULTIPATHS_PATH_TMO} /usr/bin/udevadm --action change trigger $sys$devpath"
 
 LABEL="end_mpath"
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-19 18:12   ` Benjamin Marzinski
@ 2018-01-20  1:20     ` Martin Wilck
  2018-01-21  3:21       ` Benjamin Marzinski
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Wilck @ 2018-01-20  1:20 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel, Xose Vazquez Perez

On Fri, 2018-01-19 at 12:12 -0600, Benjamin Marzinski wrote:
> On Fri, Jan 19, 2018 at 01:29:14AM +0100, Martin Wilck wrote:
> > Solve the problem that, when the first path to a device appears,
> > we don't know if more paths are going to follow.
> > 
> > These rules apply only if both find_multipaths and ignore_wwids are
> > set in multipath.conf.
> > 
> > multipath -u sets DM_MULTIPATH_DEVICE_PATH=2 if a device is "maybe"
> > a multipath member (not blacklisted, only one path seen).
> > In that case, pretend to be a multipath member and disallow further
> > processing by systemd (allowing multipathd some time to grab the
> > path),
> > and check again after some time. If the path is still not
> > multipathed by then,
> > pass it on to systemd for further processing. Ensure that this
> > happens only
> > once. The timeout values FIND_MULTIPATHS_BOOT_TMO (time to wait
> > since system
> > boot) and FIND_MULTIPATHS_PATH_TMO (time to wait after detection of
> > first
> > path) can be configured in udev rules that are run before
> > "multipath.rules".
> > The earlier timeout wins, thus if the first path is detected after
> > FIND_MULTIPATHS_BOOT_TMO has expired, the timer will expire
> > immediately.
> 
> RedHat defaults to using find_multipaths. Most customers have some
> local
> non-multipathed storage.  As far as I can see, this change is adding
> at
> least 30 seconds to the boot time or every boot in our standard case,
> to
> solve a problem that is either in your Nice-to-Have category, or is
> as
> far as I know, only theoretical, and also which can be fixed by
> simply
> mounting a filesystem, albeit from an emergency shell.

This is not the case. Firstly, I thought that in an earlier mail you
said that you were using ignore_new_devs = yes in the initrd. (with my
patch that would transform to "ignore_wwids = no"). In that case, this
logic doesn't apply, and no delays occur. Moreover, the timeout
defaults I used can of course be set much lower, so that even with
ignore_wwids = yes, the delay would be hardly noticeable. If you really
dislike it so much, you can set both timeouts to 0 and the behavior
will be as usual ("maybe" paths will be treated as non-multipath).

Lastly, the safest way to get your device *not* multipathed is still
blacklisting, and the safest way to *make sure* it's multipathed is to
add it to the wwids file. Both methods still work, with no delays, and
the wwids file is still updated with every detected multipath map.
Nothing here differs from standard find_multipath usage.

This logic is solely for the case where it's unkown whether or not a
device will be a multipath device.

>   RedHat has been
> defaulting to find_multipaths since not long after I first posted it
> (which is long before it finally got accepted upstream), and like I
> said, I've never receieved a bug report for that issue with new
> storage
> causing boot to fail. 

That's a thing Hannes and I need to think about ;-)

> I would personally be much happier if you just added the "-i" to the
> upstream rules, and RedHat just reverted some commits on our own. 

Hm, I thought you disliked the "-i"... Maybe the most important point
of this patch series is the idea not to use this option any more, but
use a config file option instead which is shared between multipath and
multipathd. I'm not sure I understand what you're asking for.

> 30
> seconds feels too long to wait on every boot in the (for us) standard
> case, and making the timeout short enough to not be bothersome
> doesn't
> (or didn't when I tried something similar, when I first wrote
> find_multipaths) work when there were a large number of devices being
> discovered at boot.  

Of course it isn't perfect, and timeouts are something that's
impossible to get right for all use cases, we know that all too well.
However, if we don't know whether or not additional paths will be
detected, what else can we do except wait? If you have a system where
you see one path, then 5000 other devices, and finally another path to
the same device after 30 minutes, this approach can't work, and the
admin would be well-advised to add the device to the wwids file.

But it can avoid unpleasant surprises in other cases, I believe.
v
> And if the timeout is too short, you end up in the
> same situation that the current RedHat setup has, with one important
> difference. Multipathd is now much more likely to win the race, and
> multipath a device that it has not claimed, which is the only way you
> can get into the more serious boot issue I described before.

I don't get that. Multipath only "wins" if it sees more than a single
path, in which case it's correct that it "wins".

> IMHO a better way to avoid this would be to give multipathd a
> configurable timeout to wait before creating a multipath device on
> new
> storage if find_multipaths is set, it order to give you the same sort
> of
> guarantee that the race will go one way.

You mean, wait with a single path, and then after some time, grab the
path even if no other paths are found? I don't like that idea. After
all, the longer we wait, the less likely it becomes that another path
will pop up. 

>  I realize that if there is
> existing metadata on the device or a fstab entry for it, this would
> make
> the new device always be incorrectly not multipathed, but it has the
> benefit of not slowing boot and reducing the chance of the worst
> outcome, regardless of how long the timeout is. 

I haven't understood what you consider the worst outcome, but I admit
I've yet to read your other longer posting in detail. In my opinion the
worst thing is devices being permanently unavailable, and I'm quite
positive that my patch can avoid that situation.

> It would essentially be
> like keeping the guarantees of your original "imply -n" patch, but
> with
> the benefit that if nothing else wanted to assemble on the device,
> multipathd would eventually try and succeed, and then update the udev
> database afterwards.

I understand that you're not too happy about my proposal. Be assured
that it's not my intention to break Red Hat's default setup. This stuff
needs some more testing and work to stabilize, but when that's done,
I'd be grateful if you'd give it a try in practice.

Regards
Martin

> 
> Thoughts?
> -Ben
> 
> > ---
> >  multipath/multipath.rules | 59
> > +++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 57 insertions(+), 2 deletions(-)
> > 
> > diff --git a/multipath/multipath.rules b/multipath/multipath.rules
> > index 5b3c3c9c1135..6b4a418d0009 100644
> > --- a/multipath/multipath.rules
> > +++ b/multipath/multipath.rules
> > @@ -19,9 +19,64 @@ LABEL="test_dev"
> >  ENV{MPATH_SBIN_PATH}="/sbin"
> >  TEST!="$env{MPATH_SBIN_PATH}/multipath",
> > ENV{MPATH_SBIN_PATH}="/usr/sbin"
> >  
> > +# find_multipaths + ignore_wwids logic, part 1
> > +#
> > +# Recover environment on next uevent after waiting for the timer.
> > +# This happens only once because DM_MULTIPATH_SAVED_FS_TYPE will
> > be empty
> > +# on subsequent events.
> > +
> > +IMPORT{db}="DM_MULTIPATH_WAIT_DONE"
> > +IMPORT{db}="DM_MULTIPATH_SAVED_FS_TYPE"
> > +ENV{DM_MULTIPATH_WAIT_DONE}!="1", GOTO="skip_restore"
> > +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="", GOTO="skip_restore"
> > +
> > +# Reset if it's our dummy value, see below
> > +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="dm_multipath_unkown",
> > ENV{DM_MULTIPATH_SAVED_FS_TYPE}=""
> > +ENV{ID_FS_TYPE}="$env{DM_MULTIPATH_SAVED_FS_TYPE}"
> > +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=""
> > +ENV{DM_MULTIPATH_DEVICE_PATH}=""
> > +ENV{SYSTEMD_READY}=""
> > +
> > +# Special case:
> > +# An "add" event happened while we were waiting for the timer.
> > +# This happens during "coldplug" after switching root FS, if a
> > timer
> > +# started during initramfs processing was interrupted. We may not
> > have
> > +# waited long enough: try again.
> > +ACTION=="add", ENV{DM_MULTIPATH_WAIT_DONE}=""
> > +
> > +LABEL="skip_restore"
> > +
> >  # multipath -u sets DM_MULTIPATH_DEVICE_PATH
> > -ENV{DM_MULTIPATH_DEVICE_PATH}!="1",
> > IMPORT="$env{MPATH_SBIN_PATH}/multipath -u %k"
> > +ENV{DM_MULTIPATH_DEVICE_PATH}!="1",
> > IMPORT{program}="$env{MPATH_SBIN_PATH}/multipath -u %k"
> >  ENV{DM_MULTIPATH_DEVICE_PATH}=="1",
> > ENV{ID_FS_TYPE}="mpath_member", \
> > -	ENV{SYSTEMD_READY}="0"
> > +	ENV{SYSTEMD_READY}="0", GOTO="end_mpath"
> > +
> > +# find_multipaths + ignore_wwids logic, part 2
> > +#
> > +# multipath -u sets DM_MULTIPATH_DEVICE_PATH=2 if a device is
> > "maybe"
> > +# a multipath member (not blacklisted, only one path seen).
> > +# In that case, pretend to be a multipath member (allowing
> > multipathd
> > +# some time to grab the path), and check again after some time.
> > +# If the path is still not multipathed by then, pass it on to
> > systemd
> > +# for further processing.
> > +# DM_MULTIPATH_WAIT_DONE ensures that this happens only once.
> > +# (But see "special case" exception above).
> > +
> > +ENV{DM_MULTIPATH_WAIT_DONE}=="1", GOTO="end_mpath"
> > +ENV{DM_MULTIPATH_DEVICE_PATH}!="2", GOTO="end_mpath"
> > +
> > +# Default timeout values for the timer. Use early udev rules files
> > to customize.
> > +# Timeouts are in seconds after system boot, and seconds after
> > first path
> > +# discovery, respectively. The earlier timeout wins.
> > +ENV{FIND_MULTIPATHS_BOOT_TMO}!="?*",
> > ENV{FIND_MULTIPATHS_BOOT_TMO}="180"
> > +ENV{FIND_MULTIPATHS_PATH_TMO}!="?*",
> > ENV{FIND_MULTIPATHS_PATH_TMO}="30"
> > +
> > +ENV{DM_MULTIPATH_WAIT_DONE}="1"
> > +ENV{DM_MULTIPATH_SAVED_FS_TYPE}="$env{ID_FS_TYPE}"
> > +ENV{DM_MULTIPATH_SAVED_FS_TYPE}=="",
> > ENV{DM_MULTIPATH_SAVED_FS_TYPE}="dm_multipath_unkown"
> > +ENV{ID_FS_TYPE}="maybe_mpath_member"
> > +ENV{DM_MULTIPATH_DEVICE_PATH}="1"
> > +ENV{SYSTEMD_READY}="0"
> > +RUN+="/usr/bin/systemd-run --action change --on-boot
> > $env{FIND_MULTIPATHS_BOOT_TMO} --on-active
> > $env{FIND_MULTIPATHS_PATH_TMO} /usr/bin/udevadm trigger
> > $sys$devpath"
> >  
> >  LABEL="end_mpath"
> > -- 
> > 2.15.1
> 
> 

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-20  1:20     ` Martin Wilck
@ 2018-01-21  3:21       ` Benjamin Marzinski
  2018-01-22 21:56         ` Martin Wilck
  0 siblings, 1 reply; 30+ messages in thread
From: Benjamin Marzinski @ 2018-01-21  3:21 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, Xose Vazquez Perez

I apologize in advance for how long this is. I feel like that guy who
corners people at parties to talk about politics and won't shut up.  I
will reply to the specific points of your last email later, with much
shorter answers.

I think we are talking past each other a bit here.  I know that in my
last reply I was building on points from my earlier reply without making
that clear enough.  But lets go back to stating the problem in a way we
can agree on, and then see if I can explain my thoughts better from that
common ground.

There are four classes of potential path devices that multipath sees:
        1. blacklisted devices
        2. devices in the wwids file
        3. devices that are neither blacklisted or in the wwids file
           but that you know will be multipathed.
           (basically new devices in the non-find_multipaths case)
        4. devices that are neither blacklisted or in the wwids file
           where don't know if they will be multipathed.
           (basically new devices and single paths in the
            find_multipaths case)

The problem we are trying to deal with is only about the 4th class.  In
the other classes we all agree that multipath and multipathd can get the
correct answer immediately.

There are three subsets of this 4th class of devices:
        4A. The device should not be multipathed
        4B. The device should be multipathed and nothing else wants to
            use it
        4C. The device should be multipathed but something else wants to
            use it

4A: If in reality, the device should not be multipathed, then mutipathd
will never assemble on the device. So there are only two possible
outcomes

        1. The device is not claimed by multipath, and is not
           multipathed
        2. The device is claimed by multipath, but not multipathed

Outcome 1 is the correct one. multipath temporarily claiming a device,
and then unclaiming it in a timely manner is also Outcome 1.

Outcome 2 is very bad.  This was the cause of your "imply -n if
find_mutipaths" patch. This is why RedHat never runs "multipath -u" with
"-i".  If multipath claims a device and multipathd doesn't assemble on
it, nobody can use the device, and the system can become unusable. Even
worse, since we don't have anything like an ignore-these-wwids file
(that's what the blacklist is for, but that's class 1, and we're only
looking at class 4 here) you can hit this every time you discover that
device. This is an outcome that any solution must absolutely avoid.


4B: If in reality, the device should be multipathed and there is nothing
else that wants to use the path device, multipathd should always be able
to assemble on the device.  However, if you run mutipathd with -n, that
is not the case.  Thus, there are four possible outcomes.

        1. The device is not claimed by multipath, and is not
           multipathed
        2. The device is claimed by multipath, but not multipathed
        3. The device is not claimed by multipath, but is multipathed
        4. The device is claimed by multipath and is multipathed

Outcome 1 is the multpathd -n case, working correctly. It is pretty
suboptimal, since you could have safely assembled the multipath device.
However, multipathd couldn't know before-hand that there was nothing
else that wanted to use the device.

Outcome 2 is the multipathd -n case, without synchronization with
mutipath.  This isn't as bad as Outcome 2 in the 4A class, because
nothing wants the device, but there is still no reason for this to ever
happen.

Outcome 3 is a little sloppy, but assuming that multipathd can claim the
device afterwards, it appears completely correct to the user, and will
only happen once. On future boots this device will be in the wwids file
(class 2).

Outcome 4 is the correct one


4C: If in reality, the device should be multipathed but there is
something else that also wants to use the device, there are four
possible outcomes:

        1. The device is not claimed by multipath, and is not
           multipathed
        2. The device is claimed by multipath, but not multipathed
        3. The device is not claimed by multipath, but is multipathed
        4. The device is claimed by multipath and is multipathed

Outcome 1 is suboptimal, since the device really should be multipathed,
but the system will still be usable (albeit, with only a single path to
the storage).  However, this is fixable for future boots, by adding the
wwid to the wwids file.

Outcome 2 is just as bad as Outcome 2 in class 4A. Of course, if the
device is supposed to be multipathed, and is claimed by multipath, it is
very likely that multipathd will assemble on it, so this is an extremely
rare case.

Outcome 3 is the cause of the never actually observed bug I explained in
an earlier eamil. If multipath doesn't claim the device, then whatever
else wants to use it will go ahead and try.  If multipathd comes along
and assembles on the device, that can keep the other user from being
able to actually use the device as it was planning to. The other user
may see the multipth device and try using it after failing on the path
device, but it could simply give up after failing on the path device.  I
want to note that this can only happen if the new multipathable storage
already has metadata on it that is supposed to get autoassembled,
mounted, etc. Further, as both of us have pointed out during this email
thread, multipathd is very unlikely to win this race. It starts later
than other things that use the devices, and find_multipaths has to wait
for two paths to appear before it can start to assemble the device,
while other users can begin right away. Outcome 1 is what happens when
multipathd fails this race.

Outcome 4 is the correct one.

We also know something about the relative frequency of these various
classes (4A, 4B, and 4C). Class 4A devices are seen every single boot
when there are single path devices and find_multipaths is set. Any
solution must do this one right because this is the general case.
Classes 4B and 4C are very rare in comparison. A chunk of users will
never encounter these classes of devices.  I'm not sure how 4B and 4C
compare to each other, but if I had to guess, I would assume that 4B is
more common than 4C.

RedHat's current solution guarantees that you always get Outcome 1 for
4A devices, Outcome 3 for 4B devices, and either Outcome 1 or Outcome 3
for 4C devices (however in practice, 4C Outcome 3 has never been
reported).

SUSE's "imply -n on find_multipaths" solution guarantees that you always
get Outcome 1 for 4A devices, Outcome 1 for 4B devices, and Outcome 1
for 4C devices.

Hopefully we agree on the above analysis. If you think I'm wrong in part
of it, please let me know, because this is what I'm reasoning from. Now
on to your and my proposed solutions.

Your proposed solution guarantees that you always get Outcome 1 for 4A
devices.

After that it gets a little trickier. Your solution involves a timeout,
and that timeout can delay booting if there are 4A devices. Even if we
do the equivalent of "multipath -n" in the initramfs, there are often
still filesystems that need to mount after we switch-root. Those will
get delayed, and the machine may not be usable until they are mounted. I
really do feel that this will not be a rare case at all. You pointed out
that this can be dealt with by decreasing the timeout, even all the way
to 0.  I think that since this timeout is protecting against a problem
in the rare case, by making the common case slower, users will be very
inclined to decrease it.  Thus, it's worth looking at what happens in
the case where the timeout is long enough for multipathd to assemble
the device, and the case where it is not long enough.

When the timeout is long enough for multipathd to have enough time, your
proposed solution guarantees that you will always get Outcome 4 for
class 4B and Outcome 4 for class 4C.

When the timeout is not long enough, Your solution guarantees that you
will get outcome 3 for 4B devices, and either Outcome 1 or Outcome 3 for
4C devices.  However, there is a difference in the 4C case from the
current RedHat solution.  By claiming the path device until the timeout,
you keep the other users from being able to assemble on it, and you give
the addtional paths more time to appear. If your timeout isn't long
enough for multipathd to finish assembling the device, it's very likely
that multipathd is close to being finished to assembling the device.
This means that you make Outcome 3 more likely and Outcome 1 less
likely.

Now let me try to explain my proposed solution a little better than I
did last time. First the rationale. Class 4B and 4C devices are so much
rarer than class 4A devices, that it's not worth slowing down 4A
processing unless we absolutely need to, to avoid the worst case
outcomes for 4B and 4C. Also, for class 4B devices, Outcome 3 and
Outcome 4 are essentially identical to the user.  This means that the
only case where the current RedHat solution is not essentially optimal
is for class 4C devices. Outcome 4 for class 4C devices is what you
called "Nice-to-have", and that's how I feel about it as well. I'm
perfectly fine with Outcome 1 if that's all it takes to make the common
case work as well as possible. The only thing I want to avoid is Outcome
2 and 3. Outcome 2 we already avoid, and Outcome 3 is very rare.  But
by using timeouts, we can make it even rarer, without effecting the
processing of 4A devices at all.

My solution idea is basically a mirror of yours.

At a high level, your solution is:
When you see a "maybe" device, assume it's a "yes" and claim it so that
nothing else can use the device. Then, set a timeout for multipathd to
make use of the device. If that timeout passes, and multipathd hasn't
used the device, go back and unclaim the device so that it's in the
correct state. Then, if something else should use the device, it can.

At a high level, my solution is:
When you see a "maybe" device, assume it's a "no" and don't claim it.
Also, disallow multipathd from using the device. Then, set a timeout for
other things to make use of the device.  When that timeout passes,
mutipathd is no longer disallowed from using that device, so that if
mutipathd should use the device, it can. If multipathd uses the device,
go back and claim the device, so it's in the correct state.

The advantage of your method is that, as long as the timeout is long
enough, you always do the correct thing with multipath devices. The
disadvantage is that the timeout slows down the common case, to make the
rare case correct.

The advantage of my method is that it only slows down the rare case. The
disadvantage is that it will not get the "Nice-to-have" outcome in the
rare case.

I'm working on coding up my solution, which includes a number of the
patches from your solution, but I'm leaving tomorrow for a week of
meetings and conferences, so it might be a little bit it coming.

-Ben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-21  3:21       ` Benjamin Marzinski
@ 2018-01-22 21:56         ` Martin Wilck
  2018-01-25 13:40           ` Benjamin Marzinski
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Wilck @ 2018-01-22 21:56 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel, Xose Vazquez Perez

Hi Ben,

I agree with most of your analysis, I've added some replies below. 

But I'd like to discuss something else first.

I'd like to *simplify the configuration*, and exclude configurations
that make no sense. Before my commits 64e27e and ffbb88 last year,
there were 3 settings related to path detection: find_multipaths, -i
for multipath (ignore_wwids), and -n for multipathd (ignore_new_devs).
This adds up to 8 combinations, which I denote "fin", "FiN", etc. in
the following, using upper case for "on" and lower case for "off".

The SUSE default setup is "fIn", and the Red Hat / Ubuntu one is "Fin".
In initramfs, Red hat is effectively using "FiN" ("multipathd -n" isn't
used, but strict blacklisting is used to the same effect).

My patch es 64e27e and ffbb88 forced F=>N and F=>i, thus "FiN" became
the only combination with find_multipaths, leaving 5 valid
combinations. My recent RFC series allows only "xiN" and "xIn"
combinations for consistency reasons. But I can see this doesn't fit
the way Red Hat and others are setting up multipath, thus we need
something different.

I wonder if we can agree that the combinations "fIN", "FIN", and "fin"
are useless. "IN" combinations are really dangerous and can lead to the
fatal outcomes 4A.2, 4B.2, 4C.2 from your analysis; they shouldn't be
allowed. "fin" is similar to "Fin" at first sight, but without the
protection of "find_multipaths", it becomes much more likely that a
device that multipath hasn't claimed is claimed by multipathd later, I
think we should disallow it as well, although it's the current upstream
default. Moreover, "fiN" and "FiN" are equivalent: if new devices are
completely ignored, "find_multipaths yes" has no effect.

If we agree on that, I'd like to propose a new configuration scheme. As
in my RFC series, I'd like to replace the command line options with
config file options (**). For backward compatibility reasons, I propose
to use the "find_multipaths" option, but with 4 rather than 2 possible
values:

 - find_multipaths "no": fIn, current SUSE default
 - find_multipaths "yes": Fin, current Red Hat / Ubuntu default
 - find_multipaths "strict": fiN/FiN, use only known WWIDs 
 - find_multipaths "auto": FIn, try to be smart; this is what we've
been discussing.

Having limited the path detection options to reasonable combinations,
we can add more logic to improve the "auto" case, one way or the other.

[(**) "multipath -u -i" might still be allowed for purposes like you
pointed out for anaconda, or interactive querying. It would override
"ignore_wwids" for the "yes" and "strict" cases.]

Now my reply to your mail.

On Sat, 2018-01-20 at 21:21 -0600, Benjamin Marzinski wrote:
> I apologize in advance for how long this is.

It has to be, it's complex :-) Anyway I'll skip everything except 4C),
because we agree on the rest anyway.

> 4C: If in reality, the device should be multipathed but there is
> something else that also wants to use the device, there are four
> possible outcomes:
> 
>         1. The device is not claimed by multipath, and is not
>            multipathed
>         2. The device is claimed by multipath, but not multipathed
>         3. The device is not claimed by multipath, but is multipathed
>         4. The device is claimed by multipath and is multipathed
> 
> Outcome 1 is suboptimal, since the device really should be
> multipathed,
> but the system will still be usable (albeit, with only a single path
> to
> the storage).  However, this is fixable for future boots, by adding
> the
> wwid to the wwids file.

A common case is that users install without multipath, and convert the
system to using multipath later. That means dracut is run in a non-
multipathed system, where the wwids file doesn't contain the entries
for the root FS yet. That's a case which may lead to a fatal variant of
4C.3 later on. 

Along similar lines, it's essential for the Red Hat "multipath-
hostonly" approach that indeed no service in the initrd grabs devices
which might be multipathed later. If that happens, a fatal form of 4C.3
can occur. We see this often with BTRFS + subvolumes.

But initrd issues are out of scope for the current discussion, I guess.

> Outcome 2 is just as bad as Outcome 2 in class 4A. Of course, if the
> device is supposed to be multipathed, and is claimed by multipath, it
> is
> very likely that multipathd will assemble on it, so this is an
> extremely
> rare case.

Certainly. This is why "xIN" should be avoided (see above).

> Outcome 3 is the cause of the never actually observed bug I explained
> in
> an earlier eamil.

We did observe this, but the fatal cases where usually related to
initrd/root FS configuration inconsistencies (see above). But then,
SUSE is normally working with "fIn", where things are a little
different.

> [...]
>
> RedHat's current solution guarantees that you always get Outcome 1
> for
> 4A devices, Outcome 3 for 4B devices, and either Outcome 1 or Outcome
> 3
> for 4C devices (however in practice, 4C Outcome 3 has never been
> reported).
> 
> SUSE's "imply -n on find_multipaths" solution guarantees that you
> always
> get Outcome 1 for 4A devices, Outcome 1 for 4B devices, and Outcome 1
> for 4C devices.
> 
> Hopefully we agree on the above analysis. If you think I'm wrong in
> part
> of it, please let me know, because this is what I'm reasoning from.
> Now
> on to your and my proposed solutions.

All of this made sense to me. I made a similar write-up for myself.

> Your proposed solution guarantees that you always get Outcome 1 for
> 4A
> devices.
> 
> After that it gets a little trickier. Your solution involves a
> timeout,
> and that timeout can delay booting if there are 4A devices. Even if
> we
> do the equivalent of "multipath -n" in the initramfs, there are often
> still filesystems that need to mount after we switch-root. Those will
> get delayed, and the machine may not be usable until they are
> mounted. I
> really do feel that this will not be a rare case at all. You pointed
> out
> that this can be dealt with by decreasing the timeout, even all the
> way
> to 0.  I think that since this timeout is protecting against a
> problem
> in the rare case, by making the common case slower, users will be
> very
> inclined to decrease it.  Thus, it's worth looking at what happens in
> the case where the timeout is long enough for multipathd to assemble
> the device, and the case where it is not long enough.

Yes, the problem is that for large multipath installations and/or SANs
with slow device detection, the timeout has to be large to avoid "false
negatives"; but a large timeout would delay booting in inacceptible
ways for systems with single-path devices.

My idea how to solve this is to make the timeout configurable through
multipath.conf and hwtable, with extra logic to use a *very* small
timeout (1s or no waiting at all) if a device is not listed in the in
either hwtable or config file; thus the typical SAS or SATA devices of
non-multipath OS installations wouldn't be waited for. That should
address your main critique.

> My solution idea is basically a mirror of yours.
> 
> At a high level, your solution is:
> When you see a "maybe" device, assume it's a "yes" and claim it so
> that
> nothing else can use the device. Then, set a timeout for multipathd
> to
> make use of the device. If that timeout passes, and multipathd hasn't
> used the device, go back and unclaim the device so that it's in the
> correct state. Then, if something else should use the device, it can.
> 
> At a high level, my solution is:
> When you see a "maybe" device, assume it's a "no" and don't claim it.
> Also, disallow multipathd from using the device. Then, set a timeout
> for
> other things to make use of the device.  When that timeout passes,
> mutipathd is no longer disallowed from using that device, so that if
> mutipathd should use the device, it can. If multipathd uses the
> device,
> go back and claim the device, so it's in the correct state.

How would you disallow multipathd to use the device? By setting an udev
property? And why would you do it? Don't you agree that, as soon as a
second path is encountered, multipathd should be allowed to grab both?
Maybe I misunderstood, and multipathd will only be forbidden to use the
path as long as there's only one? But no, with "find_multipaths on",
multipathd wouldn't grab a single path anyway... I'm a bit confused.

Along similar lines as you argued about my approach, by delaying
multipathd's actions, you'd increase the probability of the suboptimal
outcome 1). And you're opening up the time window in which both
multipathd and other layers can grab the device, which may be not so
bad in practice as you say, but still bothers me for principal reasons.

Finally, as you said yourself, multipathd is likely to "loose the race"
anyway. With your patch you just make its chance even smaller. In a
way, d7188fc "multipathd: start daemon after udev trigger" already
implements your idea, because by the time multipathd starts, essential
device detection will be finished (with the exception of extremely slow
device detection where the udev queue runs empty).

> The advantage of your method is that, as long as the timeout is long
> enough, you always do the correct thing with multipath devices. The
> disadvantage is that the timeout slows down the common case, to make
> the
> rare case correct.

Would the idea with variable timeouts improve my approach in your eyes?

> The advantage of my method is that it only slows down the rare case.
> The
> disadvantage is that it will not get the "Nice-to-have" outcome in
> the
> rare case.
> 
> I'm working on coding up my solution, which includes a number of the
> patches from your solution, but I'm leaving tomorrow for a week of
> meetings and conferences, so it might be a little bit it coming.

Looking forward to it.

Btw, it just occured to me that your approach could be implemented in
exactly the way as mine. Basically, all we need to change is what udev
properties get set on the "maybe" uevents. Take my code, but don't set
SYSTEMD_READY=0 and DM_MULTIPATH_DEVICE_PATH=1 in the "maybe" case...
Should work, no? 

Cheers,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-22 21:56         ` Martin Wilck
@ 2018-01-25 13:40           ` Benjamin Marzinski
  2018-01-26 17:29             ` Martin Wilck
  0 siblings, 1 reply; 30+ messages in thread
From: Benjamin Marzinski @ 2018-01-25 13:40 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, Xose Vazquez Perez

On Mon, Jan 22, 2018 at 10:56:19PM +0100, Martin Wilck wrote:
> Hi Ben,
> 
> I agree with most of your analysis, I've added some replies below. 
> 
> But I'd like to discuss something else first.
> 
> I'd like to *simplify the configuration*, and exclude configurations
> that make no sense. Before my commits 64e27e and ffbb88 last year,
> there were 3 settings related to path detection: find_multipaths, -i
> for multipath (ignore_wwids), and -n for multipathd (ignore_new_devs).
> This adds up to 8 combinations, which I denote "fin", "FiN", etc. in
> the following, using upper case for "on" and lower case for "off".
> 
> The SUSE default setup is "fIn", and the Red Hat / Ubuntu one is "Fin".
> In initramfs, Red hat is effectively using "FiN" ("multipathd -n" isn't
> used, but strict blacklisting is used to the same effect).
> 
> My patch es 64e27e and ffbb88 forced F=>N and F=>i, thus "FiN" became
> the only combination with find_multipaths, leaving 5 valid
> combinations. My recent RFC series allows only "xiN" and "xIn"
> combinations for consistency reasons. But I can see this doesn't fit
> the way Red Hat and others are setting up multipath, thus we need
> something different.
> 
> I wonder if we can agree that the combinations "fIN", "FIN", and "fin"
> are useless. "IN" combinations are really dangerous and can lead to the
> fatal outcomes 4A.2, 4B.2, 4C.2 from your analysis; they shouldn't be
> allowed. "fin" is similar to "Fin" at first sight, but without the
> protection of "find_multipaths", it becomes much more likely that a
> device that multipath hasn't claimed is claimed by multipathd later, I
> think we should disallow it as well, although it's the current upstream
> default. Moreover, "fiN" and "FiN" are equivalent: if new devices are
> completely ignored, "find_multipaths yes" has no effect.

I'd be o.k with removing "fin" from upstream in favor of "fIn". My
analysis ingnores class 3 devices, on the assumption that "fIn" is
correct. If we do this, I will change "fIn" to "fin" in redhat's local
patches. Here's why. Outcome 2 is really bad. Imagine a
non-find-mutipaths equivalent of 4C Outcome 2 (3C Outcome 2?). The "I"
makes sure that multipath claims the device when it first appears, but
for some reason multipathd simply can't create a device on it.  This can
happen if multipath is not running in your initramfs, and something else
sets itself up on a path device.  When you switch-root, multipath will
claim the device, and mess everything up. But basically, this can happen
any time that multipathd fails to be able to set up on a device it has
claimed.  One way you can deal with most of these possibilites is to
require that multipathd has to set itself up on path at least once
before we claim it. That's exactly what "i" gives us.

Possibly another option is to check if something else is using the device
and to not claim the device in this case. That will solve the initramfs
case, with is the really bad one. It will still leave the case where
multipath will never be able to create the device for some other reason,
but that is almost always a configuration issues, where for instance
multipath should be blacklisting a whole class of devices that it isn't.

> If we agree on that, I'd like to propose a new configuration scheme. As
> in my RFC series, I'd like to replace the command line options with
> config file options (**). For backward compatibility reasons, I propose
> to use the "find_multipaths" option, but with 4 rather than 2 possible
> values:
> 
>  - find_multipaths "no": fIn, current SUSE default
>  - find_multipaths "yes": Fin, current Red Hat / Ubuntu default
>  - find_multipaths "strict": fiN/FiN, use only known WWIDs 
>  - find_multipaths "auto": FIn, try to be smart; this is what we've
> been discussing.

I would still like to put forward the code for my idea, but if we don't
find agreement on anything else, I would definitely accept this as
the upstream version (with the caveat that I really don't like ending up
in Outcome 2, and will make "no" be "fin" on RedHat if I can't convince
you to do that upstream).

Actually, AFAICS, FIn with being smart is not safe if you don't have
multipath in the initrd, and something grabs a device there. After the
switch-root, multipath will claim the device (even if just for a limited
time). If you are directly using this device (instead of through
LVM/MD), and it gets set to not ready in systemd, which can
automatically unmount the device.  If we can check if a path device is
in use before claiming it, that should solve this.

> Having limited the path detection options to reasonable combinations,
> we can add more logic to improve the "auto" case, one way or the other.
> 
> [(**) "multipath -u -i" might still be allowed for purposes like you
> pointed out for anaconda, or interactive querying. It would override
> "ignore_wwids" for the "yes" and "strict" cases.]
> 
> Now my reply to your mail.
> 
> On Sat, 2018-01-20 at 21:21 -0600, Benjamin Marzinski wrote:
> > I apologize in advance for how long this is.
> 
> It has to be, it's complex :-) Anyway I'll skip everything except 4C),
> because we agree on the rest anyway.
> 
> > 4C: If in reality, the device should be multipathed but there is
> > something else that also wants to use the device, there are four
> > possible outcomes:
> > 
> >         1. The device is not claimed by multipath, and is not
> >            multipathed
> >         2. The device is claimed by multipath, but not multipathed
> >         3. The device is not claimed by multipath, but is multipathed
> >         4. The device is claimed by multipath and is multipathed
> > 
> > Outcome 1 is suboptimal, since the device really should be
> > multipathed,
> > but the system will still be usable (albeit, with only a single path
> > to
> > the storage).  However, this is fixable for future boots, by adding
> > the
> > wwid to the wwids file.
> 
> A common case is that users install without multipath, and convert the
> system to using multipath later. That means dracut is run in a non-
> multipathed system, where the wwids file doesn't contain the entries
> for the root FS yet. That's a case which may lead to a fatal variant of
> 4C.3 later on. 

How? This outcome only happens in a "Fin" or "FiN" setup. You never
claim the device, because you never multipath the device. In this
sitution, multipath never changes the path device at all. If dracut is
run without multipath running, it will create an initramfs where
multipathd won't grab the devices (which I agree is what Outcome 1 is
all about).  This will mean that the other users grab the devices, which
means that Outcome 3 is pretty impossible, because the device is already
in use by something else.  The only thing that can grab a path device
device and have multipath grab it later is LVM on a whole device. This
is the specific case that reassign_maps is designed to handle. Even
without it, if multipathd created a device on top of (and later claimed)
the same paths that a LVM device is using, it would set the path devices
to not ready, not the LVM device.

> Along similar lines, it's essential for the Red Hat "multipath-
> hostonly" approach that indeed no service in the initrd grabs devices
> which might be multipathed later. If that happens, a fatal form of 4C.3
> can occur. We see this often with BTRFS + subvolumes.

Again, I don't understand how the case here works.  If something in the
initrd grabs the device, that will keep multipathd from assembling on
it. If LVM is already assembled, it shouldn't be hard to make multipath
notice this and not assemble even if LVM is on the whole device. As an
aside, I am personally very wary about reassign_maps. Multipath doesn't
own the other devices it is reloading. There is nothing to guarantee
that someone else isn't trying to modify the LVM device at the same
time. I don't know of a specific bug with this (we never turn it on),
but it seems very risky to start changing devices we don't own with no
coordination.

If I had to make a guess, I can definitely see how you could get into a
problem with the SUSE policy of "fIn".  In this case, multipathd doesn't
claim or grab the device in the initrd, so something else does. Then
after the switch-root, multipath will claim the device and multipathd
won't be able to assemble on it.  This is the dreaded Outcome 2, and
this is the reason I never use "I", even when find_multipaths is not
set.

On the other hand, it's quite possible that I'm just missing something,
and you can get to state 4C.3 (where mutipathd wins the race to assemble
on the device, even though it hasn't claimed it). Could you try to
explain how this happens a little more.

> But initrd issues are out of scope for the current discussion, I guess.
> 
> > Outcome 2 is just as bad as Outcome 2 in class 4A. Of course, if the
> > device is supposed to be multipathed, and is claimed by multipath, it
> > is
> > very likely that multipathd will assemble on it, so this is an
> > extremely
> > rare case.
> 
> Certainly. This is why "xIN" should be avoided (see above).

So, I could see Outcome 2 happening because of initrd issues. But I look
at this as a problem with using "I" in the udev rules.

> 
> > Outcome 3 is the cause of the never actually observed bug I explained
> > in
> > an earlier eamil.
> 
> We did observe this, but the fatal cases where usually related to
> initrd/root FS configuration inconsistencies (see above). But then,
> SUSE is normally working with "fIn", where things are a little
> different.

I still don't see how you get here instead of Outcome 2.

> > [...]
> >
> > RedHat's current solution guarantees that you always get Outcome 1
> > for
> > 4A devices, Outcome 3 for 4B devices, and either Outcome 1 or Outcome
> > 3
> > for 4C devices (however in practice, 4C Outcome 3 has never been
> > reported).
> > 
> > SUSE's "imply -n on find_multipaths" solution guarantees that you
> > always
> > get Outcome 1 for 4A devices, Outcome 1 for 4B devices, and Outcome 1
> > for 4C devices.
> > 
> > Hopefully we agree on the above analysis. If you think I'm wrong in
> > part
> > of it, please let me know, because this is what I'm reasoning from.
> > Now
> > on to your and my proposed solutions.
> 
> All of this made sense to me. I made a similar write-up for myself.
> 
> > Your proposed solution guarantees that you always get Outcome 1 for
> > 4A
> > devices.
> > 
> > After that it gets a little trickier. Your solution involves a
> > timeout,
> > and that timeout can delay booting if there are 4A devices. Even if
> > we
> > do the equivalent of "multipath -n" in the initramfs, there are often
> > still filesystems that need to mount after we switch-root. Those will
> > get delayed, and the machine may not be usable until they are
> > mounted. I
> > really do feel that this will not be a rare case at all. You pointed
> > out
> > that this can be dealt with by decreasing the timeout, even all the
> > way
> > to 0.  I think that since this timeout is protecting against a
> > problem
> > in the rare case, by making the common case slower, users will be
> > very
> > inclined to decrease it.  Thus, it's worth looking at what happens in
> > the case where the timeout is long enough for multipathd to assemble
> > the device, and the case where it is not long enough.
> 
> Yes, the problem is that for large multipath installations and/or SANs
> with slow device detection, the timeout has to be large to avoid "false
> negatives"; but a large timeout would delay booting in inacceptible
> ways for systems with single-path devices.
> 
> My idea how to solve this is to make the timeout configurable through
> multipath.conf and hwtable, with extra logic to use a *very* small
> timeout (1s or no waiting at all) if a device is not listed in the in
> either hwtable or config file; thus the typical SAS or SATA devices of
> non-multipath OS installations wouldn't be waited for. That should
> address your main critique.

I agree that this will make the boot problem much less likely. It would
still exist for SAN storage that isn't multipathed, but this is rarer
(although not incredibly rare).

> > My solution idea is basically a mirror of yours.
> > 
> > At a high level, your solution is:
> > When you see a "maybe" device, assume it's a "yes" and claim it so
> > that
> > nothing else can use the device. Then, set a timeout for multipathd
> > to
> > make use of the device. If that timeout passes, and multipathd hasn't
> > used the device, go back and unclaim the device so that it's in the
> > correct state. Then, if something else should use the device, it can.
> > 
> > At a high level, my solution is:
> > When you see a "maybe" device, assume it's a "no" and don't claim it.
> > Also, disallow multipathd from using the device. Then, set a timeout
> > for
> > other things to make use of the device.  When that timeout passes,
> > mutipathd is no longer disallowed from using that device, so that if
> > mutipathd should use the device, it can. If multipathd uses the
> > device,
> > go back and claim the device, so it's in the correct state.
> 
> How would you disallow multipathd to use the device? By setting an udev
> property?

No. I would do it by having should_multipath() return "maybe" and having
multipathd flag those paths and then ignore them, and set the path's
checker ticks for whatever timeout you choose. When that time expires,
the cherckerloop will assemble the map. It already does things like this
is different cases.  All the code changes are in multipath.

> And why would you do it? Don't you agree that, as soon as a
> second path is encountered, multipathd should be allowed to grab both?
> Maybe I misunderstood, and multipathd will only be forbidden to use the
> path as long as there's only one? But no, with "find_multipaths on",
> multipathd wouldn't grab a single path anyway... I'm a bit confused.

Well, in RedHat, multipathd currently grabs the device as soon as two
paths appear, and I am fine with keeping things that way.  But this idea
IS to keep multipathd from assembling on a device, even after the second
path appears (assuming it appears before the timeout). I'm doing this
for one purpose only, to make sure that I am always in case 4C.1 instead
of 4C.3.  I have never seen case 4C.3 happen, since in practice
multipathd always loses this race, which gives you case 4C.1.  Also, I
look at 4C.1 as only being different from the ideal case in that it
misses a nice-to-have feature. Since I have never seen 4C.3 happen, any
timeout, even 1 second, will only make it even more remote a chance.
And the only downside is that it takes a second longer to get to case
4B.3, which will only happen the first time you see new multipathed
storage.

> 
> Along similar lines as you argued about my approach, by delaying
> multipathd's actions, you'd increase the probability of the suboptimal
> outcome 1).

This is not only true. It is the whole point. 4C.3 is bad, 4C.1 is only
not ideal. Wasting a couple of seconds on creating a multipath device
the very rare case that you are seeing it for the first time, in a way
that doesn't slow anything else down, is a reasonable trade-off to make
sure you get the better outcome (4C.1).

> And you're opening up the time window in which both
> multipathd and other layers can grab the device, which may be not so
> bad in practice as you say, but still bothers me for principal reasons.

multipathd and others can't both grab the device (except in the case of
whole device LVs, and I'm fine with removing that ability as well). The
race that is really important to remove is multipath claiming the
device, and someone else using it. I do get that I am forcing a
non-optimal case, where you are trying to make sure that we are always
in the optimal one.  It's just that we only see these 4B and 4C devices
in rare cases where we are adding new multipath storage to the system,
and I don't want to slow the common case to make this better. I also
don't want to push any more complexity into the udev rules.

> Finally, as you said yourself, multipathd is likely to "loose the race"
> anyway. With your patch you just make its chance even smaller. In a
> way, d7188fc "multipathd: start daemon after udev trigger" already
> implements your idea, because by the time multipathd starts, essential
> device detection will be finished (with the exception of extremely slow
> device detection where the udev queue runs empty).

I don't worry about 4C.3 happening in our current RedHat setup. There
isn't a hard barrier that is keeping this from happening, but the timing
makes it very unlikely.  If we assume that it won't happen, then
RedHat's current implementation guarantees 4A.1, 4B.3, and 4C.1.  I'm
fine with those guarantees.  Problems like you mention above, which can
cause 4C.2 if you use "I", even in the non-find-multipaths case, make me
leary about using "I" in any setup. But I'm willing to switch the
non-find-multipaths case to "i" in a RedHat patch, if I am alone in this
concern.

> > The advantage of your method is that, as long as the timeout is long
> > enough, you always do the correct thing with multipath devices. The
> > disadvantage is that the timeout slows down the common case, to make
> > the
> > rare case correct.
> 
> Would the idea with variable timeouts improve my approach in your eyes?

Yes. It still will cause slowdowns on single-pathed SAN storage, but it
should fix the most common case.

> > The advantage of my method is that it only slows down the rare case.
> > The
> > disadvantage is that it will not get the "Nice-to-have" outcome in
> > the
> > rare case.
> > 
> > I'm working on coding up my solution, which includes a number of the
> > patches from your solution, but I'm leaving tomorrow for a week of
> > meetings and conferences, so it might be a little bit it coming.
> 
> Looking forward to it.

If nobody is worried about multipathd winning the race against other
device users, then 4C.3 is basically an impossible state, and there is
no point in adding an additional timeout to make an impossible state
less likely. In this case, there is no point in my solution. As far as
limiting the number of possible configurations. If we could agree that
"I" isn't safe when checking if multipath should claim a device in udev,
then there would be only 3 cases: fin, Fin, and FiN/fiN.  Like I said,
there two classes of problem where "I" causes problems: if the device is
already in use, and if multipathd simply can't set itself up on the
device.  If we check the path device is not being used before claiming
it, then FIn with being smart is also a safe case since it will solve
both of these. fIn with being smart is also safe. I simply don't believe
that fIn is safe without doing these extra steps to protect against
claiming devices that we shouldn't.

This would still allow 5 states, that would probably need 3 config
parameters

- (f)ind_multipaths
- (i)gnore_wwids (or "smart" or something else. I orginally called this
  mode "greedy")
- (n)o_new_devs

In this case, N would ignore f/F and i/I. Because we are protecting
against problems with "I", any of the other four states are valid.

> Btw, it just occured to me that your approach could be implemented in
> exactly the way as mine. Basically, all we need to change is what udev
> properties get set on the "maybe" uevents. Take my code, but don't set
> SYSTEMD_READY=0 and DM_MULTIPATH_DEVICE_PATH=1 in the "maybe" case...
> Should work, no? 

No. This would let nobody use the device. lvm won't scan devices in
SYSTEMD_READY=0 state, and they can't be mounted.  These are exactly
the things I am trying to allow.

-Ben

> Cheers,
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-25 13:40           ` Benjamin Marzinski
@ 2018-01-26 17:29             ` Martin Wilck
  2018-01-29 22:28               ` Benjamin Marzinski
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Wilck @ 2018-01-26 17:29 UTC (permalink / raw)
  To: dm-devel

On Thu, 2018-01-25 at 07:40 -0600, Benjamin Marzinski wrote:
> On Mon, Jan 22, 2018 at 10:56:19PM +0100, Martin Wilck wrote:

> > I'd like to *simplify the configuration*, and exclude
> > configurations
> > that make no sense. Before my commits 64e27e and ffbb88 last year,
> > there were 3 settings related to path detection: find_multipaths,
> > -i
> > for multipath (ignore_wwids), and -n for multipathd
> > (ignore_new_devs).
> > This adds up to 8 combinations, which I denote "fin", "FiN", etc.
> > in
> > the following, using upper case for "on" and lower case for "off".
> > 
> > The SUSE default setup is "fIn", and the Red Hat / Ubuntu one is
> > "Fin".
> > In initramfs, Red hat is effectively using "FiN" ("multipathd -n"
> > isn't
> > used, but strict blacklisting is used to the same effect).
> > 
> > My patch es 64e27e and ffbb88 forced F=>N and F=>i, thus "FiN"
> > became
> > the only combination with find_multipaths, leaving 5 valid
> > combinations. My recent RFC series allows only "xiN" and "xIn"
> > combinations for consistency reasons. But I can see this doesn't
> > fit
> > the way Red Hat and others are setting up multipath, thus we need
> > something different.
> > 
> > I wonder if we can agree that the combinations "fIN", "FIN", and
> > "fin"
> > are useless. "IN" combinations are really dangerous and can lead to
> > the
> > fatal outcomes 4A.2, 4B.2, 4C.2 from your analysis; they shouldn't
> > be
> > allowed. "fin" is similar to "Fin" at first sight, but without the
> > protection of "find_multipaths", it becomes much more likely that a
> > device that multipath hasn't claimed is claimed by multipathd
> > later, I
> > think we should disallow it as well, although it's the current
> > upstream
> > default. Moreover, "fiN" and "FiN" are equivalent: if new devices
> > are
> > completely ignored, "find_multipaths yes" has no effect.
> 
> I'd be o.k with removing "fin" from upstream in favor of "fIn". My
> analysis ingnores class 3 devices, on the assumption that "fIn" is
> correct. If we do this, I will change "fIn" to "fin" in redhat's
> local
> patches. Here's why. Outcome 2 is really bad. Imagine a
> non-find-mutipaths equivalent of 4C Outcome 2 (3C Outcome 2?). The
> "I"
> makes sure that multipath claims the device when it first appears,
> but
> for some reason multipathd simply can't create a device on it.  This
> can
> happen if multipath is not running in your initramfs, and something
> else
> sets itself up on a path device.  When you switch-root, multipath
> will
> claim the device, and mess everything up. But basically, this can
> happen
> any time that multipathd fails to be able to set up on a device it
> has
> claimed.  One way you can deal with most of these possibilites is to
> require that multipathd has to set itself up on path at least once
> before we claim it. That's exactly what "i" gives us.

I agree that this makes perfect sense in the "Fin" case, but for "fin",
if find_multipaths is off, it seems odd. "fXn" means that multipathd
claims everything it comes by, which pairs better with "I" than "i".

> Possibly another option is to check if something else is using the
> device
> and to not claim the device in this case. That will solve the
> initramfs
> case, with is the really bad one.

I agree it's bad, but AFAICS "i" only eliminates a part of the problem.

> It will still leave the case where
> multipath will never be able to create the device for some other
> reason,
> but that is almost always a configuration issues, where for instance
> multipath should be blacklisting a whole class of devices that it
> isn't.

Hannes and I have recently discussed an idea how to deal with this
situation. It's quite high on my todo list. The basic idea is to that
if multipathd fails to set up a device, it leaves a flag somewhere and
retrigger an uevent. When multipath sees the flag, it doesn't claim the
device next time.

> 
> If we agree on that, I'd like to propose a new configuration scheme.
> As
> > in my RFC series, I'd like to replace the command line options with
> > config file options (**). For backward compatibility reasons, I
> > propose
> > to use the "find_multipaths" option, but with 4 rather than 2
> > possible
> > values:
> > 
> >  - find_multipaths "no": fIn, current SUSE default
> >  - find_multipaths "yes": Fin, current Red Hat / Ubuntu default
> >  - find_multipaths "strict": fiN/FiN, use only known WWIDs 
> >  - find_multipaths "auto": FIn, try to be smart; this is what we've
> > been discussing.
> 
> I would still like to put forward the code for my idea, but if we
> don't
> find agreement on anything else, I would definitely accept this as
> the upstream version (with the caveat that I really don't like ending
> up
> in Outcome 2, and will make "no" be "fin" on RedHat if I can't
> convince
> you to do that upstream).

If you insist, we'll just let "fin" survive as, say, 'find_multipaths
"conservative"'. Or we call "fIn" "greedy" and "fin" "no".

> > A common case is that users install without multipath, and convert
> > the
> > system to using multipath later. That means dracut is run in a non-
> > multipathed system, where the wwids file doesn't contain the
> > entries
> > for the root FS yet. That's a case which may lead to a fatal
> > variant of
> > 4C.3 later on. 
> 
> How? This outcome only happens in a "Fin" or "FiN" setup. You never
> claim the device, because you never multipath the device. In this
> sitution, multipath never changes the path device at all. If dracut
> is
> run without multipath running, it will create an initramfs where
> multipathd won't grab the devices (which I agree is what Outcome 1 is
> all about).  This will mean that the other users grab the devices,
> which
> means that Outcome 3 is pretty impossible, because the device is
> already
> in use by something else. 

We had a case where the customer ran "dracut --add multipath", but
forgot to run "systemctl enable multipath.service". Because we use
"fIn", multipathd grabbed the device in the initrd, and afterwards the
root device's subvolumes couldn't be mounted, because systemd would try
to mount the members, which were locked by the multipath device.

>  The only thing that can grab a path device
> device and have multipath grab it later is LVM on a whole device.
> This
> is the specific case that reassign_maps is designed to handle. Even
> without it, if multipathd created a device on top of (and later
> claimed)
> the same paths that a LVM device is using, it would set the path
> devices
> to not ready, not the LVM device.
> 
> > Along similar lines, it's essential for the Red Hat "multipath-
> > hostonly" approach that indeed no service in the initrd grabs
> > devices
> > which might be multipathed later. If that happens, a fatal form of
> > 4C.3
> > can occur. We see this often with BTRFS + subvolumes.
> 
> Again, I don't understand how the case here works.  If something in
> the
> initrd grabs the device, that will keep multipathd from assembling on
> it.

Exactly. But if then after switch root, multipath claims the device,
and multipathd fails to grab it, we're hosed. The root device is
accessible but other file systems on the same device (e.g. subvolumes)
can't be mounted. Actually this rather 4C.2, sorry for mixing it up.

>  If LVM is already assembled, it shouldn't be hard to make multipath
> notice this and not assemble even if LVM is on the whole device.

Maybe not hard, but currently unimplemented :-/

>  As an
> aside, I am personally very wary about reassign_maps. Multipath
> doesn't
> own the other devices it is reloading. There is nothing to guarantee
> that someone else isn't trying to modify the LVM device at the same
> time. I don't know of a specific bug with this (we never turn it on),
> but it seems very risky to start changing devices we don't own with
> no
> coordination.

I've no experience with reassign_maps. It's tempting to think that it
could solve all these nasty issues, but my gut feeling, like yours,
says that it might be dangerous. We don't turn it on, either.

> If I had to make a guess, I can definitely see how you could get into
> a
> problem with the SUSE policy of "fIn".  In this case, multipathd
> doesn't
> claim or grab the device in the initrd, so something else does. Then
> after the switch-root, multipath will claim the device and multipathd
> won't be able to assemble on it.  This is the dreaded Outcome 2, and
> this is the reason I never use "I", even when find_multipaths is not
> set.

Ah, right the situation that I described above. I think we're mostly on
the same boat here now.

> [...]
> > 
> > Finally, as you said yourself, multipathd is likely to "loose the
> > race"
> > anyway. With your patch you just make its chance even smaller. In a
> > way, d7188fc "multipathd: start daemon after udev trigger" already
> > implements your idea, because by the time multipathd starts,
> > essential
> > device detection will be finished (with the exception of extremely
> > slow
> > device detection where the udev queue runs empty).
> 
> I don't worry about 4C.3 happening in our current RedHat setup. There
> isn't a hard barrier that is keeping this from happening, but the
> timing
> makes it very unlikely.  If we assume that it won't happen, then
> RedHat's current implementation guarantees 4A.1, 4B.3, and 4C.1.  I'm
> fine with those guarantees.

I agree your approach can't hurt, although I still think it will just
make a very unlikely outcome even more unlikely (the 4C.3 case
described above is obviously very special, and your patch wouldn't
avoid it). I hope that our mutual ideas can go together.

> 
>   Problems like you mention above, which can
> cause 4C.2 if you use "I", even in the non-find-multipaths case, make
> me
> leary about using "I" in any setup. But I'm willing to switch the
> non-find-multipaths case to "i" in a RedHat patch, if I am alone in
> this
> concern.

That won't be necessary, see above.

> > > The advantage of your method is that, as long as the timeout is
> > > long
> > > enough, you always do the correct thing with multipath devices.
> > > The
> > > disadvantage is that the timeout slows down the common case, to
> > > make
> > > the
> > > rare case correct.
> > 
> > Would the idea with variable timeouts improve my approach in your
> > eyes?
> 
> Yes. It still will cause slowdowns on single-pathed SAN storage, but
> it
> should fix the most common case.

Great.

> If nobody is worried about multipathd winning the race against other
> device users, then 4C.3 is basically an impossible state, and there
> is
> no point in adding an additional timeout to make an impossible state
> less likely. In this case, there is no point in my solution. As far
> as
> limiting the number of possible configurations. If we could agree
> that
> "I" isn't safe when checking if multipath should claim a device in
> udev,
> then there would be only 3 cases: fin, Fin, and FiN/fiN.  Like I
> said,
> there two classes of problem where "I" causes problems: if the device
> is
> already in use, and if multipathd simply can't set itself up on the
> device.  If we check the path device is not being used before
> claiming
> it, then FIn with being smart is also a safe case since it will solve
> both of these. fIn with being smart is also safe. I simply don't
> believe
> that fIn is safe without doing these extra steps to protect against
> claiming devices that we shouldn't.

You don't have to use it, but please let's keep the "fIn" option
around. Our customers are used to this behavior. I don't deny that it
has caused some problems (usually related to mishandling one way or the
other), but we're not ready to give it up. We'll be working on
improving it. Dropping it upstream would hurt us.

> 
> This would still allow 5 states, that would probably need 3 config
> parameters
> 
> - (f)ind_multipaths
> - (i)gnore_wwids (or "smart" or something else. I orginally called
> this
>   mode "greedy")
> - (n)o_new_devs

I'd really like to pursue the idea to hide all of this in a single
option with multiple possible values, rather than providing several
options but disallowing certain combinations thereof. That was the main
point I was trying to make in my previous email.

> 
> In this case, N would ignore f/F and i/I. Because we are protecting
> against problems with "I", any of the other four states are valid.
> 
> > Btw, it just occured to me that your approach could be implemented
> > in
> > exactly the way as mine. Basically, all we need to change is what
> > udev
> > properties get set on the "maybe" uevents. Take my code, but don't
> > set
> > SYSTEMD_READY=0 and DM_MULTIPATH_DEVICE_PATH=1 in the "maybe"
> > case...
> > Should work, no? 
> 
> No. This would let nobody use the device. lvm won't scan devices in
> SYSTEMD_READY=0 state, and they can't be mounted.  These are exactly
> the things I am trying to allow.

That's why I said *don't* set SYSTEMD_READY=0 :-) ... but never mind,
you were thinking of a different solution anyway.

Regards,
Martin


-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-26 17:29             ` Martin Wilck
@ 2018-01-29 22:28               ` Benjamin Marzinski
  2018-01-30 13:07                 ` Martin Wilck
  0 siblings, 1 reply; 30+ messages in thread
From: Benjamin Marzinski @ 2018-01-29 22:28 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Fri, Jan 26, 2018 at 06:29:04PM +0100, Martin Wilck wrote:
> On Thu, 2018-01-25 at 07:40 -0600, Benjamin Marzinski wrote:
> > On Mon, Jan 22, 2018 at 10:56:19PM +0100, Martin Wilck wrote:
> > I'd be o.k with removing "fin" from upstream in favor of "fIn". My
> > analysis ingnores class 3 devices, on the assumption that "fIn" is
> > correct. If we do this, I will change "fIn" to "fin" in redhat's
> > local
> > patches. Here's why. Outcome 2 is really bad. Imagine a
> > non-find-mutipaths equivalent of 4C Outcome 2 (3C Outcome 2?). The
> > "I"
> > makes sure that multipath claims the device when it first appears,
> > but
> > for some reason multipathd simply can't create a device on it.  This
> > can
> > happen if multipath is not running in your initramfs, and something
> > else
> > sets itself up on a path device.  When you switch-root, multipath
> > will
> > claim the device, and mess everything up. But basically, this can
> > happen
> > any time that multipathd fails to be able to set up on a device it
> > has
> > claimed.  One way you can deal with most of these possibilites is to
> > require that multipathd has to set itself up on path at least once
> > before we claim it. That's exactly what "i" gives us.
> 
> I agree that this makes perfect sense in the "Fin" case, but for "fin",
> if find_multipaths is off, it seems odd. "fXn" means that multipathd
> claims everything it comes by, which pairs better with "I" than "i".
> 
> > Possibly another option is to check if something else is using the
> > device
> > and to not claim the device in this case. That will solve the
> > initramfs
> > case, with is the really bad one.
> 
> I agree it's bad, but AFAICS "i" only eliminates a part of the problem.

I agree "i" isn't a perfect solution.  But adding code to never claim a
device that is currently in use will help things, regardless of any
other changes we do or whatever mode we are in.  If we are in agreement
that reassign_maps isn't safe, then there's never a good reason for
multipathd to set up on a path that is already in use. If we don't do
that, then we should never claim these paths either. 
 
> > It will still leave the case where
> > multipath will never be able to create the device for some other
> > reason,
> > but that is almost always a configuration issues, where for instance
> > multipath should be blacklisting a whole class of devices that it
> > isn't.
> 
> Hannes and I have recently discussed an idea how to deal with this
> situation. It's quite high on my todo list. The basic idea is to that
> if multipathd fails to set up a device, it leaves a flag somewhere and
> retrigger an uevent. When multipath sees the flag, it doesn't claim the
> device next time.

How is this different from the pathes you just wrote to deal with the
find_multipaths case? It claims the path device immediately, sets a
timer, and if multipathd doesn't set itself up within that time, it
unclaims the path.  The only addition you need is for multipathd to
trigger that change event for devices that it fails set up on, so that
you don't have to wait for the timeout. You still do need a backup
timeout to deal with the case where mutipathd doesn't start up
correctly.

> 
> If you insist, we'll just let "fin" survive as, say, 'find_multipaths
> "conservative"'. Or we call "fIn" "greedy" and "fin" "no".

I'm fine with this. But, assuming you agree with my two above comments,
then I'm not sure why you don't like the five options outlined in my
last email: 

fin like we have now
Fin like we have now
fIn with the ability to go back and unclaim the device
Fin with the ability to go back and unclaim the device
xxN where f/F and i/I are ignored

* all with the additional check that we don't claim or assemble on paths
that are currently in use.

I really don't care how we expose these options to the user. I was just
pointing out that if we used "find_multipaths", "ignore_wwids", and
"no_new_devs", where "ignore_wwids" implied the ability to go back and
unclaim the device, then every combination of the three is valid. But
whether we use one config option with 5 keywords, or three config
options that are boolean, or whatever, is fine by me.
 
> We had a case where the customer ran "dracut --add multipath", but
> forgot to run "systemctl enable multipath.service". Because we use
> "fIn", multipathd grabbed the device in the initrd, and afterwards the
> root device's subvolumes couldn't be mounted, because systemd would try
> to mount the members, which were locked by the multipath device.

So, I'm still a little fuzzy on what happened here. Are you saying that
this case was also an instance of 4C.2 after all? If not that, was the
problem that multipath was assembled on the path device, but not having
multipathd enabled made "multipath -u" not claim the device after the
switch-root, making it appear available for others to use.  Perhaps in
that case, if we fail the check to see if multipathd is running/wanted
in "multipath -u", we should still check if the path device is
multipathed already, and if it is, we should claim it regardless,
because nobody else will be able to.
 
> >  As an
> > aside, I am personally very wary about reassign_maps. Multipath
> > doesn't
> > own the other devices it is reloading. There is nothing to guarantee
> > that someone else isn't trying to modify the LVM device at the same
> > time. I don't know of a specific bug with this (we never turn it on),
> > but it seems very risky to start changing devices we don't own with
> > no
> > coordination.
> 
> I've no experience with reassign_maps. It's tempting to think that it
> could solve all these nasty issues, but my gut feeling, like yours,
> says that it might be dangerous. We don't turn it on, either.
> 

If neither of us find this safe, we should deprecate it at the least,
and probably disable it and print a warning message instead, if it is
set in multipath.conf.

> > > 
> > > Finally, as you said yourself, multipathd is likely to "loose the
> > > race"
> > > anyway. With your patch you just make its chance even smaller. In a
> > > way, d7188fc "multipathd: start daemon after udev trigger" already
> > > implements your idea, because by the time multipathd starts,
> > > essential
> > > device detection will be finished (with the exception of extremely
> > > slow
> > > device detection where the udev queue runs empty).
> > 
> > I don't worry about 4C.3 happening in our current RedHat setup. There
> > isn't a hard barrier that is keeping this from happening, but the
> > timing
> > makes it very unlikely.  If we assume that it won't happen, then
> > RedHat's current implementation guarantees 4A.1, 4B.3, and 4C.1.  I'm
> > fine with those guarantees.
> 
> I agree your approach can't hurt, although I still think it will just
> make a very unlikely outcome even more unlikely (the 4C.3 case
> described above is obviously very special, and your patch wouldn't
> avoid it). I hope that our mutual ideas can go together.
> 

If we can cut down the cases where we end up waiting, I'm fine with
dropping my idea. I had some discussions this week, and people do
recognize that there are problems with the existing device assembling
code in udev. I'm now pretty confident that sooner or later, we will
have the ability to find about how a device was labelled in the past,
even between boots.  With that ability, we can easily set a timeout on
devices that haven't been used by anything in the past, and on devices
that have been used by something other than multipath we can skip the
timeout, since we expect that the device will still be
non-multipathable. This would mean that we only ever have timeouts in
the rare case when new storage is added. Until we have these abilities,
I would be fine with continuing the way that RedHat currently does
things.

> 
> You don't have to use it, but please let's keep the "fIn" option
> around. Our customers are used to this behavior. I don't deny that it
> has caused some problems (usually related to mishandling one way or the
> other), but we're not ready to give it up. We'll be working on
> improving it. Dropping it upstream would hurt us.
> 

I'm fine with you keeping it exactly as it is, if you want, but if you
are planning on improving it using the same sorts of ideas as you are
doing for the "FIn" case, then why not just use the same code?


> That's why I said *don't* set SYSTEMD_READY=0 :-)

Oops. I mis-read that.

-Ben

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-29 22:28               ` Benjamin Marzinski
@ 2018-01-30 13:07                 ` Martin Wilck
  2018-01-30 23:40                   ` Benjamin Marzinski
  0 siblings, 1 reply; 30+ messages in thread
From: Martin Wilck @ 2018-01-30 13:07 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Mon, 2018-01-29 at 16:28 -0600, Benjamin Marzinski wrote:
> On Fri, Jan 26, 2018 at 06:29:04PM +0100, Martin Wilck wrote:
> > On Thu, 2018-01-25 at 07:40 -0600, Benjamin Marzinski wrote:
> > > 
> > > Possibly another option is to check if something else is using
> > > the
> > > device
> > > and to not claim the device in this case. That will solve the
> > > initramfs
> > > case, with is the really bad one.
> > 
> > I agree it's bad, but AFAICS "i" only eliminates a part of the
> > problem.
> 
> I agree "i" isn't a perfect solution.  But adding code to never claim
> a
> device that is currently in use will help things, regardless of any
> other changes we do or whatever mode we are in.  If we are in
> agreement
> that reassign_maps isn't safe, then there's never a good reason for
> multipathd to set up on a path that is already in use. If we don't do
> that, then we should never claim these paths either. 

If a path is in use, multipathd fails to claim it. I've been pondering
to simply try "open(device, O_EXCL)" to test for "is in use". 
If the open() fails, the problem is to determine whether the device is 
permanently in use (e.g. already mounted or claimed by MD) or only
temporarily (because some other process is probing it at the same time
as us). Retrying, at least over more than a minimal time span, is out
of the question. Maybe multipathd could try to claim a single-path map
(if should_multipath is true), and just try the open() call in the case
of failure, to test whether the reason for the failure was that the
path was in use.

Btw, I've been wondering what the additional "in use" logic based on
lock_multipath()/flock() is useful for these days. We can either grab a
path (which comes down to open(..., O_EXCL)), or we can't. It seems odd
to have an additional, not fully equivalent "in use" logic
(lock_multipath() success doesn't imply subsequent O_EXCL succcess!). 

IMO we should determine if we're supposed to grab the path
(should_multipath(), using the best heuristics we can), and if yes, we
should immediately do so without doing other checks first. If that
fails, we might either retry or fall back. Once we have the O_EXCL fd,
we may  also (try to) flock() the paths to notify other processes using
flock(), but that would be fully optional.

>  
> > > It will still leave the case where
> > > multipath will never be able to create the device for some other
> > > reason,
> > > but that is almost always a configuration issues, where for
> > > instance
> > > multipath should be blacklisting a whole class of devices that it
> > > isn't.
> > 
> > Hannes and I have recently discussed an idea how to deal with this
> > situation. It's quite high on my todo list. The basic idea is to
> > that
> > if multipathd fails to set up a device, it leaves a flag somewhere
> > and
> > retrigger an uevent. When multipath sees the flag, it doesn't claim
> > the
> > device next time.
> 
> How is this different from the pathes you just wrote to deal with the
> find_multipaths case? It claims the path device immediately, sets a
> timer, and if multipathd doesn't set itself up within that time, it
> unclaims the path.
>   The only addition you need is for multipathd to
> trigger that change event for devices that it fails set up on, so
> that
> you don't have to wait for the timeout. You still do need a backup

That's the idea. But a flag is necessary that multipath -u will see.

> timeout to deal with the case where mutipathd doesn't start up
> correctly.

Hm. IMO we have to assume that multipathd will start up. If it doesn't,
all the schemes we've discussed so far would fail, even strict fiN. But
you've just hinted that my "retry timer" idea might actually be a
solution to this problem :-)

> 
> > 
> > If you insist, we'll just let "fin" survive as, say,
> > 'find_multipaths
> > "conservative"'. Or we call "fIn" "greedy" and "fin" "no".
> 
> I'm fine with this. But, assuming you agree with my two above
> comments,
> then I'm not sure why you don't like the five options outlined in my
> last email: 
> 
> fin like we have now
> Fin like we have now
> fIn with the ability to go back and unclaim the device
> Fin with the ability to go back and unclaim the device
> xxN where f/F and i/I are ignored

I think we should differentiate "what we try to claim" and "what we do
if we fail". The ability to go back in the error case is highly
desirable, but orthogonal to the options which describe the "try to"
side. Luckily, for "go back and unclaim" we need no option - it seems
obvious that failure recovery shouldn't be disabled (assuming we get
this right).

> * all with the additional check that we don't claim or assemble on
> paths
> that are currently in use.
> 
> I really don't care how we expose these options to the user. I was
> just
> pointing out that if we used "find_multipaths", "ignore_wwids", and
> "no_new_devs", where "ignore_wwids" implied the ability to go back
> and
> unclaim the device, then every combination of the three is valid. But
> whether we use one config option with 5 keywords, or three config
> options that are boolean, or whatever, is fine by me.

I think one config option is far better, be it only for the sake of
documenting it. I hate stuff like "option X does Y if set to Z, but
only if option A is set to B, ...". It's bad for readers and even worse
for writers.

>  
> > We had a case where the customer ran "dracut --add multipath", but
> > forgot to run "systemctl enable multipath.service". Because we use
> > "fIn", multipathd grabbed the device in the initrd, and afterwards
> > the
> > root device's subvolumes couldn't be mounted, because systemd would
> > try
> > to mount the members, which were locked by the multipath device.
> 
> So, I'm still a little fuzzy on what happened here. Are you saying
> that
> this case was also an instance of 4C.2 after all? If not that, was
> the
> problem that multipath was assembled on the path device, but not
> having
> multipathd enabled made "multipath -u" not claim the device after the
> switch-root, making it appear available for others to use. 

Exactly. systemd would have been able to mount the other files systems
through the mpath device, but the path devices show up earlier during
"coldplug", and once the mount unit is in FAILED state, systemd heads
towards emergency.

>  Perhaps in
> that case, if we fail the check to see if multipathd is
> running/wanted
> in "multipath -u", we should still check if the path device is
> multipathed already, and if it is, we should claim it regardless,
> because nobody else will be able to.

Yes, that might have helped in this failure scenario.

>  
> > >  As an
> > > aside, I am personally very wary about reassign_maps. Multipath
> > > doesn't
> > > own the other devices it is reloading. There is nothing to
> > > guarantee
> > > that someone else isn't trying to modify the LVM device at the
> > > same
> > > time. I don't know of a specific bug with this (we never turn it
> > > on),
> > > but it seems very risky to start changing devices we don't own
> > > with
> > > no
> > > coordination.
> > 
> > I've no experience with reassign_maps. It's tempting to think that
> > it
> > could solve all these nasty issues, but my gut feeling, like yours,
> > says that it might be dangerous. We don't turn it on, either.
> > 
> 
> If neither of us find this safe, we should deprecate it at the least,
> and probably disable it and print a warning message instead, if it is
> set in multipath.conf.

I'm unsure. I don't have the experience to say it does work or it
doesn't.

@ALL: (if anyone except Ben read this far :-), can you contribute
experience with reassign_maps?

> > I agree your approach can't hurt, although I still think it will
> > just
> > make a very unlikely outcome even more unlikely (the 4C.3 case
> > described above is obviously very special, and your patch wouldn't
> > avoid it). I hope that our mutual ideas can go together.
> > 
> 
> If we can cut down the cases where we end up waiting, I'm fine with
> dropping my idea. I had some discussions this week, and people do
> recognize that there are problems with the existing device assembling
> code in udev. I'm now pretty confident that sooner or later, we will
> have the ability to find about how a device was labelled in the past,
> even between boots. 

I'd be interested to learn more about this. Do you have a reference?

Cheers,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic
  2018-01-30 13:07                 ` Martin Wilck
@ 2018-01-30 23:40                   ` Benjamin Marzinski
  0 siblings, 0 replies; 30+ messages in thread
From: Benjamin Marzinski @ 2018-01-30 23:40 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Tue, Jan 30, 2018 at 02:07:57PM +0100, Martin Wilck wrote:
> On Mon, 2018-01-29 at 16:28 -0600, Benjamin Marzinski wrote:
> > On Fri, Jan 26, 2018 at 06:29:04PM +0100, Martin Wilck wrote:
> > > On Thu, 2018-01-25 at 07:40 -0600, Benjamin Marzinski wrote:
> > > > 
> > > > Possibly another option is to check if something else is using
> > > > the
> > > > device
> > > > and to not claim the device in this case. That will solve the
> > > > initramfs
> > > > case, with is the really bad one.
> > > 
> > > I agree it's bad, but AFAICS "i" only eliminates a part of the
> > > problem.
> > 
> > I agree "i" isn't a perfect solution.  But adding code to never claim
> > a
> > device that is currently in use will help things, regardless of any
> > other changes we do or whatever mode we are in.  If we are in
> > agreement
> > that reassign_maps isn't safe, then there's never a good reason for
> > multipathd to set up on a path that is already in use. If we don't do
> > that, then we should never claim these paths either. 
> 
> If a path is in use, multipathd fails to claim it. I've been pondering
> to simply try "open(device, O_EXCL)" to test for "is in use". 
> If the open() fails, the problem is to determine whether the device is 
> permanently in use (e.g. already mounted or claimed by MD) or only
> temporarily (because some other process is probing it at the same time
> as us). Retrying, at least over more than a minimal time span, is out
> of the question. Maybe multipathd could try to claim a single-path map
> (if should_multipath is true), and just try the open() call in the case
> of failure, to test whether the reason for the failure was that the
> path was in use.

I was hoping that there was a better way to find this out than actually
going through with an exlcusive open, but I don't see one. I'm not a big
fan of exclusively opening every multipathable device on every event,
even if its just for a second. I'm worried that this could make things
like mounting a device fail, because an event happened on the device,
and multipath was holding it open exclusively at the same time that
mount tried to grab it exclusively.

To speak to your worry, I can't find too much that does hold a device
open exclusively for a short period of time. I wrote this systemtap
script to check 

******

probe kernel.function("blkdev_get") {
	if ($mode & 0x80) {
		printf("%s(%d) opened %d:%d EXCLUSIVELY with %x\n", execname(), pid(), MAJOR($bdev->bd_dev), MINOR($bdev->bd_dev), $mode)
	}
	else {
		printf("%s(%d) opened %d:%d with %x\n", execname(), pid(), MAJOR($bdev->bd_dev), MINOR($bdev->bd_dev), $mode)
	}
}

*****

after discovering, mounting, and trying other commands with some devices,
the only things that I found which do temporary exclusive opens are lvm
commands like (pv|vg|lv)(create|remove). But I'm sure I missed other
cases.

Without the ability to not claim devices that are currently being used,
"FIn" can easily run into problems. Imagine multipath claiming a single
path device, and then timing out and unclaiming it, and having the
device get mounted. Later another path shows up. Multipath will claim
that one too, but multipathd will fail to assemble on it, since the
other path device is in use.  However the next time an uevent occurs for
the mounted path device, multipath will claim it again since it does
have two paths (right?), setting it and its partitions to not ready.
This can cause systemd to auto-unmount them.

Possibly, multipath could only use "FIn" criteria the first time it
claims a device, and after that, switch back to "Fin" criteria. That
would solve this case, although there might still be a problem after
switching from the initramfs. 
 
> Btw, I've been wondering what the additional "in use" logic based on
> lock_multipath()/flock() is useful for these days. We can either grab a
> path (which comes down to open(..., O_EXCL)), or we can't. It seems odd
> to have an additional, not fully equivalent "in use" logic
> (lock_multipath() success doesn't imply subsequent O_EXCL succcess!). 
> 
> IMO we should determine if we're supposed to grab the path
> (should_multipath(), using the best heuristics we can), and if yes, we
> should immediately do so without doing other checks first. If that
> fails, we might either retry or fall back. Once we have the O_EXCL fd,
> we may  also (try to) flock() the paths to notify other processes using
> flock(), but that would be fully optional.

I don't know what the lock_multipath code is protecting us from. It was
originally to protect multiple multipath commands (or multipath and
multipathd) from both trying to create the same device at the same time.
However, after Hannes changed this to a shared lock like udev uses, I'm
not sure what it protects us from.  However, I didn't object to keeping
it as a shared lock on the grounds that udev is clearly doing this to
protect against something, and by doing this, we are being protected
against the same thing, whatever that is.  But I'm not sure if it is
necessary at all. 

> >  
> > > > It will still leave the case where
> > > > multipath will never be able to create the device for some other
> > > > reason,
> > > > but that is almost always a configuration issues, where for
> > > > instance
> > > > multipath should be blacklisting a whole class of devices that it
> > > > isn't.
> > > 
> > > Hannes and I have recently discussed an idea how to deal with this
> > > situation. It's quite high on my todo list. The basic idea is to
> > > that
> > > if multipathd fails to set up a device, it leaves a flag somewhere
> > > and
> > > retrigger an uevent. When multipath sees the flag, it doesn't claim
> > > the
> > > device next time.
> > 
> > How is this different from the pathes you just wrote to deal with the
> > find_multipaths case? It claims the path device immediately, sets a
> > timer, and if multipathd doesn't set itself up within that time, it
> > unclaims the path.
> >   The only addition you need is for multipathd to
> > trigger that change event for devices that it fails set up on, so
> > that
> > you don't have to wait for the timeout. You still do need a backup
> 
> That's the idea. But a flag is necessary that multipath -u will see.
> 
> > timeout to deal with the case where mutipathd doesn't start up
> > correctly.
> 
> Hm. IMO we have to assume that multipathd will start up. If it doesn't,
> all the schemes we've discussed so far would fail, even strict fiN. But
> you've just hinted that my "retry timer" idea might actually be a
> solution to this problem :-)

That's my point. Your timer code should work just fine for this.
 
> > 
> > > 
> > > If you insist, we'll just let "fin" survive as, say,
> > > 'find_multipaths
> > > "conservative"'. Or we call "fIn" "greedy" and "fin" "no".
> > 
> > I'm fine with this. But, assuming you agree with my two above
> > comments,
> > then I'm not sure why you don't like the five options outlined in my
> > last email: 
> > 
> > fin like we have now
> > Fin like we have now
> > fIn with the ability to go back and unclaim the device
> > Fin with the ability to go back and unclaim the device
> > xxN where f/F and i/I are ignored
> 
> I think we should differentiate "what we try to claim" and "what we do
> if we fail". The ability to go back in the error case is highly
> desirable, but orthogonal to the options which describe the "try to"
> side. Luckily, for "go back and unclaim" we need no option - it seems
> obvious that failure recovery shouldn't be disabled (assuming we get
> this right).

Fair enough. We could back out claiming a device, even if it's in the
wwids file, if multipathd fails to assemble on the device.
 
> > * all with the additional check that we don't claim or assemble on
> > paths
> > that are currently in use.
> > 
> > I really don't care how we expose these options to the user. I was
> > just
> > pointing out that if we used "find_multipaths", "ignore_wwids", and
> > "no_new_devs", where "ignore_wwids" implied the ability to go back
> > and
> > unclaim the device, then every combination of the three is valid. But
> > whether we use one config option with 5 keywords, or three config
> > options that are boolean, or whatever, is fine by me.
> 
> I think one config option is far better, be it only for the sake of
> documenting it. I hate stuff like "option X does Y if set to Z, but
> only if option A is set to B, ...". It's bad for readers and even worse
> for writers.

I'm fine with that.

> >  
> > > We had a case where the customer ran "dracut --add multipath", but
> > > forgot to run "systemctl enable multipath.service". Because we use
> > > "fIn", multipathd grabbed the device in the initrd, and afterwards
> > > the
> > > root device's subvolumes couldn't be mounted, because systemd would
> > > try
> > > to mount the members, which were locked by the multipath device.
> > 
> > So, I'm still a little fuzzy on what happened here. Are you saying
> > that
> > this case was also an instance of 4C.2 after all? If not that, was
> > the
> > problem that multipath was assembled on the path device, but not
> > having
> > multipathd enabled made "multipath -u" not claim the device after the
> > switch-root, making it appear available for others to use. 
> 
> Exactly. systemd would have been able to mount the other files systems
> through the mpath device, but the path devices show up earlier during
> "coldplug", and once the mount unit is in FAILED state, systemd heads
> towards emergency.
> 
> >  Perhaps in
> > that case, if we fail the check to see if multipathd is
> > running/wanted
> > in "multipath -u", we should still check if the path device is
> > multipathed already, and if it is, we should claim it regardless,
> > because nobody else will be able to.
> 
> Yes, that might have helped in this failure scenario.
> 
> >  
> > > >  As an
> > > > aside, I am personally very wary about reassign_maps. Multipath
> > > > doesn't
> > > > own the other devices it is reloading. There is nothing to
> > > > guarantee
> > > > that someone else isn't trying to modify the LVM device at the
> > > > same
> > > > time. I don't know of a specific bug with this (we never turn it
> > > > on),
> > > > but it seems very risky to start changing devices we don't own
> > > > with
> > > > no
> > > > coordination.
> > > 
> > > I've no experience with reassign_maps. It's tempting to think that
> > > it
> > > could solve all these nasty issues, but my gut feeling, like yours,
> > > says that it might be dangerous. We don't turn it on, either.
> > > 
> > 
> > If neither of us find this safe, we should deprecate it at the least,
> > and probably disable it and print a warning message instead, if it is
> > set in multipath.conf.
> 
> I'm unsure. I don't have the experience to say it does work or it
> doesn't.
> 
> @ALL: (if anyone except Ben read this far :-), can you contribute
> experience with reassign_maps?
> 
> > > I agree your approach can't hurt, although I still think it will
> > > just
> > > make a very unlikely outcome even more unlikely (the 4C.3 case
> > > described above is obviously very special, and your patch wouldn't
> > > avoid it). I hope that our mutual ideas can go together.
> > > 
> > 
> > If we can cut down the cases where we end up waiting, I'm fine with
> > dropping my idea. I had some discussions this week, and people do
> > recognize that there are problems with the existing device assembling
> > code in udev. I'm now pretty confident that sooner or later, we will
> > have the ability to find about how a device was labelled in the past,
> > even between boots. 
> 
> I'd be interested to learn more about this. Do you have a reference?

There's nothing specific posted about this. It was just a conversation
with people who have had conversations with the udev/systemd developers.

-Ben

> Cheers,
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 00/16] multipath path classification
  2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
                   ` (15 preceding siblings ...)
  2018-01-19  0:29 ` [RFC PATCH 16/16] libmultipath: trigger path uevent only when necessary Martin Wilck
@ 2018-03-07  8:53 ` Christophe Varoqui
  2018-03-07  9:26   ` Martin Wilck
  16 siblings, 1 reply; 30+ messages in thread
From: Christophe Varoqui @ 2018-03-07  8:53 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development, Xose Vazquez Perez


[-- Attachment #1.1: Type: text/plain, Size: 2269 bytes --]

Martin, Ben,

can you update me on the status of this patchset ?
Is it ready for inclusion ?

Thanks,
Christophe

On Fri, Jan 19, 2018 at 1:29 AM, Martin Wilck <mwilck@suse.com> wrote:

> This patch series implements the recommendation in my recent posting
> "Multipath path classification revisited". My testing has been surprisingly
> successful so far; more testing is of course needed.
> Anyway, I think it's in a shape that I can ask for review.
>
> I've seen Ben's detailed reply to my posting, but his comments haven't been
> taken into account in this series yet.
>
> Benjamin Marzinski (1):
>   libmultipath: trigger change uevent on new device creation
>
> Martin Wilck (15):
>   Revert "multipath: ignore -i if find_multipaths is set"
>   Revert "multipathd: imply -n if find_multipaths is set"
>   libmultipath: add mpvec param to should_multipath()
>   libmultipath: should_multipath: keep existing maps
>   multipath -u -i: change logic for find_multipaths
>   libmultipath: let ignore_wwids be set in config file
>   multipathd: replace -n with !ignore_wwids
>   multipath.conf.5: document "ignore_wwids"
>   multipath.8: adapt documentation of '-i'
>   multipathd.8: document that '-n' is now ignored
>   multipath: common code path for CMD_VALID_PATH
>   multipath -u/-c: change output to environment/key format
>   multipath -u/-c: add "$DEV is maybe a valid path"
>   multipath.rules: find_multipaths+ignore_wwids logic
>   libmultipath: trigger path uevent only when necessary
>
>  libmultipath/config.c      |  1 +
>  libmultipath/config.h      |  1 -
>  libmultipath/configure.c   | 48 ++++++++++++++++++++++++++++++++---
>  libmultipath/configure.h   |  1 +
>  libmultipath/defaults.h    |  1 +
>  libmultipath/dict.c        |  4 +++
>  libmultipath/wwids.c       | 13 +++++++---
>  libmultipath/wwids.h       |  2 +-
>  multipath/main.c           | 58 ++++++++++++++++++++++--------
> ------------
>  multipath/multipath.8      |  3 ++-
>  multipath/multipath.conf.5 | 31 +++++++++++++++++++++++
>  multipath/multipath.rules  | 63 ++++++++++++++++++++++++++++++
> +++++++++++++---
>  multipathd/main.c          | 16 +++---------
>  multipathd/multipathd.8    |  5 ++--
>  14 files changed, 192 insertions(+), 55 deletions(-)
>
> --
> 2.15.1
>
>

[-- Attachment #1.2: Type: text/html, Size: 3000 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC PATCH 00/16] multipath path classification
  2018-03-07  8:53 ` [RFC PATCH 00/16] multipath path classification Christophe Varoqui
@ 2018-03-07  9:26   ` Martin Wilck
  0 siblings, 0 replies; 30+ messages in thread
From: Martin Wilck @ 2018-03-07  9:26 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Xose Vazquez Perez


Hello Christophe,

On Wed, 2018-03-07 at 09:53 +0100, Christophe Varoqui wrote:
> Martin, Ben,
> 
> can you update me on the status of this patchset ?
> Is it ready for inclusion ?

No. I am preparing an updated set that's much improved.

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-03-07  9:26 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-19  0:29 [RFC PATCH 00/16] multipath path classification Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 01/16] Revert "multipath: ignore -i if find_multipaths is set" Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 02/16] Revert "multipathd: imply -n " Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 03/16] libmultipath: add mpvec param to should_multipath() Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 04/16] libmultipath: should_multipath: keep existing maps Martin Wilck
2018-01-19 16:06   ` Benjamin Marzinski
2018-01-19  0:29 ` [RFC PATCH 05/16] multipath -u -i: change logic for find_multipaths Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 06/16] libmultipath: let ignore_wwids be set in config file Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 07/16] multipathd: replace -n with !ignore_wwids Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 08/16] multipath.conf.5: document "ignore_wwids" Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 09/16] multipath.8: adapt documentation of '-i' Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 10/16] multipathd.8: document that '-n' is now ignored Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 11/16] multipath: common code path for CMD_VALID_PATH Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 12/16] multipath -u/-c: change output to environment/key format Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 13/16] multipath -u/-c: add "$DEV is maybe a valid path" Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 14/16] multipath.rules: find_multipaths+ignore_wwids logic Martin Wilck
2018-01-19 18:12   ` Benjamin Marzinski
2018-01-20  1:20     ` Martin Wilck
2018-01-21  3:21       ` Benjamin Marzinski
2018-01-22 21:56         ` Martin Wilck
2018-01-25 13:40           ` Benjamin Marzinski
2018-01-26 17:29             ` Martin Wilck
2018-01-29 22:28               ` Benjamin Marzinski
2018-01-30 13:07                 ` Martin Wilck
2018-01-30 23:40                   ` Benjamin Marzinski
2018-01-20  0:27   ` [FIX for 14/16] multipath.rules: set job properties for systemd-run correctly Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 15/16] libmultipath: trigger change uevent on new device creation Martin Wilck
2018-01-19  0:29 ` [RFC PATCH 16/16] libmultipath: trigger path uevent only when necessary Martin Wilck
2018-03-07  8:53 ` [RFC PATCH 00/16] multipath path classification Christophe Varoqui
2018-03-07  9:26   ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.