All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/15] Multipath patch dump
@ 2020-01-17  2:18 Benjamin Marzinski
  2020-01-17  2:18 ` [PATCH 01/15] multipathd: warn when configuration has been changed Benjamin Marzinski
                   ` (14 more replies)
  0 siblings, 15 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This patch set is has multiple parts.

patch 01 is just a resubmit of my previous patch, with Martin's
corrections added.

Patches 02 - 06 are miscellaneous fixes and cleanups

Patches 07 - 09 are related to adding a new format wildcard, at the
        request of HPE, that allows multipath to get and display
        information from the vendor specific VPD pages

Patches 10 - 15 are the part that needs the most review, patch 14
        especially. It turns out that on machines with a large number
        of cores, io_destroy(), which is used by the directio checker,
        can take a long time to complete. Talking to some kernel people
        at Red Hat, I was told that this isn't a bug. It's working as
        expected, and multipath shouldn't be issuing so many
        io_destroy() calls (1 per path, when multipath or multipathd
        stops). Cutting out the io_destroy calls involved a major change
        to the directio checker. It's pretty hard to test a lot of the
        corner cases on actual hardware, so I've written a bunch of
        unit tests for this (patch 15).

Benjamin Marzinski (15):
  multipathd: warn when configuration has been changed.
  libmultipath: fix leak in foreign code
  Fix leak in mpathpersist
  libmultipath: remove unused path->prio_args
  libmultipath: constify get_unaligned_be*
  libmultipath: add missing hwe mpe variable merges
  libmultipath: fix sgio_get_vpd looping
  libmultipath: add vend_id to get_vpd_sgio
  libmultipath: add code to get vendor specific vpd data
  libmultipath: change how the checker async is set
  libmultipath: change failed path prio timeout
  multipathd: add new paths under vecs lock
  libmultipath: add new checker class functions
  libmultipath: make directio checker share io contexts
  tests: add directio unit tests

 libmpathpersist/mpath_persist.c  |   2 +-
 libmultipath/checkers.c          |  29 +-
 libmultipath/checkers.h          |   1 +
 libmultipath/checkers/directio.c | 337 ++++++++++++---
 libmultipath/config.c            |  12 +
 libmultipath/config.h            |   3 +
 libmultipath/dict.c              |  34 ++
 libmultipath/discovery.c         |  74 +++-
 libmultipath/discovery.h         |   2 +-
 libmultipath/foreign.c           |  11 +-
 libmultipath/hwtable.c           |   2 +
 libmultipath/print.c             |  27 ++
 libmultipath/prio.c              |   2 +-
 libmultipath/propsel.c           |  26 +-
 libmultipath/propsel.h           |   2 +
 libmultipath/structs.h           |  10 +-
 libmultipath/unaligned.h         |   4 +-
 mpathpersist/main.c              |   1 +
 multipath/main.c                 |   1 +
 multipath/multipath.conf.5       |   8 +
 multipathd/main.c                |  18 +-
 multipathd/uxlsnr.c              | 134 +++++-
 tests/Makefile                   |   3 +-
 tests/directio.c                 | 704 +++++++++++++++++++++++++++++++
 tests/vpd.c                      |  94 +++--
 25 files changed, 1394 insertions(+), 147 deletions(-)
 create mode 100644 tests/directio.c

-- 
2.17.2

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 01/15] multipathd: warn when configuration has been changed.
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 15:45   ` Martin Wilck
  2020-01-17 15:51   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 02/15] libmultipath: fix leak in foreign code Benjamin Marzinski
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

It would be helpful if multipathd could log a message when
multipath.conf or files in the config_dir have been written to, both so
that it can be used to send a notification to users, and to help with
determining after the fact if multipathd was running with an older
config, when the logs of multipathd's behaviour don't match with the
current multipath.conf.

To do this, the multipathd uxlsnr thread now sets up inotify watches on
both /etc/multipath.conf and the config_dir to watch if the files are
deleted or closed after being opened for writing.  In order to keep
uxlsnr from polling repeatedly if the multipath.conf or the config_dir
aren't present, it will only set up the watches once per reconfigure.
However, since multipath.conf is far more likely to be replaced by a
text editor than modified in place, if it gets removed, multipathd will
immediately try to restart the watch on it (which will succeed if the
file was simply replaced by a new copy).  This does mean that if
multipath.conf or the config_dir are actually removed and then later
re-added, multipathd won't log any more messages for changes until the
next reconfigure. But that seems like a fair trade-off to avoid
repeatedly polling for files that aren't likely to appear.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.h |   1 +
 multipathd/main.c     |   1 +
 multipathd/uxlsnr.c   | 134 ++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 130 insertions(+), 6 deletions(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index ffec3103..e69aa07c 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -188,6 +188,7 @@ struct config {
 	int find_multipaths_timeout;
 	int marginal_pathgroups;
 	unsigned int version[3];
+	unsigned int sequence_nr;
 
 	char * multipath_dir;
 	char * selector;
diff --git a/multipathd/main.c b/multipathd/main.c
index 34a57689..7b364cfe 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2618,6 +2618,7 @@ reconfigure (struct vectors * vecs)
 	uxsock_timeout = conf->uxsock_timeout;
 
 	old = rcu_dereference(multipath_conf);
+	conf->sequence_nr = old->sequence_nr + 1;
 	rcu_assign_pointer(multipath_conf, conf);
 	call_rcu(&old->rcu, rcu_free_config);
 
diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index bc71679e..92d9a79a 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -23,6 +23,7 @@
 #include <sys/time.h>
 #include <signal.h>
 #include <stdbool.h>
+#include <sys/inotify.h>
 #include "checkers.h"
 #include "memory.h"
 #include "debug.h"
@@ -51,6 +52,8 @@ struct client {
 LIST_HEAD(clients);
 pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
 struct pollfd *polls;
+int notify_fd = -1;
+char *config_dir;
 
 static bool _socket_client_is_root(int fd);
 
@@ -151,6 +154,8 @@ void uxsock_cleanup(void *arg)
 	long ux_sock = (long)arg;
 
 	close(ux_sock);
+	close(notify_fd);
+	free(config_dir);
 
 	pthread_mutex_lock(&client_lock);
 	list_for_each_entry_safe(client_loop, client_tmp, &clients, node) {
@@ -162,6 +167,106 @@ void uxsock_cleanup(void *arg)
 	free_polls();
 }
 
+/* failing to set the watch descriptor is o.k. we just miss a warning
+ * message */
+void reset_watch(int notify_fd, int *wds, unsigned int *sequence_nr)
+{
+	struct config *conf;
+	int dir_reset = 0;
+	int conf_reset = 0;
+
+	if (notify_fd == -1)
+		return;
+
+	conf = get_multipath_config();
+	/* instead of repeatedly try to reset the inotify watch if
+	 * the config directory or multipath.conf isn't there, just
+	 * do it once per reconfigure */
+	if (*sequence_nr != conf->sequence_nr) {
+		*sequence_nr = conf->sequence_nr;
+		if (wds[0] == -1)
+			conf_reset = 1;
+		if (!config_dir || !conf->config_dir ||
+		    strcmp(config_dir, conf->config_dir)) {
+			dir_reset = 1;
+			if (config_dir)
+				free(config_dir);
+			if (conf->config_dir)
+				config_dir = strdup(conf->config_dir);
+			else
+				config_dir = NULL;
+		} else if (wds[1] == -1)
+			dir_reset = 1;
+	}
+	put_multipath_config(conf);
+
+	if (dir_reset) {
+		if (wds[1] != -1) {
+			inotify_rm_watch(notify_fd, wds[1]);
+			wds[1] = -1;
+		}
+		if (config_dir) {
+			wds[1] = inotify_add_watch(notify_fd, config_dir,
+						   IN_CLOSE_WRITE | IN_DELETE |
+						   IN_ONLYDIR);
+			if (wds[1] == -1)
+				condlog(3, "didn't set up notifications on %s: %s", config_dir, strerror(errno));
+		}
+	}
+	if (conf_reset) {
+		wds[0] = inotify_add_watch(notify_fd, DEFAULT_CONFIGFILE,
+					   IN_CLOSE_WRITE);
+		if (wds[0] == -1)
+			condlog(3, "didn't set up notifications on /etc/multipath.conf: %s", strerror(errno));
+	}
+	return;
+}
+
+void handle_inotify(int fd, int  *wds)
+{
+	char buff[1024]
+		__attribute__ ((aligned(__alignof__(struct inotify_event))));
+	const struct inotify_event *event;
+	ssize_t len;
+	char *ptr;
+	int i, got_notify = 0;
+
+	for (;;) {
+		len = read(fd, buff, sizeof(buff));
+		if (len <= 0) {
+			if (len < 0 && errno != EAGAIN) {
+				condlog(3, "error reading from inotify_fd");
+				for (i = 0; i < 2; i++) {
+					if (wds[i] != -1) {
+						inotify_rm_watch(fd, wds[i]);
+						wds[i] = -1;
+					}
+				}
+			}
+			break;
+		}
+
+		got_notify = 1;
+		for (ptr = buff; ptr < buff + len;
+		     ptr += sizeof(struct inotify_event) + event->len) {
+			event = (const struct inotify_event *) ptr;
+
+			if (event->mask & IN_IGNORED) {
+				/* multipathd.conf may have been overwritten.
+				 * Try once to reset the notification */
+				if (wds[0] == event->wd)
+					wds[0] = inotify_add_watch(notify_fd,
+							DEFAULT_CONFIGFILE,
+							IN_CLOSE_WRITE);
+				else if (wds[1] == event->wd)
+					wds[1] = -1;
+			}
+		}
+	}
+	if (got_notify)
+		condlog(1, "Multipath configuration updated.\nReload multipathd for changes to take effect");
+}
+
 /*
  * entry point
  */
@@ -173,13 +278,19 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 	char *reply;
 	sigset_t mask;
 	int old_clients = MIN_POLLS;
+	/* conf->sequence_nr will be 1 when uxsock_listen is first called */
+	unsigned int sequence_nr = 0;
+	int wds[2] = { -1, -1 };
 
 	condlog(3, "uxsock: startup listener");
-	polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd));
+	polls = (struct pollfd *)MALLOC((MIN_POLLS + 2) * sizeof(struct pollfd));
 	if (!polls) {
 		condlog(0, "uxsock: failed to allocate poll fds");
 		exit_daemon();
 	}
+	notify_fd = inotify_init1(IN_NONBLOCK);
+	if (notify_fd == -1) /* it's fine if notifications fail */
+		condlog(3, "failed to start up configuration notifications");
 	sigfillset(&mask);
 	sigdelset(&mask, SIGINT);
 	sigdelset(&mask, SIGTERM);
@@ -198,18 +309,18 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		if (num_clients != old_clients) {
 			struct pollfd *new;
 			if (num_clients <= MIN_POLLS && old_clients > MIN_POLLS) {
-				new = REALLOC(polls, (1 + MIN_POLLS) *
+				new = REALLOC(polls, (2 + MIN_POLLS) *
 						sizeof(struct pollfd));
 			} else if (num_clients <= MIN_POLLS && old_clients <= MIN_POLLS) {
 				new = polls;
 			} else {
-				new = REALLOC(polls, (1+num_clients) *
+				new = REALLOC(polls, (2 + num_clients) *
 						sizeof(struct pollfd));
 			}
 			if (!new) {
 				pthread_mutex_unlock(&client_lock);
 				condlog(0, "%s: failed to realloc %d poll fds",
-					"uxsock", 1 + num_clients);
+					"uxsock", 2 + num_clients);
 				sched_yield();
 				continue;
 			}
@@ -219,8 +330,15 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		polls[0].fd = ux_sock;
 		polls[0].events = POLLIN;
 
+		reset_watch(notify_fd, wds, &sequence_nr);
+		if (notify_fd == -1 || (wds[0] == -1 && wds[1] == -1))
+			polls[1].fd = -1;
+		else
+			polls[1].fd = notify_fd;
+		polls[1].events = POLLIN;
+
 		/* setup the clients */
-		i = 1;
+		i = 2;
 		list_for_each_entry(c, &clients, node) {
 			polls[i].fd = c->fd;
 			polls[i].events = POLLIN;
@@ -262,7 +380,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		}
 
 		/* see if a client wants to speak to us */
-		for (i = 1; i < num_clients + 1; i++) {
+		for (i = 2; i < num_clients + 2; i++) {
 			if (polls[i].revents & POLLIN) {
 				struct timespec start_time;
 
@@ -321,6 +439,10 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		if (polls[0].revents & POLLIN) {
 			new_client(ux_sock);
 		}
+
+		/* handle inotify events on config files */
+		if (polls[1].revents & POLLIN)
+			handle_inotify(notify_fd, wds);
 	}
 
 	return NULL;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 02/15] libmultipath: fix leak in foreign code
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
  2020-01-17  2:18 ` [PATCH 01/15] multipathd: warn when configuration has been changed Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 15:50   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 03/15] Fix leak in mpathpersist Benjamin Marzinski
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If scandir fails or finds no foreign libraries, enable_re needs to be
freed before exitting.

Fixes: 8d03eda4 'multipath.conf: add "enable_foreign" parameter'
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/foreign.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/libmultipath/foreign.c b/libmultipath/foreign.c
index 4b34e141..68e9a9b8 100644
--- a/libmultipath/foreign.c
+++ b/libmultipath/foreign.c
@@ -129,7 +129,7 @@ static int _init_foreign(const char *multipath_dir, const char *enable)
 	char pathbuf[PATH_MAX];
 	struct dirent **di;
 	struct scandir_result sr;
-	int r, i;
+	int r, i, ret = 0;
 	regex_t *enable_re = NULL;
 
 	foreigns = vector_alloc();
@@ -157,13 +157,15 @@ static int _init_foreign(const char *multipath_dir, const char *enable)
 	if (r == 0) {
 		condlog(3, "%s: no foreign multipath libraries found",
 			__func__);
-		return 0;
+		ret = 0;
+		goto out;
 	} else if (r < 0) {
 		r = errno;
 		condlog(1, "%s: error %d scanning foreign multipath libraries",
 			__func__, r);
 		_cleanup_foreign();
-		return -r;
+		ret = -r;
+		goto out;
 	}
 
 	sr.di = di;
@@ -250,8 +252,9 @@ static int _init_foreign(const char *multipath_dir, const char *enable)
 		free_foreign(fgn);
 	}
 	pthread_cleanup_pop(1); /* free_scandir_result */
+out:
 	pthread_cleanup_pop(1); /* free_pre */
-	return 0;
+	return ret;
 }
 
 int init_foreign(const char *multipath_dir, const char *enable)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 03/15] Fix leak in mpathpersist
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
  2020-01-17  2:18 ` [PATCH 01/15] multipathd: warn when configuration has been changed Benjamin Marzinski
  2020-01-17  2:18 ` [PATCH 02/15] libmultipath: fix leak in foreign code Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 15:53   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 04/15] libmultipath: remove unused path->prio_args Benjamin Marzinski
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If the persistent in command fails, the response buffer must be freed.
Found by Coverity

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 mpathpersist/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mpathpersist/main.c b/mpathpersist/main.c
index 278b8d51..920e686c 100644
--- a/mpathpersist/main.c
+++ b/mpathpersist/main.c
@@ -499,6 +499,7 @@ static int handle_args(int argc, char * argv[], int nline)
 		if (ret != MPATH_PR_SUCCESS )
 		{
 			fprintf (stderr, "Persistent Reserve IN command failed\n");
+			free(resp);
 			goto out_fd;
 		}
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 04/15] libmultipath: remove unused path->prio_args
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 03/15] Fix leak in mpathpersist Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 15:55   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 05/15] libmultipath: constify get_unaligned_be* Benjamin Marzinski
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/structs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index a3adf906..1c32a799 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -272,7 +272,6 @@ struct path {
 	char * uid_attribute;
 	char * getuid;
 	struct prio prio;
-	char * prio_args;
 	struct checker checker;
 	struct multipath * mpp;
 	int fd;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 05/15] libmultipath: constify get_unaligned_be*
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 04/15] libmultipath: remove unused path->prio_args Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 15:57   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 06/15] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/unaligned.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libmultipath/unaligned.h b/libmultipath/unaligned.h
index 68c07742..b9eaa7cb 100644
--- a/libmultipath/unaligned.h
+++ b/libmultipath/unaligned.h
@@ -10,14 +10,14 @@ static inline uint16_t get_unaligned_be16(const void *ptr)
 	return p[0] << 8 | p[1];
 }
 
-static inline uint32_t get_unaligned_be32(void *ptr)
+static inline uint32_t get_unaligned_be32(const void *ptr)
 {
 	const uint8_t *p = ptr;
 
 	return p[0] << 24 | p[1] << 16 | p[2] << 8 | p[3];
 }
 
-static inline uint64_t get_unaligned_be64(void *ptr)
+static inline uint64_t get_unaligned_be64(const void *ptr)
 {
 	uint32_t low = get_unaligned_be32(ptr + 4);
 	uint64_t high = get_unaligned_be32(ptr);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 06/15] libmultipath: add missing hwe mpe variable merges
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (4 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 05/15] libmultipath: constify get_unaligned_be* Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 16:04   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 07/15] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

There were some variables in the hwe and mpe structs that weren't being
merged by merge_hwe() and merge_mpe().

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 20e3b8bf..85626e96 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -372,6 +372,10 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
 	merge_num(san_path_err_threshold);
 	merge_num(san_path_err_forget_rate);
 	merge_num(san_path_err_recovery_time);
+	merge_num(marginal_path_err_sample_time);
+	merge_num(marginal_path_err_rate_threshold);
+	merge_num(marginal_path_err_recheck_gap_time);
+	merge_num(marginal_path_double_failed_time);
 
 	snprintf(id, sizeof(id), "%s/%s", dst->vendor, dst->product);
 	reconcile_features_with_options(id, &dst->features,
@@ -397,6 +401,7 @@ merge_mpe(struct mpentry *dst, struct mpentry *src)
 	if (dst->prkey_source == PRKEY_SOURCE_NONE &&
 	    src->prkey_source != PRKEY_SOURCE_NONE) {
 		dst->prkey_source = src->prkey_source;
+		dst->sa_flags = src->sa_flags;
 		memcpy(&dst->reservation_key, &src->reservation_key,
 		       sizeof(dst->reservation_key));
 	}
@@ -413,6 +418,9 @@ merge_mpe(struct mpentry *dst, struct mpentry *src)
 	merge_num(deferred_remove);
 	merge_num(delay_watch_checks);
 	merge_num(delay_wait_checks);
+	merge_num(san_path_err_threshold);
+	merge_num(san_path_err_forget_rate);
+	merge_num(san_path_err_recovery_time);
 	merge_num(marginal_path_err_sample_time);
 	merge_num(marginal_path_err_rate_threshold);
 	merge_num(marginal_path_err_recheck_gap_time);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 07/15] libmultipath: fix sgio_get_vpd looping
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (5 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 06/15] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 16:27   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 08/15] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If do_inq returns a page with a length that is less than maxlen, but
larger than DEFAULT_SGIO_LEN, this function will loop forever. Also
if do_inq returns with a length equal to or greater than maxlen,
sgio_get_vpd will exit immediately, even if it hasn't read the entire
page.  Fix these issues, modify the tests to verify the new behavior.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 12 +++---
 tests/vpd.c              | 84 ++++++++++++++++++++++++----------------
 2 files changed, 57 insertions(+), 39 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 72f455e8..3c72a80a 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -870,6 +870,7 @@ static int
 sgio_get_vpd (unsigned char * buff, int maxlen, int fd, int pg)
 {
 	int len = DEFAULT_SGIO_LEN;
+	int rlen;
 
 	if (fd < 0) {
 		errno = EBADF;
@@ -877,12 +878,11 @@ sgio_get_vpd (unsigned char * buff, int maxlen, int fd, int pg)
 	}
 retry:
 	if (0 == do_inq(fd, 0, 1, pg, buff, len)) {
-		len = get_unaligned_be16(&buff[2]) + 4;
-		if (len >= maxlen)
-			return len;
-		if (len > DEFAULT_SGIO_LEN)
-			goto retry;
-		return len;
+		rlen = get_unaligned_be16(&buff[2]) + 4;
+		if (rlen <= len || len >= maxlen)
+			return rlen;
+		len = (rlen < maxlen)? rlen : maxlen;
+		goto retry;
 	}
 	return -1;
 }
diff --git a/tests/vpd.c b/tests/vpd.c
index d9f80eaa..4dbce010 100644
--- a/tests/vpd.c
+++ b/tests/vpd.c
@@ -306,7 +306,7 @@ static int create_vpd83(unsigned char *buf, size_t bufsiz, const char *id,
 	default:
 		break;
 	}
-	put_unaligned_be16(n, buf + 2);
+	put_unaligned_be16(bufsiz, buf + 2);
 	return n + 4;
 }
 
@@ -429,6 +429,8 @@ static void test_vpd_vnd_ ## len ## _ ## wlen(void **state)             \
 	free(exp_wwid);							\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
+	will_return(__wrap_ioctl, n);					\
+	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
 	assert_correct_wwid("test_vpd_vnd_" #len "_" #wlen,		\
 			    exp_len, ret, '1', 0, false,		\
@@ -459,6 +461,8 @@ static void test_vpd_str_ ## typ ## _ ## len ## _ ## wlen(void **state) \
 		exp_len = wlen - 1;					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
+	will_return(__wrap_ioctl, n);					\
+	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
 	assert_correct_wwid("test_vpd_str_" #typ "_" #len "_" #wlen,	\
 			    exp_len, ret, byte0[type], 0,		\
@@ -496,6 +500,8 @@ static void test_vpd_naa_ ## naa ## _ ## wlen(void **state)             \
 			 3, naa, 0);					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
+	will_return(__wrap_ioctl, n);					\
+	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
 	assert_correct_wwid("test_vpd_naa_" #naa "_" #wlen,		\
 			    exp_len, ret, '3', '0' + naa, true,		\
@@ -506,22 +512,26 @@ static void test_vpd_naa_ ## naa ## _ ## wlen(void **state)             \
  * test_vpd_eui_LEN_WLEN() - test code for VPD 83, EUI64
  * @LEN:	EUI64 length (8, 12, or 16)
  * @WLEN:	WWID buffer size
+ * @SML:	Use small VPD page size
  */
-#define make_test_vpd_eui(len, wlen)					\
-static void test_vpd_eui_ ## len ## _ ## wlen(void **state)             \
+#define make_test_vpd_eui(len, wlen, sml)				\
+static void test_vpd_eui_ ## len ## _ ## wlen ## _ ## sml(void **state)	\
 {									\
 	struct vpdtest *vt = *state;                                    \
 	int n, ret;							\
 	/* returned size is always uneven */				\
 	int exp_len = wlen > 2 * len + 1 ? 2 * len + 1 :		\
 		wlen % 2 == 0 ? wlen - 1 : wlen - 2;			\
+	int bufsize = sml ? 255 : sizeof(vt->vpdbuf);			\
 									\
-	n = create_vpd83(vt->vpdbuf, sizeof(vt->vpdbuf), test_id,	\
+	n = create_vpd83(vt->vpdbuf, bufsize, test_id,			\
 			 2, 0, len);					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
+	will_return(__wrap_ioctl, n);					\
+	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
-	assert_correct_wwid("test_vpd_eui_" #len "_" #wlen,		\
+	assert_correct_wwid("test_vpd_eui_" #len "_" #wlen "_" #sml,	\
 			    exp_len, ret, '2', 0, true,			\
 			    test_id, vt->wwid);				\
 }
@@ -603,25 +613,30 @@ make_test_vpd_vnd(20, 10);
 make_test_vpd_vnd(10, 10);
 
 /* EUI64 tests */
+/* small vpd page test */
+make_test_vpd_eui(8, 32, 1);
+make_test_vpd_eui(12, 32, 1);
+make_test_vpd_eui(16, 40, 1);
+
 /* 64bit, WWID size: 18 */
-make_test_vpd_eui(8, 32);
-make_test_vpd_eui(8, 18);
-make_test_vpd_eui(8, 17);
-make_test_vpd_eui(8, 16);
-make_test_vpd_eui(8, 10);
+make_test_vpd_eui(8, 32, 0);
+make_test_vpd_eui(8, 18, 0);
+make_test_vpd_eui(8, 17, 0);
+make_test_vpd_eui(8, 16, 0);
+make_test_vpd_eui(8, 10, 0);
 
 /* 96 bit, WWID size: 26 */
-make_test_vpd_eui(12, 32);
-make_test_vpd_eui(12, 26);
-make_test_vpd_eui(12, 25);
-make_test_vpd_eui(12, 20);
-make_test_vpd_eui(12, 10);
+make_test_vpd_eui(12, 32, 0);
+make_test_vpd_eui(12, 26, 0);
+make_test_vpd_eui(12, 25, 0);
+make_test_vpd_eui(12, 20, 0);
+make_test_vpd_eui(12, 10, 0);
 
 /* 128 bit, WWID size: 34 */
-make_test_vpd_eui(16, 40);
-make_test_vpd_eui(16, 34);
-make_test_vpd_eui(16, 33);
-make_test_vpd_eui(16, 20);
+make_test_vpd_eui(16, 40, 0);
+make_test_vpd_eui(16, 34, 0);
+make_test_vpd_eui(16, 33, 0);
+make_test_vpd_eui(16, 20, 0);
 
 /* NAA IEEE registered extended (36), WWID size: 34 */
 make_test_vpd_naa(6, 40);
@@ -722,20 +737,23 @@ static int test_vpd(void)
 		cmocka_unit_test(test_vpd_vnd_19_20),
 		cmocka_unit_test(test_vpd_vnd_20_10),
 		cmocka_unit_test(test_vpd_vnd_10_10),
-		cmocka_unit_test(test_vpd_eui_8_32),
-		cmocka_unit_test(test_vpd_eui_8_18),
-		cmocka_unit_test(test_vpd_eui_8_17),
-		cmocka_unit_test(test_vpd_eui_8_16),
-		cmocka_unit_test(test_vpd_eui_8_10),
-		cmocka_unit_test(test_vpd_eui_12_32),
-		cmocka_unit_test(test_vpd_eui_12_26),
-		cmocka_unit_test(test_vpd_eui_12_25),
-		cmocka_unit_test(test_vpd_eui_12_20),
-		cmocka_unit_test(test_vpd_eui_12_10),
-		cmocka_unit_test(test_vpd_eui_16_40),
-		cmocka_unit_test(test_vpd_eui_16_34),
-		cmocka_unit_test(test_vpd_eui_16_33),
-		cmocka_unit_test(test_vpd_eui_16_20),
+		cmocka_unit_test(test_vpd_eui_8_32_1),
+		cmocka_unit_test(test_vpd_eui_12_32_1),
+		cmocka_unit_test(test_vpd_eui_16_40_1),
+		cmocka_unit_test(test_vpd_eui_8_32_0),
+		cmocka_unit_test(test_vpd_eui_8_18_0),
+		cmocka_unit_test(test_vpd_eui_8_17_0),
+		cmocka_unit_test(test_vpd_eui_8_16_0),
+		cmocka_unit_test(test_vpd_eui_8_10_0),
+		cmocka_unit_test(test_vpd_eui_12_32_0),
+		cmocka_unit_test(test_vpd_eui_12_26_0),
+		cmocka_unit_test(test_vpd_eui_12_25_0),
+		cmocka_unit_test(test_vpd_eui_12_20_0),
+		cmocka_unit_test(test_vpd_eui_12_10_0),
+		cmocka_unit_test(test_vpd_eui_16_40_0),
+		cmocka_unit_test(test_vpd_eui_16_34_0),
+		cmocka_unit_test(test_vpd_eui_16_33_0),
+		cmocka_unit_test(test_vpd_eui_16_20_0),
 		cmocka_unit_test(test_vpd_naa_6_40),
 		cmocka_unit_test(test_vpd_naa_6_34),
 		cmocka_unit_test(test_vpd_naa_6_33),
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 08/15] libmultipath: add vend_id to get_vpd_sgio
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (6 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 07/15] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 16:58   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 09/15] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This tells multipath how it should decode vendor specific pages. It will
be used by a future patch.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c |  4 ++--
 libmultipath/discovery.h |  2 +-
 libmultipath/propsel.c   |  2 +-
 tests/vpd.c              | 10 +++++-----
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 3c72a80a..1d79cbae 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1135,7 +1135,7 @@ get_vpd_sysfs (struct udev_device *parent, int pg, char * str, int maxlen)
 }
 
 int
-get_vpd_sgio (int fd, int pg, char * str, int maxlen)
+get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen)
 {
 	int len, buff_len;
 	unsigned char buff[4096];
@@ -1810,7 +1810,7 @@ static ssize_t uid_fallback(struct path *pp, int path_state,
 		if (len < 0 && path_state == PATH_UP) {
 			condlog(1, "%s: failed to get sysfs uid: %s",
 				pp->dev, strerror(-len));
-			len = get_vpd_sgio(pp->fd, 0x83, pp->wwid,
+			len = get_vpd_sgio(pp->fd, 0x83, 0, pp->wwid,
 					   WWID_SIZE);
 			*origin = "sgio";
 		}
diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h
index 8d04c2af..2f2fd9eb 100644
--- a/libmultipath/discovery.h
+++ b/libmultipath/discovery.h
@@ -35,7 +35,7 @@ int path_get_tpgs(struct path *pp); /* This function never returns TPGS_UNDEF */
 int do_tur (char *);
 int path_offline (struct path *);
 int get_state (struct path * pp, struct config * conf, int daemon, int state);
-int get_vpd_sgio (int fd, int pg, char * str, int maxlen);
+int get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen);
 int pathinfo (struct path * pp, struct config * conf, int mask);
 int alloc_path_with_pathinfo (struct config *conf, struct udev_device *udevice,
 			      const char *wwid, int flag, struct path **pp_ptr);
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 27e8d68a..b5b5b89f 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -490,7 +490,7 @@ check_rdac(struct path * pp)
 	if (__do_set_from_hwe(checker_name, pp, checker_name) &&
 	    strcmp(checker_name, RDAC))
 		return 0;
-	len = get_vpd_sgio(pp->fd, 0xC9, buff, 44);
+	len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);
 	if (len <= 0)
 		return 0;
 	return !(memcmp(buff + 4, "vac1", 4));
diff --git a/tests/vpd.c b/tests/vpd.c
index 4dbce010..02d6e0bb 100644
--- a/tests/vpd.c
+++ b/tests/vpd.c
@@ -431,7 +431,7 @@ static void test_vpd_vnd_ ## len ## _ ## wlen(void **state)             \
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_vnd_" #len "_" #wlen,		\
 			    exp_len, ret, '1', 0, false,		\
 			    exp_subst, vt->wwid);			\
@@ -463,7 +463,7 @@ static void test_vpd_str_ ## typ ## _ ## len ## _ ## wlen(void **state) \
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_str_" #typ "_" #len "_" #wlen,	\
 			    exp_len, ret, byte0[type], 0,		\
 			    type != STR_IQN,				\
@@ -502,7 +502,7 @@ static void test_vpd_naa_ ## naa ## _ ## wlen(void **state)             \
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_naa_" #naa "_" #wlen,		\
 			    exp_len, ret, '3', '0' + naa, true,		\
 			    test_id, vt->wwid);				\
@@ -530,7 +530,7 @@ static void test_vpd_eui_ ## len ## _ ## wlen ## _ ## sml(void **state)	\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_eui_" #len "_" #wlen "_" #sml,	\
 			    exp_len, ret, '2', 0, true,			\
 			    test_id, vt->wwid);				\
@@ -557,7 +557,7 @@ static void test_vpd80_ ## size ## _ ## len ## _ ## wlen(void **state)  \
 			 size, len);					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x80, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x80, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd80_" #size "_" #len "_" #wlen,	\
 			    exp_len, ret, 0, 0, false,			\
 			    input, vt->wwid);				\
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (7 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 08/15] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 16:56   ` Martin Wilck
  2020-01-18 17:34   ` John Stoffel
  2020-01-17  2:18 ` [PATCH 10/15] libmultipath: change how the checker async is set Benjamin Marzinski
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This adds the wildcard 'g' for multipath and path formatted printing,
which returns extra data from a device's vendor specific vpd page.  The
specific vendor vpd page to use, and the vendor/product id to decode it
can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It can
be configured in the devices section of multipath.conf with the
vpd_vendor parameter. Currently, the only devices that use this are HPE
3PAR arrays, to return the Volume Name.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c      |  4 ++++
 libmultipath/config.h      |  2 ++
 libmultipath/dict.c        | 34 ++++++++++++++++++++++++++++++++++
 libmultipath/discovery.c   | 34 +++++++++++++++++++++++++++++++++-
 libmultipath/hwtable.c     |  2 ++
 libmultipath/print.c       | 27 +++++++++++++++++++++++++++
 libmultipath/propsel.c     | 24 ++++++++++++++++++++++++
 libmultipath/propsel.h     |  2 ++
 libmultipath/structs.h     |  9 +++++++++
 multipath/multipath.conf.5 |  8 ++++++++
 10 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 85626e96..72b8d37c 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -369,6 +369,8 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
 	merge_num(max_sectors_kb);
 	merge_num(ghost_delay);
 	merge_num(all_tg_pt);
+	merge_num(vpd_vendor_pg);
+	merge_num(vpd_vendor_id);
 	merge_num(san_path_err_threshold);
 	merge_num(san_path_err_forget_rate);
 	merge_num(san_path_err_recovery_time);
@@ -517,6 +519,8 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
 	hwe->detect_prio = dhwe->detect_prio;
 	hwe->detect_checker = dhwe->detect_checker;
 	hwe->ghost_delay = dhwe->ghost_delay;
+	hwe->vpd_vendor_pg = dhwe->vpd_vendor_pg;
+	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
 
 	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe->bl_product)))
 		goto out;
diff --git a/libmultipath/config.h b/libmultipath/config.h
index e69aa07c..589146de 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -87,6 +87,8 @@ struct hwentry {
 	int max_sectors_kb;
 	int ghost_delay;
 	int all_tg_pt;
+	int vpd_vendor_pg;
+	int vpd_vendor_id;
 	char * bl_product;
 };
 
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index 2b046e1d..d6d8b79b 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -1366,6 +1366,39 @@ def_uxsock_timeout_handler(struct config *conf, vector strvec)
 	return 0;
 }
 
+static int
+hw_vpd_vendor_handler(struct config *conf, vector strvec)
+{
+	int rc = 0;
+	char *buff;
+
+	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
+	if (!hwe)
+		return 1;
+
+	buff = set_value(strvec);
+	if (!buff)
+		return 1;
+	if (strcmp(buff, "hp3par") == 0) {
+		hwe->vpd_vendor_pg = 0xc0;
+		hwe->vpd_vendor_id = VPD_VP_HP3PAR;
+	} else
+		rc = 1;
+	FREE(buff);
+	return rc;
+}
+
+static int
+snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
+		      const void * data)
+{
+	const struct hwentry * hwe = (const struct hwentry *)data;
+
+	if (hwe->vpd_vendor_pg == 0xc0 && hwe->vpd_vendor_id == VPD_VP_HP3PAR)
+		return snprintf(buff, len, "hp3par");
+	return 0;
+}
+
 /*
  * blacklist block handlers
  */
@@ -1806,6 +1839,7 @@ init_keywords(vector keywords)
 	install_keyword("max_sectors_kb", &hw_max_sectors_kb_handler, &snprint_hw_max_sectors_kb);
 	install_keyword("ghost_delay", &hw_ghost_delay_handler, &snprint_hw_ghost_delay);
 	install_keyword("all_tg_pt", &hw_all_tg_pt_handler, &snprint_hw_all_tg_pt);
+	install_keyword("vpd_vendor", &hw_vpd_vendor_handler, &snprint_hw_vpd_vendor);
 	install_sublevel_end();
 
 	install_keyword_root("overrides", &overrides_handler);
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 1d79cbae..d2773c3a 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1103,6 +1103,30 @@ parse_vpd_pg83(const unsigned char *in, size_t in_len,
 	return len;
 }
 
+static int
+parse_vpd_c0_hp3par(const unsigned char *in, size_t in_len,
+		    char *out, size_t out_len)
+{
+	size_t len;
+
+	memset(out, 0x0, out_len);
+	if (in_len <= 4 || (in[4] > 3 && in_len < 44)) {
+		condlog(3, "HP/3PAR vendor specific VPD page length too short: %lu", in_len);
+		return -EINVAL;
+	}
+	if (in[4] <= 3) /* revision must be > 3 to have Vomlume Name */
+		return -ENODATA;
+	len = get_unaligned_be32(&in[40]);
+	if (len > out_len || len + 44 > in_len) {
+		condlog(3, "HP/3PAR vendor specific Volume name too long: %lu",
+			len);
+		return -EINVAL;
+	}
+	memcpy(out, &in[44], len);
+	out[out_len - 1] = '\0';
+	return len;
+}
+
 static int
 get_vpd_sysfs (struct udev_device *parent, int pg, char * str, int maxlen)
 {
@@ -1170,7 +1194,9 @@ get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen)
 			len = (buff_len <= maxlen)? buff_len : maxlen;
 			memcpy (str, buff, len);
 		}
-	} else
+	} else if (pg == 0xc0 && vend_id == VPD_VP_HP3PAR)
+		len = parse_vpd_c0_hp3par(buff, buff_len, str, maxlen);
+	else
 		len = -ENOSYS;
 
 	return len;
@@ -1544,6 +1570,12 @@ scsi_ioctl_pathinfo (struct path * pp, struct config *conf, int mask)
 	if (!(mask & DI_SERIAL))
 		return;
 
+	select_vpd_vendor_pg(conf, pp);
+	select_vpd_vendor_id(conf, pp);
+
+	if (pp->vpd_vendor_pg != 0 && get_vpd_sgio(pp->fd, pp->vpd_vendor_pg, pp->vpd_vendor_id, pp->vpd_data, sizeof(pp->vpd_data)) < 0)
+		condlog(3, "%s: failed to get extra vpd data", pp->dev);
+
 	parent = pp->udev;
 	while (parent) {
 		const char *subsys = udev_device_get_subsystem(parent);
diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 16627ec5..1f27450c 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -117,6 +117,8 @@ static struct hwentry default_hw[] = {
 		.no_path_retry = 18,
 		.fast_io_fail  = 10,
 		.dev_loss      = MAX_DEV_LOSS_TMO,
+		.vpd_vendor_pg = 0xc0,
+		.vpd_vendor_id = VPD_VP_HP3PAR,
 	},
 	{
 		/* RA8000 / ESA12000 */
diff --git a/libmultipath/print.c b/libmultipath/print.c
index 907469ad..0aafe3cb 100644
--- a/libmultipath/print.c
+++ b/libmultipath/print.c
@@ -358,6 +358,23 @@ snprint_action (char * buff, size_t len, const struct multipath * mpp)
 	}
 }
 
+static int
+snprint_multipath_vpd_data(char * buff, size_t len,
+			   const struct multipath * mpp)
+{
+	struct pathgroup * pgp;
+	struct path * pp;
+	int i, j;
+
+	vector_foreach_slot(mpp->pg, pgp, i) {
+		vector_foreach_slot(pgp->paths, pp, j) {
+			if (strlen(pp->vpd_data))
+				return snprintf(buff, len, "%s", pp->vpd_data);
+		}
+	}
+	return 0;
+}
+
 /*
  * path info printing functions
  */
@@ -688,6 +705,14 @@ snprint_path_marginal(char * buff, size_t len, const struct path * pp)
 	return snprintf(buff, len, "normal");
 }
 
+static int
+snprint_path_vpd_data(char * buff, size_t len, const struct path * pp)
+{
+	if (strlen(pp->vpd_data) > 0)
+		return snprintf(buff, len, "%s", pp->vpd_data);
+	return 0;
+}
+
 struct multipath_data mpd[] = {
 	{'n', "name",          0, snprint_name},
 	{'w', "uuid",          0, snprint_multipath_uuid},
@@ -712,6 +737,7 @@ struct multipath_data mpd[] = {
 	{'p', "prod",          0, snprint_multipath_prod},
 	{'e', "rev",           0, snprint_multipath_rev},
 	{'G', "foreign",       0, snprint_multipath_foreign},
+	{'g', "vpd page data", 0, snprint_multipath_vpd_data},
 	{0, NULL, 0 , NULL}
 };
 
@@ -737,6 +763,7 @@ struct path_data pd[] = {
 	{'r', "target WWPN",   0, snprint_tgt_wwpn},
 	{'a', "host adapter",  0, snprint_host_adapter},
 	{'G', "foreign",       0, snprint_path_foreign},
+	{'g', "vpd page data", 0, snprint_path_vpd_data},
 	{'0', "failures",      0, snprint_path_failures},
 	{'P', "protocol",      0, snprint_path_protocol},
 	{0, NULL, 0 , NULL}
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index b5b5b89f..3c99f2d4 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -1203,3 +1203,27 @@ out:
 		origin);
 	return 0;
 }
+
+int select_vpd_vendor_pg (struct config *conf, struct path *pp)
+{
+	const char *origin;
+
+	pp_set_hwe(vpd_vendor_pg);
+	pp_set_default(vpd_vendor_pg, 0);
+out:
+	condlog(3, "%s: vpd_vendor_pg = 0x%x %s", pp->dev, pp->vpd_vendor_pg,
+		origin);
+	return 0;
+}
+
+int select_vpd_vendor_id (struct config *conf, struct path *pp)
+{
+	const char *origin;
+
+	pp_set_hwe(vpd_vendor_id);
+	pp_set_default(vpd_vendor_id, 0);
+out:
+	condlog(3, "%s: vpd_vendor_id = 0x%x %s", pp->dev, pp->vpd_vendor_id,
+		origin);
+	return 0;
+}
diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
index ddfd6262..3f6d319a 100644
--- a/libmultipath/propsel.h
+++ b/libmultipath/propsel.h
@@ -37,3 +37,5 @@ void reconcile_features_with_options(const char *id, char **features,
 				     int* no_path_retry,
 				     int *retain_hwhandler);
 int select_all_tg_pt (struct config *conf, struct multipath * mp);
+int select_vpd_vendor_pg (struct config *conf, struct path *pp);
+int select_vpd_vendor_id (struct config *conf, struct path *pp);
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index 1c32a799..1ad5f64a 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -21,6 +21,7 @@
 #define HOST_NAME_LEN		16
 #define SLOT_NAME_SIZE		40
 #define PRKEY_SIZE		19
+#define VPD_DATA_SIZE		128
 
 #define SCSI_VENDOR_SIZE	9
 #define SCSI_PRODUCT_SIZE	17
@@ -243,6 +244,11 @@ struct hd_geometry {
 };
 #endif
 
+/*
+ * from sg_vpd_vendor.c
+ */
+#define VPD_VP_HP3PAR 4
+
 struct path {
 	char dev[FILE_NAME_SIZE];
 	char dev_t[BLK_DEV_SIZE];
@@ -255,6 +261,7 @@ struct path {
 	char rev[PATH_REV_SIZE];
 	char serial[SERIAL_SIZE];
 	char tgt_node_name[NODE_NAME_SIZE];
+	char vpd_data[VPD_DATA_SIZE];
 	unsigned long long size;
 	unsigned int checkint;
 	unsigned int tick;
@@ -287,6 +294,8 @@ struct path {
 	int io_err_pathfail_starttime;
 	int find_multipaths_timeout;
 	int marginal;
+	int vpd_vendor_pg;
+	int vpd_vendor_id;
 	/* configlet pointers */
 	vector hwe;
 	struct gen_path generic_path;
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index e866da23..dc103fd8 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -1472,6 +1472,14 @@ the \fIproduct\fR attribute set to the value of \fIproduct_blacklist\fR.
 The user_friendly_names prefix to use for this
 device type, instead of the default "mpath".
 .TP
+.B vpd_vendor
+The vendor specific vpd page information, using the vpd page abbreviation.
+The vpd page abbreviation can be found by running \fIsg_vpd -e\fR. multipathd
+will use this information to gather device specific information that can be
+displayed with the \fI%g\fR wilcard for the \fImultipathd show maps format\fR
+and \fImultipathd show paths format\fR commands. Currently only the
+\fBhp3par\fR vpd page is supported.
+.TP
 .B hardware_handler
 The hardware handler to use for this device type.
 The following hardware handler are implemented:
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (8 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 09/15] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 17:02   ` Martin Wilck
                     ` (2 more replies)
  2020-01-17  2:18 ` [PATCH 11/15] libmultipath: change failed path prio timeout Benjamin Marzinski
                   ` (4 subsequent siblings)
  14 siblings, 3 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The way that the checkers async mode worked in multipathd didn't make
much sense. When multipathd starts up, all checker classes are in the
sync mode, and any pathinfo() calls on devices would run the checker in
sync mode. However, the First time a checker class was used in
checkerloop, it would set that checker class to async (assuming
force_sync wasn't set).  After that, no matter when a checker from that
class was called, it would always run in async mode.  Multipathd doesn't
need to run checkers in sync mode at all, so don't.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmpathpersist/mpath_persist.c |  2 +-
 libmultipath/discovery.c        | 10 ++++------
 multipath/main.c                |  1 +
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
index 603cfc3b..b2238f00 100644
--- a/libmpathpersist/mpath_persist.c
+++ b/libmpathpersist/mpath_persist.c
@@ -47,7 +47,7 @@ mpath_lib_init (void)
 		condlog(0, "Failed to initialize multipath config.");
 		return NULL;
 	}
-
+	conf->force_sync = 1;
 	set_max_fds(conf->max_fds);
 
 	return conf;
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index d2773c3a..1ab093f4 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1643,12 +1643,10 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
 	if (pp->mpp && !c->mpcontext)
 		checker_mp_init(c, &pp->mpp->mpcontext);
 	checker_clear_message(c);
-	if (daemon) {
-		if (conf->force_sync == 0)
-			checker_set_async(c);
-		else
-			checker_set_sync(c);
-	}
+	if (conf->force_sync == 0)
+		checker_set_async(c);
+	else
+		checker_set_sync(c);
 	if (!conf->checker_timeout &&
 	    sysfs_get_timeout(pp, &(c->timeout)) <= 0)
 		c->timeout = DEF_TIMEOUT;
diff --git a/multipath/main.c b/multipath/main.c
index 4f4d8e89..aebabd9b 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -905,6 +905,7 @@ main (int argc, char *argv[])
 		exit(RTVL_FAIL);
 	multipath_conf = conf;
 	conf->retrigger_tries = 0;
+	conf->force_sync = 1;
 	while ((arg = getopt(argc, argv, ":adcChl::FfM:v:p:b:BrR:itTquUwW")) != EOF ) {
 		switch(arg) {
 		case 1: printf("optarg : %s\n",optarg);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 11/15] libmultipath: change failed path prio timeout
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (9 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 10/15] libmultipath: change how the checker async is set Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 17:18   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 12/15] multipathd: add new paths under vecs lock Benjamin Marzinski
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

multipath will try to get the priority from a PATH_DOWN path, if the
path doesn't currently have a valid priority. However, if the priority
code needs to contact the device to get the priority, this is likely to
fail for PATH_DOWN paths.  This code dates back to when multipathd could
not easily reload device tables with failed paths, so getting the
correct priority was important to have a correctly configured device.
Now multipathd can simply reload the device to move the path to the
correct pathgroup when the path comes back up.  Since there are a number
of prioritizers that don't require talking to the device, multipath
shouldn't completely skip attempting to get the priority of these paths,
but it should set a small timeout, so that it isn't hanging in the
case where it needs to contact a device through a failed path.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 14 ++++++--------
 libmultipath/prio.c      |  2 +-
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 1ab093f4..2c331db8 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1661,11 +1661,10 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
 }
 
 static int
-get_prio (struct path * pp)
+get_prio (struct path * pp, int timeout)
 {
 	struct prio * p;
 	struct config *conf;
-	int checker_timeout;
 	int old_prio;
 
 	if (!pp)
@@ -1684,11 +1683,8 @@ get_prio (struct path * pp)
 			return 1;
 		}
 	}
-	conf = get_multipath_config();
-	checker_timeout = conf->checker_timeout;
-	put_multipath_config(conf);
 	old_prio = pp->priority;
-	pp->priority = prio_getprio(p, pp, checker_timeout);
+	pp->priority = prio_getprio(p, pp, timeout);
 	if (pp->priority < 0) {
 		/* this changes pp->offline, but why not */
 		int state = path_offline(pp);
@@ -2095,11 +2091,13 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
 
 	 /*
 	  * Retrieve path priority, even for PATH_DOWN paths if it has never
-	  * been successfully obtained before.
+	  * been successfully obtained before. If path is down don't try
+	  * for too long.
 	  */
 	if ((mask & DI_PRIO) && path_state == PATH_UP && strlen(pp->wwid)) {
 		if (pp->state != PATH_DOWN || pp->priority == PRIO_UNDEF) {
-			get_prio(pp);
+			get_prio(pp, (pp->state != PATH_DOWN)?
+				     (conf->checker_timeout * 1000) : 10);
 		}
 	}
 
diff --git a/libmultipath/prio.c b/libmultipath/prio.c
index 87de1f97..21c1b092 100644
--- a/libmultipath/prio.c
+++ b/libmultipath/prio.c
@@ -14,7 +14,7 @@ unsigned int get_prio_timeout(unsigned int checker_timeout,
 			      unsigned int default_timeout)
 {
 	if (checker_timeout)
-		return checker_timeout * 1000;
+		return checker_timeout;
 	return default_timeout;
 }
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 12/15] multipathd: add new paths under vecs lock
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (10 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 11/15] libmultipath: change failed path prio timeout Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 17:39   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 13/15] libmultipath: add new checker class functions Benjamin Marzinski
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Right now, multipathd only ever calls pathinfo on a path outside of the
vecs lock in one circumstance, when it's adding a new path from a
uevent. Doing this can cause weirdness or crash multipathd.

First off, the path checker code is written assuming that only one
checker instance will be active at a time.  If this isn't the case, two
processes can modify the checker's refcounts at the same time,
potentially causing a checker to get removed while still in use (if two
threads try to increment the refcount at the same time, causing it to
only increment once).

Another issue is that since the vecs lock isn't grabbed until after all
of the pathinfo is gathered, and the vecs lock is the only thing keeping
the uevent servicing thread from running at the same time as a
reconfigure is happening, it is possible to have multipathd add a path
using the old config after reconfigure has already added that path using
the new config. Leading to two versions of the path on the pathvec.
There are possibly other issues with running pathinfo outside of the
vecs lock that I haven't noticed, as well.

The easiest solution is to call pathinfo on the path while holding the
vecs lock, to keep other threads from modifying the checker structure or
the config while setting up the path. This is what happens every other
time multipathd calls pathinfo, including when a path is added through
the multipathd command interface.

This does mean that any time spent calling pathinfo on new paths will
block other threads that need the vecs lock, but since the checker is
now always called in async mode, and the prio function will use a
short timeout if called on a failed path, this shouldn't be any more
noticeable than the calls to check_path() for already existing paths.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 7b364cfe..9e01cfaa 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -906,9 +906,8 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 			}
 		}
 	}
-	lock_cleanup_pop(vecs->lock);
 	if (pp)
-		return ret;
+		goto out;
 
 	/*
 	 * get path vital state
@@ -920,13 +919,13 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 	pthread_cleanup_pop(1);
 	if (!pp) {
 		if (ret == PATHINFO_SKIPPED)
-			return 0;
-		condlog(3, "%s: failed to get path info", uev->kernel);
-		return 1;
+			ret = 0;
+		else {
+			condlog(3, "%s: failed to get path info", uev->kernel);
+			ret = 1;
+		}
+		goto out;
 	}
-	pthread_cleanup_push(cleanup_lock, &vecs->lock);
-	lock(&vecs->lock);
-	pthread_testcancel();
 	ret = store_path(vecs->pathvec, pp);
 	if (!ret) {
 		conf = get_multipath_config();
@@ -940,6 +939,7 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 		free_path(pp);
 		ret = 1;
 	}
+out:
 	lock_cleanup_pop(vecs->lock);
 	return ret;
 }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 13/15] libmultipath: add new checker class functions
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (11 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 12/15] multipathd: add new paths under vecs lock Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-17 17:41   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 14/15] libmultipath: make directio checker share io contexts Benjamin Marzinski
  2020-01-17  2:18 ` [PATCH 15/15] tests: add directio unit tests Benjamin Marzinski
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This commit adds the optional functions libcheck_load, libcheck_unload,
and libcheck_reset. libcheck_load is called when a checker is first
loaded, libcheck_unload is called when it is unloaded, and
libcheck_reset is called during reconfigure, after all the current
paths have been removed. They are designed to setup, reset, and destroy
global state that all checkers of this class use. They will be used
in future commits.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/checkers.c | 29 ++++++++++++++++++++++++++++-
 libmultipath/checkers.h |  1 +
 multipathd/main.c       |  1 +
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index a08bf418..5c7d3004 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -18,6 +18,9 @@ struct checker_class {
 	int (*init)(struct checker *);       /* to allocate the context */
 	int (*mp_init)(struct checker *);    /* to allocate the mpcontext */
 	void (*free)(struct checker *);      /* to free the context */
+	int (*load)(void);                   /* to allocate global variables */
+	void (*unload)(void);                /* to free global variables */
+	int (*reset)(void);		     /* to reset the global variables */
 	const char **msgtable;
 	short msgtable_size;
 };
@@ -66,6 +69,8 @@ void free_checker_class(struct checker_class *c)
 	}
 	condlog(3, "unloading %s checker", c->name);
 	list_del(&c->node);
+	if (c->unload)
+		c->unload();
 	if (c->handle) {
 		if (dlclose(c->handle) != 0) {
 			condlog(0, "Cannot unload checker %s: %s",
@@ -98,6 +103,18 @@ static struct checker_class *checker_class_lookup(const char *name)
 	return NULL;
 }
 
+int reset_checker_classes(void)
+{
+	int ret = 0;
+	struct checker_class *c;
+
+	list_for_each_entry(c, &checkers, node) {
+		if (c->reset)
+			ret = ret? ret : c->reset();
+	}
+	return ret;
+}
+
 static struct checker_class *add_checker_class(const char *multipath_dir,
 					       const char *name)
 {
@@ -142,7 +159,11 @@ static struct checker_class *add_checker_class(const char *multipath_dir,
 		goto out;
 
 	c->mp_init = (int (*)(struct checker *)) dlsym(c->handle, "libcheck_mp_init");
-	/* NULL mp_init is o.k. call dlerror() to clear out any error string */
+	c->load = (int (*)(void)) dlsym(c->handle, "libcheck_load");
+	c->unload = (void (*)(void)) dlsym(c->handle, "libcheck_unload");
+	c->reset = (int (*)(void)) dlsym(c->handle, "libcheck_reset");
+	/* These 4 functions can be NULL. call dlerror() to clear out any
+	 * error string */
 	dlerror();
 
 	c->free = (void (*)(struct checker *)) dlsym(c->handle, "libcheck_free");
@@ -168,6 +189,12 @@ static struct checker_class *add_checker_class(const char *multipath_dir,
 	condlog(3, "checker %s: message table size = %d",
 		c->name, c->msgtable_size);
 
+	if (c->load && c->load() != 0) {
+		condlog(0, "%s: failed to load checker", c->name);
+		c->unload = NULL;
+		goto out;
+	}
+
 done:
 	c->sync = 1;
 	list_add(&c->node, &checkers);
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index 5237e7ec..d9930467 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -150,6 +150,7 @@ void checker_disable (struct checker *);
 int checker_check (struct checker *, int);
 int checker_is_sync(const struct checker *);
 const char *checker_name (const struct checker *);
+int reset_checker_classes(void);
 /*
  * This returns a string that's best prepended with "$NAME checker",
  * where $NAME is the return value of checker_name().
diff --git a/multipathd/main.c b/multipathd/main.c
index 9e01cfaa..8ee8a77d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2607,6 +2607,7 @@ reconfigure (struct vectors * vecs)
 	vecs->pathvec = NULL;
 	delete_all_foreign();
 
+	reset_checker_classes();
 	/* Re-read any timezone changes */
 	tzset();
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 14/15] libmultipath: make directio checker share io contexts
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (12 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 13/15] libmultipath: add new checker class functions Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-20  8:09   ` Martin Wilck
  2020-01-17  2:18 ` [PATCH 15/15] tests: add directio unit tests Benjamin Marzinski
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

On systems with a large number of cores (>500), io_destroy() can take
tens to hundreds of milliseconds to complete, due to RCU
synchronization. If there are a large number of paths using the directio
checker on such a system, this can lead to multipath taking almost a
minute to complete, with the vast majority of time taken up by
io_destroy().

To solve this, the directio checker now allocates one aio context for
every 1024 checkers. This reduces the io_destroy() delay to a thousandth
of its previous level. However, this means that muliple checkers are
sharing the same aio context, and must be able to handle getting results
for other checkers.  Because only one checker is ever running at a
time, this doesn't require any locking.  However, locking could be added
in the future if necessary, to allow multiple checkers to run at the
same time.

When checkers are freed, they usually no longer destroy the io context.
Instead, they attempt to cancel any outstanding request. If that fails,
they put the request on an orphan list, so that it can be freed by other
checkers, once it has completed. IO contexts are only destroyed at three
times, during reconfigure (to deal with the possibility that multipathd
is holding more aio events than it needs to be, since there is a single
limit for the whole system), when the checker class is unloaded, and
in a corner case when checkers are freed. If an aio_group (which
contains the aio context) is entirely full of orphaned requests, then
no checker can use it. Since no checker is using it, there is no checker
to clear out the orphaned requests. In this (likely rare) case, the
last checker from that group to be freed and leave behind an orphaned
request will call io_destroy() and remove the group.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/checkers/directio.c | 337 +++++++++++++++++++++++++------
 1 file changed, 278 insertions(+), 59 deletions(-)

diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
index 1b00b775..0863db2a 100644
--- a/libmultipath/checkers/directio.c
+++ b/libmultipath/checkers/directio.c
@@ -17,6 +17,33 @@
 
 #include "checkers.h"
 #include "../libmultipath/debug.h"
+#include "../libmultipath/time-util.h"
+
+#define AIO_GROUP_SIZE 1024
+
+/* Note: This checker type relies on the fact that only one checker can be run
+ * at a time, since multiple checkers share the same aio_group, and must be
+ * able to modify other checker's async_reqs. If multple checkers become able
+ * to be run at the same time, this checker will need to add locking, and
+ * probably polling on event fds, to deal with that */
+
+struct aio_group {
+	struct list_head node;
+	int holders;
+	io_context_t ioctx;
+	struct list_head orphans;
+};
+
+struct async_req {
+	struct iocb io;
+	int blksize;
+	unsigned char *	buf;
+	unsigned char * ptr;
+	struct list_head node;
+	int state; /* PATH_REMOVED means this is an orphan */
+};
+
+static LIST_HEAD(aio_grp_list);
 
 enum {
 	MSG_DIRECTIO_UNKNOWN = CHECKER_FIRST_MSGID,
@@ -37,18 +64,134 @@ const char *libcheck_msgtable[] = {
 struct directio_context {
 	int		running;
 	int		reset_flags;
-	int		blksize;
-	unsigned char *	buf;
-	unsigned char * ptr;
-	io_context_t	ioctx;
-	struct iocb	io;
+	struct aio_group *aio_grp;
+	struct async_req *req;
 };
 
+static struct aio_group *
+add_aio_group(void)
+{
+	struct aio_group *aio_grp;
+
+	aio_grp = malloc(sizeof(struct aio_group));
+	if (!aio_grp)
+		return NULL;
+	memset(aio_grp, 0, sizeof(struct aio_group));
+	INIT_LIST_HEAD(&aio_grp->orphans);
+
+	if (io_setup(AIO_GROUP_SIZE, &aio_grp->ioctx) != 0) {
+		LOG(1, "io_setup failed");
+		free(aio_grp);
+		return NULL;
+	}
+	list_add(&aio_grp->node, &aio_grp_list);
+	return aio_grp;
+}
+
+static int
+set_aio_group(struct directio_context *ct)
+{
+	struct aio_group *aio_grp = NULL;
+
+	list_for_each_entry(aio_grp, &aio_grp_list, node)
+		if (aio_grp->holders < AIO_GROUP_SIZE)
+			goto found;
+	aio_grp = add_aio_group();
+	if (!aio_grp) {
+		ct->aio_grp = NULL;
+		return -1;
+	}
+found:
+	aio_grp->holders++;
+	ct->aio_grp = aio_grp;
+	return 0;
+}
+
+static void
+remove_aio_group(struct aio_group *aio_grp)
+{
+	struct async_req *req, *tmp;
+
+	io_destroy(aio_grp->ioctx);
+	list_for_each_entry_safe(req, tmp, &aio_grp->orphans, node) {
+		list_del(&req->node);
+		free(req->buf);
+		free(req);
+	}
+	list_del(&aio_grp->node);
+	free(aio_grp);
+}
+
+/* If an aio_group is completely full of orphans, then no checkers can
+ * use it, which means that no checkers can clear out the orphans. To
+ * avoid keeping the useless group around, simply remove remove the
+ * group */
+static void
+check_orphaned_group(struct aio_group *aio_grp)
+{
+	int count = 0;
+	struct list_head *item;
+
+	if (aio_grp->holders < AIO_GROUP_SIZE)
+		return;
+	list_for_each(item, &aio_grp->orphans)
+		count++;
+	if (count >= AIO_GROUP_SIZE) {
+		remove_aio_group(aio_grp);
+		if (list_empty(&aio_grp_list))
+			add_aio_group();
+	}
+}
+
+int libcheck_load (void)
+{
+	if (add_aio_group() == NULL) {
+		LOG(1, "libcheck_load failed: %s", strerror(errno));
+		return 1;
+	}
+	return 0;
+}
+
+void libcheck_unload (void)
+{
+	struct aio_group *aio_grp, *tmp;
+
+	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node)
+		remove_aio_group(aio_grp);
+}
+
+int libcheck_reset (void)
+{
+	struct aio_group *aio_grp, *tmp, *reset_grp = NULL;
+
+	/* If a clean existing aio_group exists, use that. Otherwise add a
+	 * new one */
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		if (aio_grp->holders == 0 &&
+		    list_empty(&aio_grp->orphans)) {
+			reset_grp = aio_grp;
+			break;
+		}
+	}
+	if (!reset_grp)
+		reset_grp = add_aio_group();
+	if (!reset_grp) {
+		LOG(1, "checker reset failed");
+		return 1;
+	}
+
+	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node) {
+		if (aio_grp != reset_grp)
+			remove_aio_group(aio_grp);
+	}
+	return 0;
+}
 
 int libcheck_init (struct checker * c)
 {
 	unsigned long pgsize = getpagesize();
 	struct directio_context * ct;
+	struct async_req *req = NULL;
 	long flags;
 
 	ct = malloc(sizeof(struct directio_context));
@@ -56,26 +199,31 @@ int libcheck_init (struct checker * c)
 		return 1;
 	memset(ct, 0, sizeof(struct directio_context));
 
-	if (io_setup(1, &ct->ioctx) != 0) {
-		condlog(1, "io_setup failed");
-		free(ct);
-		return 1;
+	if (set_aio_group(ct) < 0)
+		goto out;
+
+	req = malloc(sizeof(struct async_req));
+	if (!req) {
+		goto out;
 	}
+	memset(req, 0, sizeof(struct async_req));
+	INIT_LIST_HEAD(&req->node);
 
-	if (ioctl(c->fd, BLKBSZGET, &ct->blksize) < 0) {
+	if (ioctl(c->fd, BLKBSZGET, &req->blksize) < 0) {
 		c->msgid = MSG_DIRECTIO_BLOCKSIZE;
-		ct->blksize = 512;
+		req->blksize = 512;
 	}
-	if (ct->blksize > 4096) {
+	if (req->blksize > 4096) {
 		/*
 		 * Sanity check for DASD; BSZGET is broken
 		 */
-		ct->blksize = 4096;
+		req->blksize = 4096;
 	}
-	if (!ct->blksize)
+	if (!req->blksize)
 		goto out;
-	ct->buf = (unsigned char *)malloc(ct->blksize + pgsize);
-	if (!ct->buf)
+
+	req->buf = (unsigned char *)malloc(req->blksize + pgsize);
+	if (!req->buf)
 		goto out;
 
 	flags = fcntl(c->fd, F_GETFL);
@@ -88,17 +236,22 @@ int libcheck_init (struct checker * c)
 		ct->reset_flags = 1;
 	}
 
-	ct->ptr = (unsigned char *) (((unsigned long)ct->buf + pgsize - 1) &
+	req->ptr = (unsigned char *) (((unsigned long)req->buf + pgsize - 1) &
 		  (~(pgsize - 1)));
 
 	/* Successfully initialized, return the context. */
+	ct->req = req;
 	c->context = (void *) ct;
 	return 0;
 
 out:
-	if (ct->buf)
-		free(ct->buf);
-	io_destroy(ct->ioctx);
+	if (req) {
+		if (req->buf)
+			free(req->buf);
+		free(req);
+	}
+	if (ct->aio_grp)
+		ct->aio_grp->holders--;
 	free(ct);
 	return 1;
 }
@@ -106,6 +259,7 @@ out:
 void libcheck_free (struct checker * c)
 {
 	struct directio_context * ct = (struct directio_context *)c->context;
+	struct io_event event;
 	long flags;
 
 	if (!ct)
@@ -121,20 +275,73 @@ void libcheck_free (struct checker * c)
 		}
 	}
 
-	if (ct->buf)
-		free(ct->buf);
-	io_destroy(ct->ioctx);
+	if (ct->running &&
+	    (ct->req->state != PATH_PENDING ||
+	     io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event) == 0))
+		ct->running = 0;
+	if (!ct->running) {
+		free(ct->req->buf);
+		free(ct->req);
+		ct->aio_grp->holders--;
+	} else {
+		LOG(3, "io_cancel error while freeing: %i", errno);
+		ct->req->state = PATH_REMOVED;
+		list_add(&ct->req->node, &ct->aio_grp->orphans);
+		check_orphaned_group(ct->aio_grp);
+	}
+
 	free(ct);
+	c->context = NULL;
+}
+
+static int
+get_events(struct aio_group *aio_grp, struct timespec *timeout)
+{
+	struct io_event events[128];
+	int i, nr, got_events = 0;
+	struct timespec zero_timeout = {0};
+	struct timespec *timep = (timeout)? timeout : &zero_timeout;
+
+	do {
+		errno = 0;
+		nr = io_getevents(aio_grp->ioctx, 1, 128, events, timep);
+		got_events |= (nr > 0);
+
+		for (i = 0; i < nr; i++) {
+			struct async_req *req = container_of(events[i].obj, struct async_req, io);
+
+			LOG(3, "io finished %lu/%lu", events[i].res,
+			    events[i].res2);
+
+			/* got an orphaned request */
+			if (req->state == PATH_REMOVED) {
+				list_del(&req->node);
+				free(req->buf);
+				free(req);
+				aio_grp->holders--;
+			} else
+				req->state = (events[i].res == req->blksize) ?
+					      PATH_UP : PATH_DOWN;
+		}
+		timep = &zero_timeout;
+	} while (nr == 128); /* assume there are more events and try again */
+
+	if (nr < 0)
+		LOG(3, "async io getevents returned %i (errno=%s)",
+		    nr, strerror(errno));
+
+	return got_events;
 }
 
 static int
 check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
 {
 	struct timespec	timeout = { .tv_nsec = 5 };
-	struct io_event event;
 	struct stat	sb;
-	int		rc = PATH_UNCHECKED;
+	int		rc;
 	long		r;
+	struct timespec currtime, endtime;
+	struct timespec *timep = &timeout;
 
 	if (fstat(fd, &sb) == 0) {
 		LOG(4, "called for %x", (unsigned) sb.st_rdev);
@@ -145,50 +352,62 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
 		timeout.tv_nsec = 0;
 	}
 
-	if (!ct->running) {
-		struct iocb *ios[1] = { &ct->io };
+	if (ct->running) {
+		if (ct->req->state != PATH_PENDING) {
+			ct->running = 0;
+			return ct->req->state;
+		}
+	} else {
+		struct iocb *ios[1] = { &ct->req->io };
 
 		LOG(3, "starting new request");
-		memset(&ct->io, 0, sizeof(struct iocb));
-		io_prep_pread(&ct->io, fd, ct->ptr, ct->blksize, 0);
-		if (io_submit(ct->ioctx, 1, ios) != 1) {
+		memset(&ct->req->io, 0, sizeof(struct iocb));
+		io_prep_pread(&ct->req->io, fd, ct->req->ptr,
+			      ct->req->blksize, 0);
+		ct->req->state = PATH_PENDING;
+		if (io_submit(ct->aio_grp->ioctx, 1, ios) != 1) {
 			LOG(3, "io_submit error %i", errno);
 			return PATH_UNCHECKED;
 		}
 	}
 	ct->running++;
 
-	errno = 0;
-	r = io_getevents(ct->ioctx, 1L, 1L, &event, &timeout);
+	get_monotonic_time(&endtime);
+	endtime.tv_sec += timeout.tv_sec;
+	endtime.tv_nsec += timeout.tv_nsec;
+	normalize_timespec(&endtime);
+	while(1) {
+		r = get_events(ct->aio_grp, timep);
 
-	if (r < 0 ) {
-		LOG(3, "async io getevents returned %li (errno=%s)", r,
-		    strerror(errno));
-		ct->running = 0;
-		rc = PATH_UNCHECKED;
-	} else if (r < 1L) {
-		if (ct->running > timeout_secs || sync) {
-			struct iocb *ios[1] = { &ct->io };
-
-			LOG(3, "abort check on timeout");
-			r = io_cancel(ct->ioctx, ios[0], &event);
-			/*
-			 * Only reset ct->running if we really
-			 * could abort the pending I/O
-			 */
-			if (r)
-				LOG(3, "io_cancel error %i", errno);
-			else
-				ct->running = 0;
-			rc = PATH_DOWN;
-		} else {
-			LOG(3, "async io pending");
-			rc = PATH_PENDING;
-		}
+		if (ct->req->state != PATH_PENDING) {
+			ct->running = 0;
+			return ct->req->state;
+		} else if (r == 0 || !timep)
+			break;
+
+		get_monotonic_time(&currtime);
+		timespecsub(&endtime, &currtime, &timeout);
+		if (timeout.tv_sec < 0)
+			timep = NULL;
+	}
+	if (ct->running > timeout_secs || sync) {
+		struct io_event event;
+
+		LOG(3, "abort check on timeout");
+
+		r = io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
+		/*
+		 * Only reset ct->running if we really
+		 * could abort the pending I/O
+		 */
+		if (r)
+			LOG(3, "io_cancel error %i", errno);
+		else
+			ct->running = 0;
+		rc = PATH_DOWN;
 	} else {
-		LOG(3, "io finished %lu/%lu", event.res, event.res2);
-		ct->running = 0;
-		rc = (event.res == ct->blksize) ? PATH_UP : PATH_DOWN;
+		LOG(3, "async io pending");
+		rc = PATH_PENDING;
 	}
 
 	return rc;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 15/15] tests: add directio unit tests
  2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
                   ` (13 preceding siblings ...)
  2020-01-17  2:18 ` [PATCH 14/15] libmultipath: make directio checker share io contexts Benjamin Marzinski
@ 2020-01-17  2:18 ` Benjamin Marzinski
  2020-01-20  8:27   ` Martin Wilck
  14 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-17  2:18 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 tests/Makefile   |   3 +-
 tests/directio.c | 704 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 706 insertions(+), 1 deletion(-)
 create mode 100644 tests/directio.c

diff --git a/tests/Makefile b/tests/Makefile
index a5cdf390..275fdd7d 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -3,7 +3,8 @@ include ../Makefile.inc
 CFLAGS += $(BIN_CFLAGS) -I$(multipathdir) -I$(mpathcmddir)
 LIBDEPS += -L$(multipathdir) -lmultipath -lcmocka
 
-TESTS := uevent parser util dmevents hwtable blacklist unaligned vpd pgpolicy
+TESTS := uevent parser util dmevents hwtable blacklist unaligned vpd pgpolicy \
+	 directio
 
 .SILENT: $(TESTS:%=%.o)
 .PRECIOUS: $(TESTS:%=%-test)
diff --git a/tests/directio.c b/tests/directio.c
new file mode 100644
index 00000000..c64d21d2
--- /dev/null
+++ b/tests/directio.c
@@ -0,0 +1,704 @@
+/*
+ * Copyright (c) 2018 Benjamin Marzinski, Redhat
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <https://www.gnu.org/licenses/>.
+ *
+ */
+
+#define _GNU_SOURCE
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdarg.h>
+#include <stddef.h>
+#include <setjmp.h>
+#include <stdlib.h>
+#define UNIT_TESTING /* enable memory testing in directio.c */
+#include <cmocka.h>
+#include "globals.c"
+#include "../libmultipath/checkers/directio.c"
+
+int test_fd = 111;
+int ioctx_count = 0;
+struct io_event mock_events[AIO_GROUP_SIZE]; /* same as the checker max */
+int ev_off = 0;
+struct timespec zero_timeout = {0};
+struct timespec full_timeout = { .tv_sec = -1 };
+
+int __wrap_ioctl(int fd, unsigned long request, void *argp)
+{
+	int *blocksize = (int *)argp;
+
+	assert_int_equal(fd, test_fd);
+	assert_int_equal(request, BLKBSZGET);
+	assert_non_null(blocksize);
+	*blocksize = mock_type(int);
+	return 0;
+}
+
+int __wrap_fcntl(int fd, int cmd, long arg)
+{
+	assert_int_equal(fd, test_fd);
+	assert_int_equal(cmd, F_GETFL);
+	return O_DIRECT;
+}
+
+int __wrap___fxstat(int ver, int fd, struct stat *statbuf)
+{
+	assert_int_equal(fd, test_fd);
+	assert_non_null(statbuf);
+	memset(statbuf, 0, sizeof(struct stat));
+	return 0;
+}
+
+int __wrap_io_setup(int maxevents, io_context_t *ctxp)
+{
+	ioctx_count++;
+	return mock_type(int);
+}
+
+int __wrap_io_destroy(io_context_t ctx)
+{
+	ioctx_count--;
+	return mock_type(int);
+}
+
+int __wrap_io_submit(io_context_t ctx, long nr, struct iocb *ios[])
+{
+	return mock_type(int);
+}
+
+int __wrap_io_cancel(io_context_t ctx, struct iocb *iocb, struct io_event *evt)
+{
+	return mock_type(int);
+}
+
+int __wrap_io_getevents(io_context_t ctx, long min_nr, long nr,
+			struct io_event *events, struct timespec *timeout)
+{
+	int i, nr_evs;
+	struct timespec *sleep_tmo;
+	struct io_event *evs;
+
+	assert_non_null(timeout);
+	nr_evs = mock_type(int);
+	assert_true(nr_evs <= nr);
+	if (!nr_evs)
+		return 0;
+	sleep_tmo = mock_ptr_type(struct timespec *);
+	if (sleep_tmo) {
+		if (sleep_tmo->tv_sec < 0)
+			nanosleep(timeout, NULL);
+		else
+			nanosleep(sleep_tmo, NULL);
+	}
+	if (nr_evs < 0) {
+		errno = -nr_evs;
+		return -1;
+	}
+	evs = mock_ptr_type(struct io_event *);
+	for (i = 0; i < nr_evs; i++)
+		events[i] = evs[i];
+	ev_off -= nr_evs;
+	return nr_evs;
+}
+
+static void return_io_getevents_none(void)
+{
+	will_return(__wrap_io_getevents, 0);
+}
+
+static void return_io_getevents_nr(struct timespec *ts, int nr,
+				   struct async_req **reqs, int *res)
+{
+	int i, off = 0;
+
+	for(i = 0; i < nr; i++) {
+		mock_events[i + ev_off].obj = &reqs[i]->io;
+		if (res[i] == 0)
+			mock_events[i + ev_off].res = reqs[i]->blksize;
+	}
+	while (nr > 0) {
+		will_return(__wrap_io_getevents, (nr > 128)? 128 : nr);
+		will_return(__wrap_io_getevents, ts);
+		will_return(__wrap_io_getevents, &mock_events[off + ev_off]);
+		ts = NULL;
+		off += 128;
+		nr -= 128;
+	}
+	if (nr == 0)
+		will_return(__wrap_io_getevents, 0);
+	ev_off += i;
+}
+
+void do_check_state(struct checker *c, int sync, int timeout, int chk_state)
+{
+	struct directio_context * ct = (struct directio_context *)c->context;
+
+	if (!ct->running)
+		will_return(__wrap_io_submit, 1);
+	assert_int_equal(check_state(test_fd, ct, sync, timeout), chk_state);
+	assert_int_equal(ev_off, 0);
+	memset(mock_events, 0, sizeof(mock_events));
+}
+
+void do_libcheck_unload(int nr_aio_grps)
+{
+	int count = 0;
+	struct aio_group *aio_grp;
+
+	list_for_each_entry(aio_grp, &aio_grp_list, node)
+		count++;
+	assert_int_equal(count, nr_aio_grps);
+	for (count = 0; count < nr_aio_grps; count++)
+		will_return(__wrap_io_destroy, 0);
+	libcheck_unload();
+	assert_true(list_empty(&aio_grp_list));
+	assert_int_equal(ioctx_count, 0);
+}
+
+static void do_libcheck_init(struct checker *c, int blocksize,
+			     struct async_req **req)
+{
+	struct directio_context * ct;
+
+	c->fd = test_fd;
+	will_return(__wrap_ioctl, blocksize);
+	assert_int_equal(libcheck_init(c), 0);
+	ct = (struct directio_context *)c->context;
+	assert_non_null(ct);
+	assert_non_null(ct->aio_grp);
+	assert_non_null(ct->req);
+	if (req)
+		*req = ct->req;
+	assert_int_equal(ct->req->blksize, blocksize);
+}
+
+static int is_checker_running(struct checker *c)
+{
+	struct directio_context * ct = (struct directio_context *)c->context;
+	return ct->running;
+}
+
+static struct aio_group *get_aio_grp(struct checker *c)
+{
+	struct directio_context * ct = (struct directio_context *)c->context;
+
+	assert_non_null(ct);
+	return ct->aio_grp;
+}
+
+static void check_aio_grp(struct aio_group *aio_grp, int holders,
+			  int orphans)
+{
+	int count = 0;
+	struct list_head *item;
+
+	list_for_each(item, &aio_grp->orphans)
+		count++;
+	assert_int_equal(holders, aio_grp->holders);
+	assert_int_equal(orphans, count);
+}
+
+static void do_libcheck_load(void)
+{
+	int count = 0;
+	struct aio_group *aio_grp;
+
+	assert_true(list_empty(&aio_grp_list));
+	will_return(__wrap_io_setup, 0);
+	assert_int_equal(libcheck_load(), 0);
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		assert_int_equal(aio_grp->holders, 0);
+		count++;
+	}
+	assert_int_equal(count, 1);
+}
+
+/* simple loading resetting and unloading test */
+static void test_load_reset_unload(void **state)
+{
+	struct list_head *item;
+	do_libcheck_load();
+	item = aio_grp_list.next;
+	assert_int_equal(libcheck_reset(), 0);
+	assert_false(list_empty(&aio_grp_list));
+	assert_true(item == aio_grp_list.next);
+	do_libcheck_unload(1);
+}
+
+/* test initializing and then freeing 4096 checkers */
+static void test_init_free(void **state)
+{
+	int i, count = 0;
+	struct checker c[4096] = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	aio_grp = list_entry(aio_grp_list.next, typeof(*aio_grp), node);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	for (i = 0; i < 4096; i++) {
+		struct directio_context * ct;
+
+		if (i % 3 == 0)
+			do_libcheck_init(&c[i], 512, NULL);
+		else if (i % 3 == 1)
+			do_libcheck_init(&c[i], 1024, NULL);
+		else
+			do_libcheck_init(&c[i], 4096, NULL);
+		ct = (struct directio_context *)c[i].context;
+		assert_non_null(ct->aio_grp);
+		if (i && (i & 1023) == 0)
+			aio_grp = ct->aio_grp;
+		else {
+			assert_ptr_equal(ct->aio_grp, aio_grp);
+			assert_int_equal(aio_grp->holders, (i & 1023) + 1);
+		}
+	}
+	count = 0;
+	list_for_each_entry(aio_grp, &aio_grp_list, node)
+		count++;
+	assert_int_equal(count, 4);
+	for (i = 0; i < 4096; i++) {
+		struct directio_context * ct = (struct directio_context *)c[i].context;
+
+		aio_grp = ct->aio_grp;
+		libcheck_free(&c[i]);
+		assert_int_equal(aio_grp->holders, 1023 - (i & 1023));
+	}
+	count = 0;
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		assert_int_equal(aio_grp->holders, 0);
+		count++;
+	}
+	assert_int_equal(count, 4);
+	will_return(__wrap_io_destroy, 0);
+	will_return(__wrap_io_destroy, 0);
+	will_return(__wrap_io_destroy, 0);
+	assert_int_equal(libcheck_reset(), 0);
+	do_libcheck_unload(1);
+}
+
+/* check mixed initializing and freeing 4096 checkers */
+static void test_multi_init_free(void **state)
+{
+	int i, count;
+	struct checker c[4096] = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	for (count = 0, i = 0; i < 4096; count++) {
+		/* usually init, but occasionally free checkers */
+		if (count == 0 || (count % 5 != 0 && count % 7 != 0)) {
+			do_libcheck_init(&c[i], 4096, NULL);
+			i++;
+		} else {
+			i--;
+			libcheck_free(&c[i]);
+		}
+	}
+	count = 0;
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		assert_int_equal(aio_grp->holders, 1024);
+		count++;
+	}
+	assert_int_equal(count, 4);
+	for (count = 0, i = 4096; i > 0; count++) {
+		/* usually free, but occasionally init checkers */
+		if (count == 0 || (count % 5 != 0 && count % 7 != 0)) {
+			i--;
+			libcheck_free(&c[i]);
+		} else {
+			do_libcheck_init(&c[i], 4096, NULL);
+			i++;
+		}
+	}
+	do_libcheck_unload(4);
+}
+
+/* simple single checker sync test */
+static void test_check_state_simple(void **state)
+{
+	struct checker c = {0};
+	struct async_req *req;
+	int res = 0;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, &req);
+	return_io_getevents_nr(NULL, 1, &req, &res);
+	do_check_state(&c, 1, 30, PATH_UP);
+	libcheck_free(&c);
+	do_libcheck_unload(1);
+}
+
+/* test sync timeout */
+static void test_check_state_timeout(void **state)
+{
+	struct checker c = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, 0);
+	do_check_state(&c, 1, 30, PATH_DOWN);
+	check_aio_grp(aio_grp, 1, 0);
+	libcheck_free(&c);
+	do_libcheck_unload(1);
+}
+
+/* test async timeout */
+static void test_check_state_async_timeout(void **state)
+{
+	struct checker c = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 3, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 3, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 3, PATH_PENDING);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, 0);
+	do_check_state(&c, 0, 3, PATH_DOWN);
+	check_aio_grp(aio_grp, 1, 0);
+	libcheck_free(&c);
+	do_libcheck_unload(1);
+}
+
+/* test freeing checkers with outstanding requests */
+static void test_free_with_pending(void **state)
+{
+        struct checker c[2] = {0};
+        struct aio_group *aio_grp;
+	struct async_req *req;
+	int res = 0;
+
+        do_libcheck_load();
+        do_libcheck_init(&c[0], 4096, &req);
+	do_libcheck_init(&c[1], 4096, NULL);
+        aio_grp = get_aio_grp(c);
+        return_io_getevents_none();
+        do_check_state(&c[0], 0, 30, PATH_PENDING);
+	return_io_getevents_nr(NULL, 1, &req, &res);
+	return_io_getevents_none();
+	do_check_state(&c[1], 0, 30, PATH_PENDING);
+	assert_true(is_checker_running(&c[0]));
+	assert_true(is_checker_running(&c[1]));
+	check_aio_grp(aio_grp, 2, 0);
+        libcheck_free(&c[0]);
+	check_aio_grp(aio_grp, 1, 0);
+        will_return(__wrap_io_cancel, 0);
+        libcheck_free(&c[1]);
+        check_aio_grp(aio_grp, 0, 0);
+        do_libcheck_unload(1);
+}
+
+/* test removing orpahed aio_group on free */
+static void test_orphaned_aio_group(void **state)
+{
+	struct checker c[AIO_GROUP_SIZE] = {0};
+	struct aio_group *aio_grp, *tmp_grp;
+	int i;
+
+        do_libcheck_load();
+	for (i = 0; i < AIO_GROUP_SIZE; i++) {
+		do_libcheck_init(&c[i], 4096, NULL);
+		return_io_getevents_none();
+		do_check_state(&c[i], 0, 30, PATH_PENDING);
+	}
+	aio_grp = get_aio_grp(c);
+	check_aio_grp(aio_grp, AIO_GROUP_SIZE, 0);
+	i = 0;
+	list_for_each_entry(tmp_grp, &aio_grp_list, node)
+		i++;
+	assert_int_equal(i, 1);
+	for (i = 0; i < AIO_GROUP_SIZE; i++) {
+		assert_true(is_checker_running(&c[i]));
+		will_return(__wrap_io_cancel, -1);
+		if (i == AIO_GROUP_SIZE - 1) {
+			/* remove the orphaned group and create a new one */
+			will_return(__wrap_io_destroy, 0);
+			will_return(__wrap_io_setup, 0);
+		}
+		libcheck_free(&c[i]);
+	}
+	i = 0;
+	list_for_each_entry(tmp_grp, &aio_grp_list, node) {
+		i++;
+		check_aio_grp(aio_grp, 0, 0);
+	}
+	assert_int_equal(i, 1);
+        do_libcheck_unload(1);
+}
+
+/* test sync timeout with failed cancel and cleanup by another
+ * checker */
+static void test_timeout_cancel_failed(void **state)
+{
+	struct checker c[2] = {0};
+	struct aio_group *aio_grp;
+	struct async_req *reqs[2];
+	int res[] = {0,0};
+	int i;
+
+	do_libcheck_load();
+	for (i = 0; i < 2; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	aio_grp = get_aio_grp(c);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 1, 30, PATH_DOWN);
+	assert_true(is_checker_running(&c[0]));
+	check_aio_grp(aio_grp, 2, 0);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 1, 30, PATH_DOWN);
+	assert_true(is_checker_running(&c[0]));
+	return_io_getevents_nr(NULL, 2, reqs, res);
+	do_check_state(&c[1], 1, 30, PATH_UP);
+	do_check_state(&c[0], 1, 30, PATH_UP);
+	for (i = 0; i < 2; i++) {
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+/* test async timeout with failed cancel and cleanup by another
+ * checker */
+static void test_async_timeout_cancel_failed(void **state)
+{
+	struct checker c[2] = {0};
+	struct async_req *reqs[2];
+	int res[] = {0,0};
+	int i;
+
+	do_libcheck_load();
+	for (i = 0; i < 2; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	return_io_getevents_none();
+	do_check_state(&c[0], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c[1], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c[0], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c[1], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 0, 2, PATH_DOWN);
+	return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
+	do_check_state(&c[1], 0, 2, PATH_UP);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 0, 2, PATH_DOWN);
+	assert_true(is_checker_running(&c[0]));
+	return_io_getevents_nr(NULL, 2, reqs, res);
+	do_check_state(&c[1], 0, 2, PATH_UP);
+	do_check_state(&c[0], 0, 2, PATH_UP);
+	for (i = 0; i < 2; i++) {
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+/* test orphaning a request, and having another checker clean it up */
+static void test_orphan_checker_cleanup(void **state)
+{
+	struct checker c[2] = {0};
+	struct async_req *reqs[2];
+	int res[] = {0,0};
+	struct aio_group *aio_grp;
+	int i;
+
+	do_libcheck_load();
+	for (i = 0; i < 2; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	aio_grp = get_aio_grp(c);
+	return_io_getevents_none();
+	do_check_state(&c[0], 0, 30, PATH_PENDING);
+	will_return(__wrap_io_cancel, -1);
+	check_aio_grp(aio_grp, 2, 0);
+	libcheck_free(&c[0]);
+	check_aio_grp(aio_grp, 2, 1);
+	return_io_getevents_nr(NULL, 2, reqs, res);
+	do_check_state(&c[1], 0, 2, PATH_UP);
+	check_aio_grp(aio_grp, 1, 0);
+	libcheck_free(&c[1]);
+	check_aio_grp(aio_grp, 0, 0);
+	do_libcheck_unload(1);
+}
+
+/* test orphaning a request, and having reset clean it up */
+static void test_orphan_reset_cleanup(void **state)
+{
+	struct checker c;
+	struct aio_group *orphan_aio_grp, *tmp_aio_grp;
+	int found, count;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	orphan_aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 30, PATH_PENDING);
+	will_return(__wrap_io_cancel, -1);
+	check_aio_grp(orphan_aio_grp, 1, 0);
+	libcheck_free(&c);
+	check_aio_grp(orphan_aio_grp, 1, 1);
+	found = count = 0;
+	list_for_each_entry(tmp_aio_grp, &aio_grp_list, node) {
+		count++;
+		if (tmp_aio_grp == orphan_aio_grp)
+			found = 1;
+	}
+	assert_int_equal(count, 1);
+	assert_int_equal(found, 1);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_destroy, 0);
+	assert_int_equal(libcheck_reset(), 0);
+	found = count = 0;
+	list_for_each_entry(tmp_aio_grp, &aio_grp_list, node) {
+		count++;
+		check_aio_grp(tmp_aio_grp, 0, 0);
+		if (tmp_aio_grp == orphan_aio_grp)
+			found = 1;
+	}
+	assert_int_equal(count, 1);
+	assert_int_equal(found, 0);
+	do_libcheck_unload(1);
+}
+
+/* test orphaning a request, and having unload clean it up */
+static void test_orphan_unload_cleanup(void **state)
+{
+	struct checker c;
+	struct aio_group *orphan_aio_grp, *tmp_aio_grp;
+	int found, count;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	orphan_aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 30, PATH_PENDING);
+	will_return(__wrap_io_cancel, -1);
+	check_aio_grp(orphan_aio_grp, 1, 0);
+	libcheck_free(&c);
+	check_aio_grp(orphan_aio_grp, 1, 1);
+	found = count = 0;
+	list_for_each_entry(tmp_aio_grp, &aio_grp_list, node) {
+		count++;
+		if (tmp_aio_grp == orphan_aio_grp)
+			found = 1;
+	}
+	assert_int_equal(count, 1);
+	assert_int_equal(found, 1);
+	do_libcheck_unload(1);
+}
+
+/* test checkers with different blocksizes */
+static void test_check_state_blksize(void **state)
+{
+	int i;
+	struct checker c[3] = {0};
+	int blksize[] = {4096, 1024, 512};
+	struct async_req *reqs[3];
+	int res[] = {0,1,0};
+	int chk_state[] = {PATH_UP, PATH_DOWN, PATH_UP};
+
+	do_libcheck_load();
+	for (i = 0; i < 3; i++)
+		do_libcheck_init(&c[i], blksize[i], &reqs[i]);
+	for (i = 0; i < 3; i++) {
+		return_io_getevents_nr(NULL, 1, &reqs[i], &res[i]);
+		do_check_state(&c[i], 1, 30, chk_state[i]);
+	}
+	for (i = 0; i < 3; i++) {
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+/* test async checkers pending and getting resovled by another checker
+ * as well as the loops for getting multiple events */
+static void test_check_state_async(void **state)
+{
+	int i;
+	struct checker c[257] = {0};
+	struct async_req *reqs[257];
+	int res[257] = {0};
+
+	do_libcheck_load();
+	for (i = 0; i < 257; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	for (i = 0; i < 256; i++) {
+		return_io_getevents_none();
+		do_check_state(&c[i], 0, 30, PATH_PENDING);
+		assert_true(is_checker_running(&c[i]));
+	}
+	return_io_getevents_nr(&full_timeout, 256, reqs, res);
+	return_io_getevents_nr(NULL, 1, &reqs[256], &res[256]);
+	do_check_state(&c[256], 0, 30, PATH_UP);
+	assert_false(is_checker_running(&c[256]));
+	libcheck_free(&c[i]);
+	for (i = 0; i < 256; i++) {
+		do_check_state(&c[i], 0, 30, PATH_UP);
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+int test_directio(void)
+{
+	const struct CMUnitTest tests[] = {
+		cmocka_unit_test(test_load_reset_unload),
+		cmocka_unit_test(test_init_free),
+		cmocka_unit_test(test_multi_init_free),
+		cmocka_unit_test(test_check_state_simple),
+		cmocka_unit_test(test_check_state_timeout),
+		cmocka_unit_test(test_check_state_async_timeout),
+		cmocka_unit_test(test_free_with_pending),
+		cmocka_unit_test(test_timeout_cancel_failed),
+		cmocka_unit_test(test_async_timeout_cancel_failed),
+		cmocka_unit_test(test_orphan_checker_cleanup),
+		cmocka_unit_test(test_orphan_reset_cleanup),
+		cmocka_unit_test(test_orphan_unload_cleanup),
+		cmocka_unit_test(test_check_state_blksize),
+		cmocka_unit_test(test_check_state_async),
+		cmocka_unit_test(test_orphaned_aio_group),
+	};
+
+	return cmocka_run_group_tests(tests, NULL, NULL);
+}
+
+int main(void)
+{
+	int ret = 0;
+
+	conf.verbosity = 2;
+	ret += test_directio();
+	return ret;
+}
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/15] multipathd: warn when configuration has been changed.
  2020-01-17  2:18 ` [PATCH 01/15] multipathd: warn when configuration has been changed Benjamin Marzinski
@ 2020-01-17 15:45   ` Martin Wilck
  2020-01-22 11:59     ` Benjamin Marzinski
  2020-01-17 15:51   ` Martin Wilck
  1 sibling, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 15:45 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> It would be helpful if multipathd could log a message when
> multipath.conf or files in the config_dir have been written to, both
> so
> that it can be used to send a notification to users, and to help with
> determining after the fact if multipathd was running with an older
> config, when the logs of multipathd's behaviour don't match with the
> current multipath.conf.
> 
> To do this, the multipathd uxlsnr thread now sets up inotify watches
> on
> both /etc/multipath.conf and the config_dir to watch if the files are
> deleted or closed after being opened for writing.  In order to keep
> uxlsnr from polling repeatedly if the multipath.conf or the
> config_dir
> aren't present, it will only set up the watches once per reconfigure.
> However, since multipath.conf is far more likely to be replaced by a
> text editor than modified in place, if it gets removed, multipathd
> will
> immediately try to restart the watch on it (which will succeed if the
> file was simply replaced by a new copy).  This does mean that if
> multipath.conf or the config_dir are actually removed and then later
> re-added, multipathd won't log any more messages for changes until
> the
> next reconfigure. But that seems like a fair trade-off to avoid
> repeatedly polling for files that aren't likely to appear.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.h |   1 +
>  multipathd/main.c     |   1 +
>  multipathd/uxlsnr.c   | 134
> ++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 130 insertions(+), 6 deletions(-)

I know I reviewed this already, but this time I have some more remarks,
mostly style.

> 
> diff --git a/libmultipath/config.h b/libmultipath/config.h
> index ffec3103..e69aa07c 100644
> --- a/libmultipath/config.h
> +++ b/libmultipath/config.h
> @@ -188,6 +188,7 @@ struct config {
>  	int find_multipaths_timeout;
>  	int marginal_pathgroups;
>  	unsigned int version[3];
> +	unsigned int sequence_nr;
>  
>  	char * multipath_dir;
>  	char * selector;
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 34a57689..7b364cfe 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2618,6 +2618,7 @@ reconfigure (struct vectors * vecs)
>  	uxsock_timeout = conf->uxsock_timeout;
>  
>  	old = rcu_dereference(multipath_conf);
> +	conf->sequence_nr = old->sequence_nr + 1;
>  	rcu_assign_pointer(multipath_conf, conf);
>  	call_rcu(&old->rcu, rcu_free_config);
>  
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> index bc71679e..92d9a79a 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -23,6 +23,7 @@
>  #include <sys/time.h>
>  #include <signal.h>
>  #include <stdbool.h>
> +#include <sys/inotify.h>
>  #include "checkers.h"
>  #include "memory.h"
>  #include "debug.h"
> @@ -51,6 +52,8 @@ struct client {
>  LIST_HEAD(clients);
>  pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
>  struct pollfd *polls;
> +int notify_fd = -1;
> +char *config_dir;

Nit: please declare these as static, as they are used only in this
file. Also, naming the variable differently, e.g. conf_dir, would
decrease the number of false hits when grepping for the variable name.

>  
>  static bool _socket_client_is_root(int fd);
>  
> @@ -151,6 +154,8 @@ void uxsock_cleanup(void *arg)
>  	long ux_sock = (long)arg;
>  
>  	close(ux_sock);
> +	close(notify_fd);
> +	free(config_dir);
>  
>  	pthread_mutex_lock(&client_lock);
>  	list_for_each_entry_safe(client_loop, client_tmp, &clients,
> node) {
> @@ -162,6 +167,106 @@ void uxsock_cleanup(void *arg)
>  	free_polls();
>  }
>  
> +/* failing to set the watch descriptor is o.k. we just miss a
> warning
> + * message */
> +void reset_watch(int notify_fd, int *wds, unsigned int *sequence_nr)

This function could also be static.

> +{
> +	struct config *conf;
> +	int dir_reset = 0;
> +	int conf_reset = 0;
> +
> +	if (notify_fd == -1)
> +		return;
> +
> +	conf = get_multipath_config();
> +	/* instead of repeatedly try to reset the inotify watch if
> +	 * the config directory or multipath.conf isn't there, just
> +	 * do it once per reconfigure */
> +	if (*sequence_nr != conf->sequence_nr) {
> +		*sequence_nr = conf->sequence_nr;
> +		if (wds[0] == -1)
> +			conf_reset = 1;
> +		if (!config_dir || !conf->config_dir ||
> +		    strcmp(config_dir, conf->config_dir)) {
> +			dir_reset = 1;
> +			if (config_dir)
> +				free(config_dir);
> +			if (conf->config_dir)
> +				config_dir = strdup(conf->config_dir);
> +			else
> +				config_dir = NULL;
> +		} else if (wds[1] == -1)
> +			dir_reset = 1;
> +	}
> +	put_multipath_config(conf);
> +
> +	if (dir_reset) {
> +		if (wds[1] != -1) {
> +			inotify_rm_watch(notify_fd, wds[1]);
> +			wds[1] = -1;
> +		}
> +		if (config_dir) {
> +			wds[1] = inotify_add_watch(notify_fd,
> config_dir,
> +						   IN_CLOSE_WRITE |
> IN_DELETE |
> +						   IN_ONLYDIR);
> +			if (wds[1] == -1)
> +				condlog(3, "didn't set up notifications
> on %s: %s", config_dir, strerror(errno));

Another nitpick: IMO we should be using "%m" for this in new code.

> +		}
> +	}
> +	if (conf_reset) {
> +		wds[0] = inotify_add_watch(notify_fd,
> DEFAULT_CONFIGFILE,
> +					   IN_CLOSE_WRITE);
> +		if (wds[0] == -1)
> +			condlog(3, "didn't set up notifications on
> /etc/multipath.conf: %s", strerror(errno));
> +	}
> +	return;
> +}
> +
> +void handle_inotify(int fd, int  *wds)

Again, static.

> +{
> +	char buff[1024]
> +		__attribute__ ((aligned(__alignof__(struct
> inotify_event))));
> +	const struct inotify_event *event;
> +	ssize_t len;
> +	char *ptr;
> +	int i, got_notify = 0;
> +
> +	for (;;) {
> +		len = read(fd, buff, sizeof(buff));
> +		if (len <= 0) {
> +			if (len < 0 && errno != EAGAIN) {
> +				condlog(3, "error reading from
> inotify_fd");
> +				for (i = 0; i < 2; i++) {
> +					if (wds[i] != -1) {
> +						inotify_rm_watch(fd,
> wds[i]);
> +						wds[i] = -1;
> +					}
> +				}
> +			}
> +			break;
> +		}
> +
> +		got_notify = 1;
> +		for (ptr = buff; ptr < buff + len;
> +		     ptr += sizeof(struct inotify_event) + event->len)
> {
> +			event = (const struct inotify_event *) ptr;
> +
> +			if (event->mask & IN_IGNORED) {
> +				/* multipathd.conf may have been
> overwritten.
> +				 * Try once to reset the notification
> */
> +				if (wds[0] == event->wd)
> +					wds[0] =
> inotify_add_watch(notify_fd,
> +							DEFAULT_CONFIGF
> ILE,
> +							IN_CLOSE_WRITE)
> ;
> +				else if (wds[1] == event->wd)
> +					wds[1] = -1;
> +			}
> +		}
> +	}
> +	if (got_notify)
> +		condlog(1, "Multipath configuration updated.\nReload
> multipathd for changes to take effect");
> +}
> +
>  /*
>   * entry point
>   */
> @@ -173,13 +278,19 @@ void * uxsock_listen(uxsock_trigger_fn
> uxsock_trigger, long ux_sock,
>  	char *reply;
>  	sigset_t mask;
>  	int old_clients = MIN_POLLS;
> +	/* conf->sequence_nr will be 1 when uxsock_listen is first
> called */
> +	unsigned int sequence_nr = 0;
> +	int wds[2] = { -1, -1 };

Style issue: The code might be better readable if we used a struct for
this,

struct watch_descriptors {
	int conf_wd;
	int dir_wd;
};	

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 02/15] libmultipath: fix leak in foreign code
  2020-01-17  2:18 ` [PATCH 02/15] libmultipath: fix leak in foreign code Benjamin Marzinski
@ 2020-01-17 15:50   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 15:50 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> If scandir fails or finds no foreign libraries, enable_re needs to be
> freed before exitting.
> 
> Fixes: 8d03eda4 'multipath.conf: add "enable_foreign" parameter'
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/foreign.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/15] multipathd: warn when configuration has been changed.
  2020-01-17  2:18 ` [PATCH 01/15] multipathd: warn when configuration has been changed Benjamin Marzinski
  2020-01-17 15:45   ` Martin Wilck
@ 2020-01-17 15:51   ` Martin Wilck
  1 sibling, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 15:51 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> It would be helpful if multipathd could log a message when
> multipath.conf or files in the config_dir have been written to, both
> so
> that it can be used to send a notification to users, and to help with
> determining after the fact if multipathd was running with an older
> config, when the logs of multipathd's behaviour don't match with the
> current multipath.conf.
> 
> To do this, the multipathd uxlsnr thread now sets up inotify watches
> on
> both /etc/multipath.conf and the config_dir to watch if the files are
> deleted or closed after being opened for writing.  In order to keep
> uxlsnr from polling repeatedly if the multipath.conf or the
> config_dir
> aren't present, it will only set up the watches once per reconfigure.
> However, since multipath.conf is far more likely to be replaced by a
> text editor than modified in place, if it gets removed, multipathd
> will
> immediately try to restart the watch on it (which will succeed if the
> file was simply replaced by a new copy).  This does mean that if
> multipath.conf or the config_dir are actually removed and then later
> re-added, multipathd won't log any more messages for changes until
> the
> next reconfigure. But that seems like a fair trade-off to avoid
> repeatedly polling for files that aren't likely to appear.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.h |   1 +
>  multipathd/main.c     |   1 +
>  multipathd/uxlsnr.c   | 134
> ++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 130 insertions(+), 6 deletions(-)

I know I reviewed this already, but this time I have some more remarks,
mostly style.

> diff --git a/libmultipath/config.h b/libmultipath/config.h
> index ffec3103..e69aa07c 100644
> --- a/libmultipath/config.h
> +++ b/libmultipath/config.h
> @@ -188,6 +188,7 @@ struct config {
>  	int find_multipaths_timeout;
>  	int marginal_pathgroups;
>  	unsigned int version[3];
> +	unsigned int sequence_nr;
>  
>  	char * multipath_dir;
>  	char * selector;
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 34a57689..7b364cfe 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2618,6 +2618,7 @@ reconfigure (struct vectors * vecs)
>  	uxsock_timeout = conf->uxsock_timeout;
>  
>  	old = rcu_dereference(multipath_conf);
> +	conf->sequence_nr = old->sequence_nr + 1;
>  	rcu_assign_pointer(multipath_conf, conf);
>  	call_rcu(&old->rcu, rcu_free_config);
>  
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> index bc71679e..92d9a79a 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -23,6 +23,7 @@
>  #include <sys/time.h>
>  #include <signal.h>
>  #include <stdbool.h>
> +#include <sys/inotify.h>
>  #include "checkers.h"
>  #include "memory.h"
>  #include "debug.h"
> @@ -51,6 +52,8 @@ struct client {
>  LIST_HEAD(clients);
>  pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
>  struct pollfd *polls;
> +int notify_fd = -1;
> +char *config_dir;

Nit: please declare these as static, as they are used only in this
file. Also, naming the variable differently, e.g. conf_dir, would
decrease the number of false hits when grepping for the variable name.

>  
>  static bool _socket_client_is_root(int fd);
>  
> @@ -151,6 +154,8 @@ void uxsock_cleanup(void *arg)
>  	long ux_sock = (long)arg;
>  
>  	close(ux_sock);
> +	close(notify_fd);
> +	free(config_dir);
>  
>  	pthread_mutex_lock(&client_lock);
>  	list_for_each_entry_safe(client_loop, client_tmp, &clients,
> node) {
> @@ -162,6 +167,106 @@ void uxsock_cleanup(void *arg)
>  	free_polls();
>  }
>  
> +/* failing to set the watch descriptor is o.k. we just miss a
> warning
> + * message */
> +void reset_watch(int notify_fd, int *wds, unsigned int *sequence_nr)

This function could also be static.

> +{
> +	struct config *conf;
> +	int dir_reset = 0;
> +	int conf_reset = 0;
> +
> +	if (notify_fd == -1)
> +		return;
> +
> +	conf = get_multipath_config();
> +	/* instead of repeatedly try to reset the inotify watch if
> +	 * the config directory or multipath.conf isn't there, just
> +	 * do it once per reconfigure */
> +	if (*sequence_nr != conf->sequence_nr) {
> +		*sequence_nr = conf->sequence_nr;
> +		if (wds[0] == -1)
> +			conf_reset = 1;
> +		if (!config_dir || !conf->config_dir ||
> +		    strcmp(config_dir, conf->config_dir)) {
> +			dir_reset = 1;
> +			if (config_dir)
> +				free(config_dir);
> +			if (conf->config_dir)
> +				config_dir = strdup(conf->config_dir);
> +			else
> +				config_dir = NULL;
> +		} else if (wds[1] == -1)
> +			dir_reset = 1;
> +	}
> +	put_multipath_config(conf);
> +
> +	if (dir_reset) {
> +		if (wds[1] != -1) {
> +			inotify_rm_watch(notify_fd, wds[1]);
> +			wds[1] = -1;
> +		}
> +		if (config_dir) {
> +			wds[1] = inotify_add_watch(notify_fd,
> config_dir,
> +						   IN_CLOSE_WRITE |
> IN_DELETE |
> +						   IN_ONLYDIR);
> +			if (wds[1] == -1)
> +				condlog(3, "didn't set up notifications
> on %s: %s", config_dir, strerror(errno));

Another nitpick: IMO we should be using "%m" for this in new code.

> +		}
> +	}
> +	if (conf_reset) {
> +		wds[0] = inotify_add_watch(notify_fd,
> DEFAULT_CONFIGFILE,
> +					   IN_CLOSE_WRITE);
> +		if (wds[0] == -1)
> +			condlog(3, "didn't set up notifications on
> /etc/multipath.conf: %s", strerror(errno));
> +	}
> +	return;
> +}
> +
> +void handle_inotify(int fd, int  *wds)

Again, static.

> +{
> +	char buff[1024]
> +		__attribute__ ((aligned(__alignof__(struct
> inotify_event))));
> +	const struct inotify_event *event;
> +	ssize_t len;
> +	char *ptr;
> +	int i, got_notify = 0;
> +
> +	for (;;) {
> +		len = read(fd, buff, sizeof(buff));
> +		if (len <= 0) {
> +			if (len < 0 && errno != EAGAIN) {
> +				condlog(3, "error reading from
> inotify_fd");
> +				for (i = 0; i < 2; i++) {
> +					if (wds[i] != -1) {
> +						inotify_rm_watch(fd,
> wds[i]);
> +						wds[i] = -1;
> +					}
> +				}
> +			}
> +			break;
> +		}
> +
> +		got_notify = 1;
> +		for (ptr = buff; ptr < buff + len;
> +		     ptr += sizeof(struct inotify_event) + event->len)
> {
> +			event = (const struct inotify_event *) ptr;
> +
> +			if (event->mask & IN_IGNORED) {
> +				/* multipathd.conf may have been
> overwritten.
> +				 * Try once to reset the notification
> */
> +				if (wds[0] == event->wd)
> +					wds[0] =
> inotify_add_watch(notify_fd,
> +							DEFAULT_CONFIGF
> ILE,
> +							IN_CLOSE_WRITE)
> ;
> +				else if (wds[1] == event->wd)
> +					wds[1] = -1;
> +			}
> +		}
> +	}
> +	if (got_notify)
> +		condlog(1, "Multipath configuration updated.\nReload
> multipathd for changes to take effect");
> +}
> +
>  /*
>   * entry point
>   */
> @@ -173,13 +278,19 @@ void * uxsock_listen(uxsock_trigger_fn
> uxsock_trigger, long ux_sock,
>  	char *reply;
>  	sigset_t mask;
>  	int old_clients = MIN_POLLS;
> +	/* conf->sequence_nr will be 1 when uxsock_listen is first
> called */
> +	unsigned int sequence_nr = 0;
> +	int wds[2] = { -1, -1 };

Style issue: The code might be better readable if we used a struct for
this,

struct watch_descriptors {
	int conf_wd;
	int dir_wd;
};	

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 03/15] Fix leak in mpathpersist
  2020-01-17  2:18 ` [PATCH 03/15] Fix leak in mpathpersist Benjamin Marzinski
@ 2020-01-17 15:53   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 15:53 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> If the persistent in command fails, the response buffer must be
> freed.
> Found by Coverity
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  mpathpersist/main.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mpathpersist/main.c b/mpathpersist/main.c
> index 278b8d51..920e686c 100644
> --- a/mpathpersist/main.c
> +++ b/mpathpersist/main.c
> @@ -499,6 +499,7 @@ static int handle_args(int argc, char * argv[],
> int nline)
>  		if (ret != MPATH_PR_SUCCESS )
>  		{
>  			fprintf (stderr, "Persistent Reserve IN command
> failed\n");
> +			free(resp);
>  			goto out_fd;
>  		}
>  

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 04/15] libmultipath: remove unused path->prio_args
  2020-01-17  2:18 ` [PATCH 04/15] libmultipath: remove unused path->prio_args Benjamin Marzinski
@ 2020-01-17 15:55   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 15:55 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/structs.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index a3adf906..1c32a799 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -272,7 +272,6 @@ struct path {
>  	char * uid_attribute;
>  	char * getuid;
>  	struct prio prio;
> -	char * prio_args;
>  	struct checker checker;
>  	struct multipath * mpp;
>  	int fd;

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 05/15] libmultipath: constify get_unaligned_be*
  2020-01-17  2:18 ` [PATCH 05/15] libmultipath: constify get_unaligned_be* Benjamin Marzinski
@ 2020-01-17 15:57   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 15:57 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/unaligned.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 06/15] libmultipath: add missing hwe mpe variable merges
  2020-01-17  2:18 ` [PATCH 06/15] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
@ 2020-01-17 16:04   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 16:04 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> There were some variables in the hwe and mpe structs that weren't
> being
> merged by merge_hwe() and merge_mpe().
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/libmultipath/config.c b/libmultipath/config.c
> index 20e3b8bf..85626e96 100644
> --- a/libmultipath/config.c
> +++ b/libmultipath/config.c
> @@ -372,6 +372,10 @@ merge_hwe (struct hwentry * dst, struct hwentry
> * src)
>  	merge_num(san_path_err_threshold);
>  	merge_num(san_path_err_forget_rate);
>  	merge_num(san_path_err_recovery_time);
> +	merge_num(marginal_path_err_sample_time);
> +	merge_num(marginal_path_err_rate_threshold);
> +	merge_num(marginal_path_err_recheck_gap_time);
> +	merge_num(marginal_path_double_failed_time);

Hm, looking at this, I think we should not attempt to merge sets of
parameters that belong together like this. The marginal_path params
should only be merged if they are *all* unset in the destination.
Otherwise we risk to end up with an inconsistent parameter set from two
different sources.

But that's not a failure of your patch, it should be done in a follow-
up fix. So:

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 07/15] libmultipath: fix sgio_get_vpd looping
  2020-01-17  2:18 ` [PATCH 07/15] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
@ 2020-01-17 16:27   ` Martin Wilck
  2020-01-20 16:59     ` Benjamin Marzinski
  0 siblings, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 16:27 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> If do_inq returns a page with a length that is less than maxlen, but
> larger than DEFAULT_SGIO_LEN, this function will loop forever. Also
> if do_inq returns with a length equal to or greater than maxlen,
> sgio_get_vpd will exit immediately, even if it hasn't read the entire
> page.  Fix these issues, modify the tests to verify the new behavior.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c | 12 +++---
>  tests/vpd.c              | 84 ++++++++++++++++++++++++------------
> ----
>  2 files changed, 57 insertions(+), 39 deletions(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index 72f455e8..3c72a80a 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -870,6 +870,7 @@ static int
>  sgio_get_vpd (unsigned char * buff, int maxlen, int fd, int pg)
>  {
>  	int len = DEFAULT_SGIO_LEN;
> +	int rlen;
>  
>  	if (fd < 0) {
>  		errno = EBADF;
> @@ -877,12 +878,11 @@ sgio_get_vpd (unsigned char * buff, int maxlen,
> int fd, int pg)
>  	}
>  retry:
>  	if (0 == do_inq(fd, 0, 1, pg, buff, len)) {
> -		len = get_unaligned_be16(&buff[2]) + 4;
> -		if (len >= maxlen)
> -			return len;
> -		if (len > DEFAULT_SGIO_LEN)
> -			goto retry;
> -		return len;
> +		rlen = get_unaligned_be16(&buff[2]) + 4;
> +		if (rlen <= len || len >= maxlen)
> +			return rlen;
> +		len = (rlen < maxlen)? rlen : maxlen;
> +		goto retry;
>  	}
>  	return -1;
>  }

This looks good.

> diff --git a/tests/vpd.c b/tests/vpd.c
> index d9f80eaa..4dbce010 100644
> --- a/tests/vpd.c
> +++ b/tests/vpd.c
> @@ -306,7 +306,7 @@ static int create_vpd83(unsigned char *buf,
> size_t bufsiz, const char *id,
>  	default:
>  		break;
>  	}
> -	put_unaligned_be16(n, buf + 2);
> +	put_unaligned_be16(bufsiz, buf + 2);
>  	return n + 4;
>  }

I can see that you are trying to create a VPD with a certain given
length. But this way you intentionally create a VPD that doesn't
conform to the spec (offset 2 should contain the real length of the
designator list, not some arbitrary value). This is dangerous, in the
future someone may copy this code thinking that it creates a valid
VPD. At least you should add a big fat comment. Better even, you
should leave out this hunk and override the length value in the
actual test (make_test_vpd_eui) if (sml == 1) (and also add a comment).

Regards
Martin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-17  2:18 ` [PATCH 09/15] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
@ 2020-01-17 16:56   ` Martin Wilck
  2020-01-22  8:51     ` Benjamin Marzinski
  2020-01-18 17:34   ` John Stoffel
  1 sibling, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 16:56 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> This adds the wildcard 'g' for multipath and path formatted printing,
> which returns extra data from a device's vendor specific vpd
> page.  The
> specific vendor vpd page to use, and the vendor/product id to decode
> it
> can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It
> can
> be configured in the devices section of multipath.conf with the
> vpd_vendor parameter. Currently, the only devices that use this are
> HPE
> 3PAR arrays, to return the Volume Name.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.c      |  4 ++++
>  libmultipath/config.h      |  2 ++
>  libmultipath/dict.c        | 34 ++++++++++++++++++++++++++++++++++
>  libmultipath/discovery.c   | 34 +++++++++++++++++++++++++++++++++-
>  libmultipath/hwtable.c     |  2 ++
>  libmultipath/print.c       | 27 +++++++++++++++++++++++++++
>  libmultipath/propsel.c     | 24 ++++++++++++++++++++++++
>  libmultipath/propsel.h     |  2 ++
>  libmultipath/structs.h     |  9 +++++++++
>  multipath/multipath.conf.5 |  8 ++++++++
>  10 files changed, 145 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/config.c b/libmultipath/config.c
> index 85626e96..72b8d37c 100644
> --- a/libmultipath/config.c
> +++ b/libmultipath/config.c
> @@ -369,6 +369,8 @@ merge_hwe (struct hwentry * dst, struct hwentry *
> src)
>  	merge_num(max_sectors_kb);
>  	merge_num(ghost_delay);
>  	merge_num(all_tg_pt);
> +	merge_num(vpd_vendor_pg);
> +	merge_num(vpd_vendor_id);
>  	merge_num(san_path_err_threshold);
>  	merge_num(san_path_err_forget_rate);
>  	merge_num(san_path_err_recovery_time);
> @@ -517,6 +519,8 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
>  	hwe->detect_prio = dhwe->detect_prio;
>  	hwe->detect_checker = dhwe->detect_checker;
>  	hwe->ghost_delay = dhwe->ghost_delay;
> +	hwe->vpd_vendor_pg = dhwe->vpd_vendor_pg;
> +	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
>  
>  	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe-
> >bl_product)))
>  		goto out;
> diff --git a/libmultipath/config.h b/libmultipath/config.h
> index e69aa07c..589146de 100644
> --- a/libmultipath/config.h
> +++ b/libmultipath/config.h
> @@ -87,6 +87,8 @@ struct hwentry {
>  	int max_sectors_kb;
>  	int ghost_delay;
>  	int all_tg_pt;
> +	int vpd_vendor_pg;
> +	int vpd_vendor_id;
>  	char * bl_product;
>  };
>  
> diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> index 2b046e1d..d6d8b79b 100644
> --- a/libmultipath/dict.c
> +++ b/libmultipath/dict.c
> @@ -1366,6 +1366,39 @@ def_uxsock_timeout_handler(struct config
> *conf, vector strvec)
>  	return 0;
>  }
>  
> +static int
> +hw_vpd_vendor_handler(struct config *conf, vector strvec)
> +{
> +	int rc = 0;
> +	char *buff;
> +
> +	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
> +	if (!hwe)
> +		return 1;
> +
> +	buff = set_value(strvec);
> +	if (!buff)
> +		return 1;
> +	if (strcmp(buff, "hp3par") == 0) {
> +		hwe->vpd_vendor_pg = 0xc0;
> +		hwe->vpd_vendor_id = VPD_VP_HP3PAR;
> +	} else
> +		rc = 1;
> +	FREE(buff);
> +	return rc;
> +}
> +
> +static int
> +snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
> +		      const void * data)
> +{
> +	const struct hwentry * hwe = (const struct hwentry *)data;
> +
> +	if (hwe->vpd_vendor_pg == 0xc0 && hwe->vpd_vendor_id ==
> VPD_VP_HP3PAR)
> +		return snprintf(buff, len, "hp3par");
> +	return 0;
> +}

I don't understand this design. The way you set up the hwe, it would be
logical to add "vpd_vendor_page" and "vpd_vendor_id" properties for
device entries.

BUT ok, that's overengineered with just one supported combination of
page and vendor. I can understand that. But then, it seems also
overengineered to carry around vpd_vendor_pg and vpd_vendor_id in the
hwe. 

I'd suggest creating a hard-coded table with "vendor vpd schemes",

struct {
	int pg;
	int vendor_id;
	const char *name;
} vendor_vpd_schemes[]  = { 
    { 0xc0, VPD_VP_HP3PAR, "hp3par", }, 
    { 0, }, 
};

and in the hwentry, only carry around the index into this table,
and look up the page and vendor ID there. Actually, with just that
single user, we might as well not introduce a new device property at
all, but simply use a separate hardcoded table with lookup values for
vendor and product ID. I'm unsure if it's wise to make this
configurable via multipath.conf.

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE
Software Solutions Germany GmbH 
HRB 36809 (AG Nürnberg) GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 08/15] libmultipath: add vend_id to get_vpd_sgio
  2020-01-17  2:18 ` [PATCH 08/15] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
@ 2020-01-17 16:58   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 16:58 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> This tells multipath how it should decode vendor specific pages. It
> will
> be used by a future patch.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c |  4 ++--
>  libmultipath/discovery.h |  2 +-
>  libmultipath/propsel.c   |  2 +-
>  tests/vpd.c              | 10 +++++-----
>  4 files changed, 9 insertions(+), 9 deletions(-)

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-17  2:18 ` [PATCH 10/15] libmultipath: change how the checker async is set Benjamin Marzinski
@ 2020-01-17 17:02   ` Martin Wilck
  2020-01-17 17:04   ` Martin Wilck
  2020-01-18 17:43   ` John Stoffel
  2 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 17:02 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> The way that the checkers async mode worked in multipathd didn't make
> much sense. When multipathd starts up, all checker classes are in the
> sync mode, and any pathinfo() calls on devices would run the checker
> in
> sync mode. However, the First time a checker class was used in
> checkerloop, it would set that checker class to async (assuming
> force_sync wasn't set).  After that, no matter when a checker from
> that
> class was called, it would always run in async mode.  Multipathd
> doesn't
> need to run checkers in sync mode at all, so don't.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmpathpersist/mpath_persist.c |  2 +-
>  libmultipath/discovery.c        | 10 ++++------
>  multipath/main.c                |  1 +
>  3 files changed, 6 insertions(+), 7 deletions(-)
> 

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-17  2:18 ` [PATCH 10/15] libmultipath: change how the checker async is set Benjamin Marzinski
  2020-01-17 17:02   ` Martin Wilck
@ 2020-01-17 17:04   ` Martin Wilck
  2020-01-18 17:43   ` John Stoffel
  2 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 17:04 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> The way that the checkers async mode worked in multipathd didn't make
> much sense. When multipathd starts up, all checker classes are in the
> sync mode, and any pathinfo() calls on devices would run the checker
> in
> sync mode. However, the First time a checker class was used in
> checkerloop, it would set that checker class to async (assuming
> force_sync wasn't set).  After that, no matter when a checker from
> that
> class was called, it would always run in async mode.  Multipathd
> doesn't
> need to run checkers in sync mode at all, so don't.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmpathpersist/mpath_persist.c |  2 +-
>  libmultipath/discovery.c        | 10 ++++------
>  multipath/main.c                |  1 +
>  3 files changed, 6 insertions(+), 7 deletions(-)
> 

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 11/15] libmultipath: change failed path prio timeout
  2020-01-17  2:18 ` [PATCH 11/15] libmultipath: change failed path prio timeout Benjamin Marzinski
@ 2020-01-17 17:18   ` Martin Wilck
  2020-01-20 16:22     ` Benjamin Marzinski
  0 siblings, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 17:18 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> multipath will try to get the priority from a PATH_DOWN path, if the
> path doesn't currently have a valid priority. However, if the
> priority
> code needs to contact the device to get the priority, this is likely
> to
> fail for PATH_DOWN paths.  This code dates back to when multipathd
> could
> not easily reload device tables with failed paths, so getting the
> correct priority was important to have a correctly configured device.
> Now multipathd can simply reload the device to move the path to the
> correct pathgroup when the path comes back up.  Since there are a
> number
> of prioritizers that don't require talking to the device, multipath
> shouldn't completely skip attempting to get the priority of these
> paths,
> but it should set a small timeout, so that it isn't hanging in the
> case where it needs to contact a device through a failed path.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c | 14 ++++++--------
>  libmultipath/prio.c      |  2 +-
>  2 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index 1ab093f4..2c331db8 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -1661,11 +1661,10 @@ get_state (struct path * pp, struct config
> *conf, int daemon, int oldstate)
>  }
>  
>  static int
> -get_prio (struct path * pp)
> +get_prio (struct path * pp, int timeout)
>  {
>  	struct prio * p;
>  	struct config *conf;
> -	int checker_timeout;
>  	int old_prio;
>  
>  	if (!pp)
> @@ -1684,11 +1683,8 @@ get_prio (struct path * pp)
>  			return 1;
>  		}
>  	}
> -	conf = get_multipath_config();
> -	checker_timeout = conf->checker_timeout;
> -	put_multipath_config(conf);
>  	old_prio = pp->priority;
> -	pp->priority = prio_getprio(p, pp, checker_timeout);
> +	pp->priority = prio_getprio(p, pp, timeout);
>  	if (pp->priority < 0) {
>  		/* this changes pp->offline, but why not */
>  		int state = path_offline(pp);
> @@ -2095,11 +2091,13 @@ int pathinfo(struct path *pp, struct config
> *conf, int mask)
>  
>  	 /*
>  	  * Retrieve path priority, even for PATH_DOWN paths if it has
> never
> -	  * been successfully obtained before.
> +	  * been successfully obtained before. If path is down don't
> try
> +	  * for too long.
>  	  */
>  	if ((mask & DI_PRIO) && path_state == PATH_UP && strlen(pp-
> >wwid)) {
>  		if (pp->state != PATH_DOWN || pp->priority ==
> PRIO_UNDEF) {
> -			get_prio(pp);
> +			get_prio(pp, (pp->state != PATH_DOWN)?
> +				     (conf->checker_timeout * 1000) :
> 10);
>  		}
>  	}
>  
> diff --git a/libmultipath/prio.c b/libmultipath/prio.c
> index 87de1f97..21c1b092 100644
> --- a/libmultipath/prio.c
> +++ b/libmultipath/prio.c
> @@ -14,7 +14,7 @@ unsigned int get_prio_timeout(unsigned int
> checker_timeout,
>  			      unsigned int default_timeout)
>  {
>  	if (checker_timeout)
> -		return checker_timeout * 1000;
> +		return checker_timeout;
>  	return default_timeout;
>  }
> 

This changes the unit of the first get_prio_timeout() argument from
seconds to milliseconds. While that's a good thing (it was questionable
design to have the same function take several "timeout" arguments in
different units), we should rename the argument there to avoid
confusion (checker_timeout's unit is seconds all around).

Apart from that, ACK.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 12/15] multipathd: add new paths under vecs lock
  2020-01-17  2:18 ` [PATCH 12/15] multipathd: add new paths under vecs lock Benjamin Marzinski
@ 2020-01-17 17:39   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 17:39 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> Right now, multipathd only ever calls pathinfo on a path outside of
> the
> vecs lock in one circumstance, when it's adding a new path from a
> uevent. Doing this can cause weirdness or crash multipathd.
> 
> First off, the path checker code is written assuming that only one
> checker instance will be active at a time.  If this isn't the case,
> two
> processes can modify the checker's refcounts at the same time,
> potentially causing a checker to get removed while still in use (if
> two
> threads try to increment the refcount at the same time, causing it to
> only increment once).

The obvious solution to that problem would be making the refcount an
atomic variable.

> Another issue is that since the vecs lock isn't grabbed until after
> all
> of the pathinfo is gathered, and the vecs lock is the only thing
> keeping
> the uevent servicing thread from running at the same time as a
> reconfigure is happening, it is possible to have multipathd add a
> path
> using the old config after reconfigure has already added that path
> using
> the new config. Leading to two versions of the path on the pathvec.

This is a valid argument, but strengthens the gut feeling that I've had
for some time: that it was not a good idea to handle the multipath
configuration with RCU. Using RCU means that while a new configuration
is already in place, other threads may (continue to) use the old
configuration, and that's dangerous. I believe that the configuration
should be protected by a lock. When reconfigure() is called, all
asynchrous processing should essentially stop, and be allowed to
continue only after reconfigure() has finished.

> There are possibly other issues with running pathinfo outside of the
> vecs lock that I haven't noticed, as well.

Sooner or later we should reach a point where it's not necessary
any more to take the global lock just to call pathinfo(). But for the
current situation, I guess you're right.

> The easiest solution is to call pathinfo on the path while holding
> the
> vecs lock, to keep other threads from modifying the checker structure
> or
> the config while setting up the path. This is what happens every
> other
> time multipathd calls pathinfo, including when a path is added
> through
> the multipathd command interface.
> 
> This does mean that any time spent calling pathinfo on new paths will
> block other threads that need the vecs lock, but since the checker is
> now always called in async mode, and the prio function will use a
> short timeout if called on a failed path, this shouldn't be any more
> noticeable than the calls to check_path() for already existing paths.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  multipathd/main.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)

Reviewed-by: Martin Wilck <mwilck@suse.com>



> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 7b364cfe..9e01cfaa 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -906,9 +906,8 @@ uev_add_path (struct uevent *uev, struct vectors
> * vecs, int need_do_map)
>  			}
>  		}
>  	}
> -	lock_cleanup_pop(vecs->lock);
>  	if (pp)
> -		return ret;
> +		goto out;
>  
>  	/*
>  	 * get path vital state
> @@ -920,13 +919,13 @@ uev_add_path (struct uevent *uev, struct
> vectors * vecs, int need_do_map)
>  	pthread_cleanup_pop(1);
>  	if (!pp) {
>  		if (ret == PATHINFO_SKIPPED)
> -			return 0;
> -		condlog(3, "%s: failed to get path info", uev->kernel);
> -		return 1;
> +			ret = 0;
> +		else {
> +			condlog(3, "%s: failed to get path info", uev-
> >kernel);
> +			ret = 1;
> +		}
> +		goto out;
>  	}
> -	pthread_cleanup_push(cleanup_lock, &vecs->lock);
> -	lock(&vecs->lock);
> -	pthread_testcancel();
>  	ret = store_path(vecs->pathvec, pp);
>  	if (!ret) {
>  		conf = get_multipath_config();
> @@ -940,6 +939,7 @@ uev_add_path (struct uevent *uev, struct vectors
> * vecs, int need_do_map)
>  		free_path(pp);
>  		ret = 1;
>  	}
> +out:
>  	lock_cleanup_pop(vecs->lock);
>  	return ret;
>  }

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 13/15] libmultipath: add new checker class functions
  2020-01-17  2:18 ` [PATCH 13/15] libmultipath: add new checker class functions Benjamin Marzinski
@ 2020-01-17 17:41   ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-17 17:41 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> This commit adds the optional functions libcheck_load,
> libcheck_unload,
> and libcheck_reset. libcheck_load is called when a checker is first
> loaded, libcheck_unload is called when it is unloaded, and
> libcheck_reset is called during reconfigure, after all the current
> paths have been removed. They are designed to setup, reset, and
> destroy
> global state that all checkers of this class use. They will be used
> in future commits.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/checkers.c | 29 ++++++++++++++++++++++++++++-
>  libmultipath/checkers.h |  1 +
>  multipathd/main.c       |  1 +
>  3 files changed, 30 insertions(+), 1 deletion(-)
> 

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-17  2:18 ` [PATCH 09/15] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
  2020-01-17 16:56   ` Martin Wilck
@ 2020-01-18 17:34   ` John Stoffel
  2020-01-22  9:46     ` Benjamin Marzinski
  1 sibling, 1 reply; 49+ messages in thread
From: John Stoffel @ 2020-01-18 17:34 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: device-mapper development, Martin Wilck

>>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:

Benjamin> This adds the wildcard 'g' for multipath and path formatted
Benjamin> printing, which returns extra data from a device's vendor
Benjamin> specific vpd page.  The specific vendor vpd page to use, and
Benjamin> the vendor/product id to decode it can be set in the hwentry
Benjamin> with vpd_vendor_pg and vpd_vendor_id. It can be configured
Benjamin> in the devices section of multipath.conf with the vpd_vendor
Benjamin> parameter. Currently, the only devices that use this are HPE
Benjamin> 3PAR arrays, to return the Volume Name.

So I've looked over this patch and I see how you've hard-coded the
3par entry into the various .c files, but the multipathd.conf example
really doesn't give a good example of how to use it, nor does the code
give a good example, or maybe better yet a template set of functions
for parsing custom VPD info pages.  This is really just a nit.  

Also, is the Vendor VPD page always at 0xco, or is that just a 3par
setting?  I wonder if that constant should be pulled into a #define
somewhere, so that it's simpler to update/change as needed?
Especially if other vendors use different pages.

John



Benjamin> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin> ---
Benjamin>  libmultipath/config.c      |  4 ++++
Benjamin>  libmultipath/config.h      |  2 ++
Benjamin>  libmultipath/dict.c        | 34 ++++++++++++++++++++++++++++++++++
Benjamin>  libmultipath/discovery.c   | 34 +++++++++++++++++++++++++++++++++-
Benjamin>  libmultipath/hwtable.c     |  2 ++
Benjamin>  libmultipath/print.c       | 27 +++++++++++++++++++++++++++
Benjamin>  libmultipath/propsel.c     | 24 ++++++++++++++++++++++++
Benjamin>  libmultipath/propsel.h     |  2 ++
Benjamin>  libmultipath/structs.h     |  9 +++++++++
Benjamin>  multipath/multipath.conf.5 |  8 ++++++++
Benjamin>  10 files changed, 145 insertions(+), 1 deletion(-)

Benjamin> diff --git a/libmultipath/config.c b/libmultipath/config.c
Benjamin> index 85626e96..72b8d37c 100644
Benjamin> --- a/libmultipath/config.c
Benjamin> +++ b/libmultipath/config.c
Benjamin> @@ -369,6 +369,8 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
Benjamin>  	merge_num(max_sectors_kb);
Benjamin>  	merge_num(ghost_delay);
Benjamin>  	merge_num(all_tg_pt);
Benjamin> +	merge_num(vpd_vendor_pg);
Benjamin> +	merge_num(vpd_vendor_id);
Benjamin>  	merge_num(san_path_err_threshold);
Benjamin>  	merge_num(san_path_err_forget_rate);
Benjamin>  	merge_num(san_path_err_recovery_time);
Benjamin> @@ -517,6 +519,8 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
hwe-> detect_prio = dhwe->detect_prio;
hwe-> detect_checker = dhwe->detect_checker;
hwe-> ghost_delay = dhwe->ghost_delay;
Benjamin> +	hwe->vpd_vendor_pg = dhwe->vpd_vendor_pg;
Benjamin> +	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
 
Benjamin>  	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe->bl_product)))
Benjamin>  		goto out;
Benjamin> diff --git a/libmultipath/config.h b/libmultipath/config.h
Benjamin> index e69aa07c..589146de 100644
Benjamin> --- a/libmultipath/config.h
Benjamin> +++ b/libmultipath/config.h
Benjamin> @@ -87,6 +87,8 @@ struct hwentry {
Benjamin>  	int max_sectors_kb;
Benjamin>  	int ghost_delay;
Benjamin>  	int all_tg_pt;
Benjamin> +	int vpd_vendor_pg;
Benjamin> +	int vpd_vendor_id;
Benjamin>  	char * bl_product;
Benjamin>  };
 
Benjamin> diff --git a/libmultipath/dict.c b/libmultipath/dict.c
Benjamin> index 2b046e1d..d6d8b79b 100644
Benjamin> --- a/libmultipath/dict.c
Benjamin> +++ b/libmultipath/dict.c
Benjamin> @@ -1366,6 +1366,39 @@ def_uxsock_timeout_handler(struct config *conf, vector strvec)
Benjamin>  	return 0;
Benjamin>  }
 
Benjamin> +static int
Benjamin> +hw_vpd_vendor_handler(struct config *conf, vector strvec)
Benjamin> +{
Benjamin> +	int rc = 0;
Benjamin> +	char *buff;
Benjamin> +
Benjamin> +	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
Benjamin> +	if (!hwe)
Benjamin> +		return 1;
Benjamin> +
Benjamin> +	buff = set_value(strvec);
Benjamin> +	if (!buff)
Benjamin> +		return 1;
Benjamin> +	if (strcmp(buff, "hp3par") == 0) {
Benjamin> +		hwe->vpd_vendor_pg = 0xc0;
Benjamin> +		hwe->vpd_vendor_id = VPD_VP_HP3PAR;
Benjamin> +	} else
Benjamin> +		rc = 1;
Benjamin> +	FREE(buff);
Benjamin> +	return rc;
Benjamin> +}
Benjamin> +
Benjamin> +static int
Benjamin> +snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
Benjamin> +		      const void * data)
Benjamin> +{
Benjamin> +	const struct hwentry * hwe = (const struct hwentry *)data;
Benjamin> +
Benjamin> +	if (hwe->vpd_vendor_pg == 0xc0 && hwe->vpd_vendor_id == VPD_VP_HP3PAR)
Benjamin> +		return snprintf(buff, len, "hp3par");
Benjamin> +	return 0;
Benjamin> +}
Benjamin> +
Benjamin>  /*
Benjamin>   * blacklist block handlers
Benjamin>   */
Benjamin> @@ -1806,6 +1839,7 @@ init_keywords(vector keywords)
Benjamin>  	install_keyword("max_sectors_kb", &hw_max_sectors_kb_handler, &snprint_hw_max_sectors_kb);
Benjamin>  	install_keyword("ghost_delay", &hw_ghost_delay_handler, &snprint_hw_ghost_delay);
Benjamin>  	install_keyword("all_tg_pt", &hw_all_tg_pt_handler, &snprint_hw_all_tg_pt);
Benjamin> +	install_keyword("vpd_vendor", &hw_vpd_vendor_handler, &snprint_hw_vpd_vendor);
Benjamin>  	install_sublevel_end();
 
Benjamin>  	install_keyword_root("overrides", &overrides_handler);
Benjamin> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
Benjamin> index 1d79cbae..d2773c3a 100644
Benjamin> --- a/libmultipath/discovery.c
Benjamin> +++ b/libmultipath/discovery.c
Benjamin> @@ -1103,6 +1103,30 @@ parse_vpd_pg83(const unsigned char *in, size_t in_len,
Benjamin>  	return len;
Benjamin>  }
 
Benjamin> +static int
Benjamin> +parse_vpd_c0_hp3par(const unsigned char *in, size_t in_len,
Benjamin> +		    char *out, size_t out_len)
Benjamin> +{
Benjamin> +	size_t len;
Benjamin> +
Benjamin> +	memset(out, 0x0, out_len);
Benjamin> +	if (in_len <= 4 || (in[4] > 3 && in_len < 44)) {
Benjamin> +		condlog(3, "HP/3PAR vendor specific VPD page length too short: %lu", in_len);
Benjamin> +		return -EINVAL;
Benjamin> +	}
Benjamin> +	if (in[4] <= 3) /* revision must be > 3 to have Vomlume Name */
Benjamin> +		return -ENODATA;
Benjamin> +	len = get_unaligned_be32(&in[40]);
Benjamin> +	if (len > out_len || len + 44 > in_len) {
Benjamin> +		condlog(3, "HP/3PAR vendor specific Volume name too long: %lu",
Benjamin> +			len);
Benjamin> +		return -EINVAL;
Benjamin> +	}
Benjamin> +	memcpy(out, &in[44], len);
Benjamin> +	out[out_len - 1] = '\0';
Benjamin> +	return len;
Benjamin> +}
Benjamin> +
Benjamin>  static int
Benjamin>  get_vpd_sysfs (struct udev_device *parent, int pg, char * str, int maxlen)
Benjamin>  {
Benjamin> @@ -1170,7 +1194,9 @@ get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen)
Benjamin>  			len = (buff_len <= maxlen)? buff_len : maxlen;
Benjamin>  			memcpy (str, buff, len);
Benjamin>  		}
Benjamin> -	} else
Benjamin> +	} else if (pg == 0xc0 && vend_id == VPD_VP_HP3PAR)
Benjamin> +		len = parse_vpd_c0_hp3par(buff, buff_len, str, maxlen);
Benjamin> +	else
Benjamin>  		len = -ENOSYS;
 
Benjamin>  	return len;
Benjamin> @@ -1544,6 +1570,12 @@ scsi_ioctl_pathinfo (struct path * pp, struct config *conf, int mask)
Benjamin>  	if (!(mask & DI_SERIAL))
Benjamin>  		return;
 
Benjamin> +	select_vpd_vendor_pg(conf, pp);
Benjamin> +	select_vpd_vendor_id(conf, pp);
Benjamin> +
Benjamin> +	if (pp->vpd_vendor_pg != 0 && get_vpd_sgio(pp->fd, pp->vpd_vendor_pg, pp->vpd_vendor_id, pp->vpd_data, sizeof(pp->vpd_data)) < 0)
Benjamin> +		condlog(3, "%s: failed to get extra vpd data", pp->dev);
Benjamin> +
Benjamin>  	parent = pp->udev;
Benjamin>  	while (parent) {
Benjamin>  		const char *subsys = udev_device_get_subsystem(parent);
Benjamin> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
Benjamin> index 16627ec5..1f27450c 100644
Benjamin> --- a/libmultipath/hwtable.c
Benjamin> +++ b/libmultipath/hwtable.c
Benjamin> @@ -117,6 +117,8 @@ static struct hwentry default_hw[] = {
Benjamin>  		.no_path_retry = 18,
Benjamin>  		.fast_io_fail  = 10,
Benjamin>  		.dev_loss      = MAX_DEV_LOSS_TMO,
Benjamin> +		.vpd_vendor_pg = 0xc0,
Benjamin> +		.vpd_vendor_id = VPD_VP_HP3PAR,
Benjamin>  	},
Benjamin>  	{
Benjamin>  		/* RA8000 / ESA12000 */
Benjamin> diff --git a/libmultipath/print.c b/libmultipath/print.c
Benjamin> index 907469ad..0aafe3cb 100644
Benjamin> --- a/libmultipath/print.c
Benjamin> +++ b/libmultipath/print.c
Benjamin> @@ -358,6 +358,23 @@ snprint_action (char * buff, size_t len, const struct multipath * mpp)
Benjamin>  	}
Benjamin>  }
 
Benjamin> +static int
Benjamin> +snprint_multipath_vpd_data(char * buff, size_t len,
Benjamin> +			   const struct multipath * mpp)
Benjamin> +{
Benjamin> +	struct pathgroup * pgp;
Benjamin> +	struct path * pp;
Benjamin> +	int i, j;
Benjamin> +
Benjamin> +	vector_foreach_slot(mpp->pg, pgp, i) {
Benjamin> +		vector_foreach_slot(pgp->paths, pp, j) {
Benjamin> +			if (strlen(pp->vpd_data))
Benjamin> +				return snprintf(buff, len, "%s", pp->vpd_data);
Benjamin> +		}
Benjamin> +	}
Benjamin> +	return 0;
Benjamin> +}
Benjamin> +
Benjamin>  /*
Benjamin>   * path info printing functions
Benjamin>   */
Benjamin> @@ -688,6 +705,14 @@ snprint_path_marginal(char * buff, size_t len, const struct path * pp)
Benjamin>  	return snprintf(buff, len, "normal");
Benjamin>  }
 
Benjamin> +static int
Benjamin> +snprint_path_vpd_data(char * buff, size_t len, const struct path * pp)
Benjamin> +{
Benjamin> +	if (strlen(pp->vpd_data) > 0)
Benjamin> +		return snprintf(buff, len, "%s", pp->vpd_data);
Benjamin> +	return 0;
Benjamin> +}
Benjamin> +
Benjamin>  struct multipath_data mpd[] = {
Benjamin>  	{'n', "name",          0, snprint_name},
Benjamin>  	{'w', "uuid",          0, snprint_multipath_uuid},
Benjamin> @@ -712,6 +737,7 @@ struct multipath_data mpd[] = {
Benjamin>  	{'p', "prod",          0, snprint_multipath_prod},
Benjamin>  	{'e', "rev",           0, snprint_multipath_rev},
Benjamin>  	{'G', "foreign",       0, snprint_multipath_foreign},
Benjamin> +	{'g', "vpd page data", 0, snprint_multipath_vpd_data},
Benjamin>  	{0, NULL, 0 , NULL}
Benjamin>  };
 
Benjamin> @@ -737,6 +763,7 @@ struct path_data pd[] = {
Benjamin>  	{'r', "target WWPN",   0, snprint_tgt_wwpn},
Benjamin>  	{'a', "host adapter",  0, snprint_host_adapter},
Benjamin>  	{'G', "foreign",       0, snprint_path_foreign},
Benjamin> +	{'g', "vpd page data", 0, snprint_path_vpd_data},
Benjamin>  	{'0', "failures",      0, snprint_path_failures},
Benjamin>  	{'P', "protocol",      0, snprint_path_protocol},
Benjamin>  	{0, NULL, 0 , NULL}
Benjamin> diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
Benjamin> index b5b5b89f..3c99f2d4 100644
Benjamin> --- a/libmultipath/propsel.c
Benjamin> +++ b/libmultipath/propsel.c
Benjamin> @@ -1203,3 +1203,27 @@ out:
Benjamin>  		origin);
Benjamin>  	return 0;
Benjamin>  }
Benjamin> +
Benjamin> +int select_vpd_vendor_pg (struct config *conf, struct path *pp)
Benjamin> +{
Benjamin> +	const char *origin;
Benjamin> +
Benjamin> +	pp_set_hwe(vpd_vendor_pg);
Benjamin> +	pp_set_default(vpd_vendor_pg, 0);
Benjamin> +out:
Benjamin> +	condlog(3, "%s: vpd_vendor_pg = 0x%x %s", pp->dev, pp->vpd_vendor_pg,
Benjamin> +		origin);
Benjamin> +	return 0;
Benjamin> +}
Benjamin> +
Benjamin> +int select_vpd_vendor_id (struct config *conf, struct path *pp)
Benjamin> +{
Benjamin> +	const char *origin;
Benjamin> +
Benjamin> +	pp_set_hwe(vpd_vendor_id);
Benjamin> +	pp_set_default(vpd_vendor_id, 0);
Benjamin> +out:
Benjamin> +	condlog(3, "%s: vpd_vendor_id = 0x%x %s", pp->dev, pp->vpd_vendor_id,
Benjamin> +		origin);
Benjamin> +	return 0;
Benjamin> +}
Benjamin> diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
Benjamin> index ddfd6262..3f6d319a 100644
Benjamin> --- a/libmultipath/propsel.h
Benjamin> +++ b/libmultipath/propsel.h
Benjamin> @@ -37,3 +37,5 @@ void reconcile_features_with_options(const char *id, char **features,
Benjamin>  				     int* no_path_retry,
Benjamin>  				     int *retain_hwhandler);
Benjamin>  int select_all_tg_pt (struct config *conf, struct multipath * mp);
Benjamin> +int select_vpd_vendor_pg (struct config *conf, struct path *pp);
Benjamin> +int select_vpd_vendor_id (struct config *conf, struct path *pp);
Benjamin> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
Benjamin> index 1c32a799..1ad5f64a 100644
Benjamin> --- a/libmultipath/structs.h
Benjamin> +++ b/libmultipath/structs.h
Benjamin> @@ -21,6 +21,7 @@
Benjamin>  #define HOST_NAME_LEN		16
Benjamin>  #define SLOT_NAME_SIZE		40
Benjamin>  #define PRKEY_SIZE		19
Benjamin> +#define VPD_DATA_SIZE		128
 
Benjamin>  #define SCSI_VENDOR_SIZE	9
Benjamin>  #define SCSI_PRODUCT_SIZE	17
Benjamin> @@ -243,6 +244,11 @@ struct hd_geometry {
Benjamin>  };
Benjamin>  #endif
 
Benjamin> +/*
Benjamin> + * from sg_vpd_vendor.c
Benjamin> + */
Benjamin> +#define VPD_VP_HP3PAR 4
Benjamin> +
Benjamin>  struct path {
Benjamin>  	char dev[FILE_NAME_SIZE];
Benjamin>  	char dev_t[BLK_DEV_SIZE];
Benjamin> @@ -255,6 +261,7 @@ struct path {
Benjamin>  	char rev[PATH_REV_SIZE];
Benjamin>  	char serial[SERIAL_SIZE];
Benjamin>  	char tgt_node_name[NODE_NAME_SIZE];
Benjamin> +	char vpd_data[VPD_DATA_SIZE];
Benjamin>  	unsigned long long size;
Benjamin>  	unsigned int checkint;
Benjamin>  	unsigned int tick;
Benjamin> @@ -287,6 +294,8 @@ struct path {
Benjamin>  	int io_err_pathfail_starttime;
Benjamin>  	int find_multipaths_timeout;
Benjamin>  	int marginal;
Benjamin> +	int vpd_vendor_pg;
Benjamin> +	int vpd_vendor_id;
Benjamin>  	/* configlet pointers */
Benjamin>  	vector hwe;
Benjamin>  	struct gen_path generic_path;
Benjamin> diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
Benjamin> index e866da23..dc103fd8 100644
Benjamin> --- a/multipath/multipath.conf.5
Benjamin> +++ b/multipath/multipath.conf.5
Benjamin> @@ -1472,6 +1472,14 @@ the \fIproduct\fR attribute set to the value of \fIproduct_blacklist\fR.
Benjamin>  The user_friendly_names prefix to use for this
Benjamin>  device type, instead of the default "mpath".
Benjamin>  .TP
Benjamin> +.B vpd_vendor
Benjamin> +The vendor specific vpd page information, using the vpd page abbreviation.
Benjamin> +The vpd page abbreviation can be found by running \fIsg_vpd -e\fR. multipathd
Benjamin> +will use this information to gather device specific information that can be
Benjamin> +displayed with the \fI%g\fR wilcard for the \fImultipathd show maps format\fR
Benjamin> +and \fImultipathd show paths format\fR commands. Currently only the
Benjamin> +\fBhp3par\fR vpd page is supported.
Benjamin> +.TP
Benjamin>  .B hardware_handler
Benjamin>  The hardware handler to use for this device type.
Benjamin>  The following hardware handler are implemented:
Benjamin> -- 
Benjamin> 2.17.2

Benjamin> --
Benjamin> dm-devel mailing list
Benjamin> dm-devel@redhat.com
Benjamin> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-17  2:18 ` [PATCH 10/15] libmultipath: change how the checker async is set Benjamin Marzinski
  2020-01-17 17:02   ` Martin Wilck
  2020-01-17 17:04   ` Martin Wilck
@ 2020-01-18 17:43   ` John Stoffel
  2020-01-20 15:07     ` Benjamin Marzinski
  2 siblings, 1 reply; 49+ messages in thread
From: John Stoffel @ 2020-01-18 17:43 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: device-mapper development, Martin Wilck

>>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:

Benjamin> The way that the checkers async mode worked in multipathd didn't make
Benjamin> much sense. When multipathd starts up, all checker classes are in the
Benjamin> sync mode, and any pathinfo() calls on devices would run the checker in
Benjamin> sync mode. However, the First time a checker class was used in
Benjamin> checkerloop, it would set that checker class to async (assuming
Benjamin> force_sync wasn't set).  After that, no matter when a checker from that
Benjamin> class was called, it would always run in async mode.  Multipathd doesn't
Benjamin> need to run checkers in sync mode at all, so don't.

Sorry, I had a hard time parsing this description, especially the last
sentence.  Do you mean that checkers should default to async then,
instead of sync mode?  And from looking at the code, don't you mean
that you force sync mode?  I guess the question is whether checker
classes should default sync, or async.  And if they move to async,
should they stay there?
    


Benjamin> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Benjamin> ---
Benjamin>  libmpathpersist/mpath_persist.c |  2 +-
Benjamin>  libmultipath/discovery.c        | 10 ++++------
Benjamin>  multipath/main.c                |  1 +
Benjamin>  3 files changed, 6 insertions(+), 7 deletions(-)

Benjamin> diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
Benjamin> index 603cfc3b..b2238f00 100644
Benjamin> --- a/libmpathpersist/mpath_persist.c
Benjamin> +++ b/libmpathpersist/mpath_persist.c
Benjamin> @@ -47,7 +47,7 @@ mpath_lib_init (void)
Benjamin>  		condlog(0, "Failed to initialize multipath config.");
Benjamin>  		return NULL;
Benjamin>  	}
Benjamin> -
Benjamin> +	conf->force_sync = 1;
Benjamin>  	set_max_fds(conf->max_fds);
 
Benjamin>  	return conf;
Benjamin> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
Benjamin> index d2773c3a..1ab093f4 100644
Benjamin> --- a/libmultipath/discovery.c
Benjamin> +++ b/libmultipath/discovery.c
Benjamin> @@ -1643,12 +1643,10 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
Benjamin>  	if (pp->mpp && !c->mpcontext)
Benjamin>  		checker_mp_init(c, &pp->mpp->mpcontext);
Benjamin>  	checker_clear_message(c);
Benjamin> -	if (daemon) {
Benjamin> -		if (conf->force_sync == 0)
Benjamin> -			checker_set_async(c);
Benjamin> -		else
Benjamin> -			checker_set_sync(c);
Benjamin> -	}
Benjamin> +	if (conf->force_sync == 0)
Benjamin> +		checker_set_async(c);
Benjamin> +	else
Benjamin> +		checker_set_sync(c);
Benjamin>  	if (!conf->checker_timeout &&
Benjamin>  	    sysfs_get_timeout(pp, &(c->timeout)) <= 0)
c-> timeout = DEF_TIMEOUT;
Benjamin> diff --git a/multipath/main.c b/multipath/main.c
Benjamin> index 4f4d8e89..aebabd9b 100644
Benjamin> --- a/multipath/main.c
Benjamin> +++ b/multipath/main.c
Benjamin> @@ -905,6 +905,7 @@ main (int argc, char *argv[])
Benjamin>  		exit(RTVL_FAIL);
Benjamin>  	multipath_conf = conf;
conf-> retrigger_tries = 0;
Benjamin> +	conf->force_sync = 1;
Benjamin>  	while ((arg = getopt(argc, argv, ":adcChl::FfM:v:p:b:BrR:itTquUwW")) != EOF ) {
Benjamin>  		switch(arg) {
Benjamin>  		case 1: printf("optarg : %s\n",optarg);
Benjamin> -- 
Benjamin> 2.17.2

Benjamin> --
Benjamin> dm-devel mailing list
Benjamin> dm-devel@redhat.com
Benjamin> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 14/15] libmultipath: make directio checker share io contexts
  2020-01-17  2:18 ` [PATCH 14/15] libmultipath: make directio checker share io contexts Benjamin Marzinski
@ 2020-01-20  8:09   ` Martin Wilck
  2020-01-20 16:08     ` Benjamin Marzinski
  0 siblings, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-20  8:09 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

Hi Ben,

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> On systems with a large number of cores (>500), io_destroy() can take
> tens to hundreds of milliseconds to complete, due to RCU
> synchronization. If there are a large number of paths using the
> directio
> checker on such a system, this can lead to multipath taking almost a
> minute to complete, with the vast majority of time taken up by
> io_destroy().
> 
> To solve this, the directio checker now allocates one aio context for
> every 1024 checkers. This reduces the io_destroy() delay to a
> thousandth
> of its previous level.

>  However, this means that muliple checkers are
> sharing the same aio context, and must be able to handle getting
> results
> for other checkers.  Because only one checker is ever running at a
> time, this doesn't require any locking.  However, locking could be
> added
> in the future if necessary, to allow multiple checkers to run at the
> same time.
> 
> When checkers are freed, they usually no longer destroy the io
> context.
> Instead, they attempt to cancel any outstanding request. If that
> fails,
> they put the request on an orphan list, so that it can be freed by
> other
> checkers, once it has completed. IO contexts are only destroyed at
> three
> times, during reconfigure (to deal with the possibility that
> multipathd
> is holding more aio events than it needs to be, since there is a
> single
> limit for the whole system), when the checker class is unloaded, and
> in a corner case when checkers are freed. If an aio_group (which
> contains the aio context) is entirely full of orphaned requests, then
> no checker can use it. Since no checker is using it, there is no
> checker
> to clear out the orphaned requests. In this (likely rare) case, the
> last checker from that group to be freed and leave behind an orphaned
> request will call io_destroy() and remove the group.

I haven't made up my mind about this patch yet. 

The general idea is of course good, I never liked the fact that we
allocate one aio context per path. Also, technically, I haven't found
any issues with your patch.

Wrt the design, rather than having the checkers mutually catch there
completion events, have you considered simply creating a dedicated
thread for each aio group, which would just call io_getevents() in a
loop and update the checker status, which licbcheck_check() could then
fetch at the next call? That approach would avoid the "no checker
available for freeing orphans" complication. As an added bonus, this
thread could also handle checker timeouts, and perhaps even
io_destroy() and the cleanup. That way multipathd wouldn't have to wait
for io_destroy() at all - it could drop all refs to the aio context to
be deleted, detach the event thread, and let it handle the destruction
of the remaining data structures without worrying further. Some locking
and/or atomic operations would be required of course, but you remarked
already that that might be necessary sooner or later anyway.

Second, I have experimented with this a bit, and AFAICS io_cancel()
NEVER succeeds. Please don't add a level 3 logmessage there, as the
failure of this call is expected. I observed that recent kernels always
return -EINVAL from io_cancel(), and I'm going to report this on LKML
soon. But whatever the error code is, as a matter of fact aio requests
can't be cancelled with currentl Linux kernels. We have to wait for
their completion no matter what.

Finally, I wonder if the code could be simplified by just using a
single aio context, possibly with configurable size. If slots fill up,
we could either fall back to sync IO, or try to re-allocate the
context, returning PATH_PENDING from checkers in the meantime.

Maybe you have considered all this already and will explain why you
made the design decisions the way you did. Looking forward to your
comments.

Martin




> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/checkers/directio.c | 337 +++++++++++++++++++++++++--
> ----
>  1 file changed, 278 insertions(+), 59 deletions(-)
> 
> diff --git a/libmultipath/checkers/directio.c
> b/libmultipath/checkers/directio.c
> index 1b00b775..0863db2a 100644
> --- a/libmultipath/checkers/directio.c
> +++ b/libmultipath/checkers/directio.c
> @@ -17,6 +17,33 @@
>  
>  #include "checkers.h"
>  #include "../libmultipath/debug.h"
> +#include "../libmultipath/time-util.h"
> +
> +#define AIO_GROUP_SIZE 1024
> +
> +/* Note: This checker type relies on the fact that only one checker
> can be run
> + * at a time, since multiple checkers share the same aio_group, and
> must be
> + * able to modify other checker's async_reqs. If multple checkers
> become able
> + * to be run at the same time, this checker will need to add
> locking, and
> + * probably polling on event fds, to deal with that */
> +
> +struct aio_group {
> +	struct list_head node;
> +	int holders;
> +	io_context_t ioctx;
> +	struct list_head orphans;
> +};
> +
> +struct async_req {
> +	struct iocb io;
> +	int blksize;
> +	unsigned char *	buf;
> +	unsigned char * ptr;
> +	struct list_head node;
> +	int state; /* PATH_REMOVED means this is an orphan */
> +};
> +
> +static LIST_HEAD(aio_grp_list);
>  
>  enum {
>  	MSG_DIRECTIO_UNKNOWN = CHECKER_FIRST_MSGID,
> @@ -37,18 +64,134 @@ const char *libcheck_msgtable[] = {
>  struct directio_context {
>  	int		running;
>  	int		reset_flags;
> -	int		blksize;
> -	unsigned char *	buf;
> -	unsigned char * ptr;
> -	io_context_t	ioctx;
> -	struct iocb	io;
> +	struct aio_group *aio_grp;
> +	struct async_req *req;
>  };
>  
> +static struct aio_group *
> +add_aio_group(void)
> +{
> +	struct aio_group *aio_grp;
> +
> +	aio_grp = malloc(sizeof(struct aio_group));
> +	if (!aio_grp)
> +		return NULL;
> +	memset(aio_grp, 0, sizeof(struct aio_group));
> +	INIT_LIST_HEAD(&aio_grp->orphans);
> +
> +	if (io_setup(AIO_GROUP_SIZE, &aio_grp->ioctx) != 0) {
> +		LOG(1, "io_setup failed");
> +		free(aio_grp);
> +		return NULL;
> +	}
> +	list_add(&aio_grp->node, &aio_grp_list);
> +	return aio_grp;
> +}
> +
> +static int
> +set_aio_group(struct directio_context *ct)
> +{
> +	struct aio_group *aio_grp = NULL;
> +
> +	list_for_each_entry(aio_grp, &aio_grp_list, node)
> +		if (aio_grp->holders < AIO_GROUP_SIZE)
> +			goto found;
> +	aio_grp = add_aio_group();
> +	if (!aio_grp) {
> +		ct->aio_grp = NULL;
> +		return -1;
> +	}
> +found:
> +	aio_grp->holders++;
> +	ct->aio_grp = aio_grp;
> +	return 0;
> +}
> +
> +static void
> +remove_aio_group(struct aio_group *aio_grp)
> +{
> +	struct async_req *req, *tmp;
> +
> +	io_destroy(aio_grp->ioctx);
> +	list_for_each_entry_safe(req, tmp, &aio_grp->orphans, node) {
> +		list_del(&req->node);
> +		free(req->buf);
> +		free(req);
> +	}
> +	list_del(&aio_grp->node);
> +	free(aio_grp);
> +}
> +
> +/* If an aio_group is completely full of orphans, then no checkers
> can
> + * use it, which means that no checkers can clear out the orphans.
> To
> + * avoid keeping the useless group around, simply remove remove the
> + * group */
> +static void
> +check_orphaned_group(struct aio_group *aio_grp)
> +{
> +	int count = 0;
> +	struct list_head *item;
> +
> +	if (aio_grp->holders < AIO_GROUP_SIZE)
> +		return;
> +	list_for_each(item, &aio_grp->orphans)
> +		count++;
> +	if (count >= AIO_GROUP_SIZE) {
> +		remove_aio_group(aio_grp);
> +		if (list_empty(&aio_grp_list))
> +			add_aio_group();
> +	}
> +}
> +
> +int libcheck_load (void)
> +{
> +	if (add_aio_group() == NULL) {
> +		LOG(1, "libcheck_load failed: %s", strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +void libcheck_unload (void)
> +{
> +	struct aio_group *aio_grp, *tmp;
> +
> +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node)
> +		remove_aio_group(aio_grp);
> +}
> +
> +int libcheck_reset (void)
> +{
> +	struct aio_group *aio_grp, *tmp, *reset_grp = NULL;
> +
> +	/* If a clean existing aio_group exists, use that. Otherwise
> add a
> +	 * new one */
> +	list_for_each_entry(aio_grp, &aio_grp_list, node) {
> +		if (aio_grp->holders == 0 &&
> +		    list_empty(&aio_grp->orphans)) {
> +			reset_grp = aio_grp;
> +			break;
> +		}
> +	}
> +	if (!reset_grp)
> +		reset_grp = add_aio_group();
> +	if (!reset_grp) {
> +		LOG(1, "checker reset failed");
> +		return 1;
> +	}
> +
> +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node) {
> +		if (aio_grp != reset_grp)
> +			remove_aio_group(aio_grp);
> +	}
> +	return 0;
> +}
>  
>  int libcheck_init (struct checker * c)
>  {
>  	unsigned long pgsize = getpagesize();
>  	struct directio_context * ct;
> +	struct async_req *req = NULL;
>  	long flags;
>  
>  	ct = malloc(sizeof(struct directio_context));
> @@ -56,26 +199,31 @@ int libcheck_init (struct checker * c)
>  		return 1;
>  	memset(ct, 0, sizeof(struct directio_context));
>  
> -	if (io_setup(1, &ct->ioctx) != 0) {
> -		condlog(1, "io_setup failed");
> -		free(ct);
> -		return 1;
> +	if (set_aio_group(ct) < 0)
> +		goto out;
> +
> +	req = malloc(sizeof(struct async_req));
> +	if (!req) {
> +		goto out;
>  	}
> +	memset(req, 0, sizeof(struct async_req));
> +	INIT_LIST_HEAD(&req->node);
>  
> -	if (ioctl(c->fd, BLKBSZGET, &ct->blksize) < 0) {
> +	if (ioctl(c->fd, BLKBSZGET, &req->blksize) < 0) {
>  		c->msgid = MSG_DIRECTIO_BLOCKSIZE;
> -		ct->blksize = 512;
> +		req->blksize = 512;
>  	}
> -	if (ct->blksize > 4096) {
> +	if (req->blksize > 4096) {
>  		/*
>  		 * Sanity check for DASD; BSZGET is broken
>  		 */
> -		ct->blksize = 4096;
> +		req->blksize = 4096;
>  	}
> -	if (!ct->blksize)
> +	if (!req->blksize)
>  		goto out;
> -	ct->buf = (unsigned char *)malloc(ct->blksize + pgsize);
> -	if (!ct->buf)
> +
> +	req->buf = (unsigned char *)malloc(req->blksize + pgsize);
> +	if (!req->buf)
>  		goto out;
>  
>  	flags = fcntl(c->fd, F_GETFL);
> @@ -88,17 +236,22 @@ int libcheck_init (struct checker * c)
>  		ct->reset_flags = 1;
>  	}
>  
> -	ct->ptr = (unsigned char *) (((unsigned long)ct->buf + pgsize -
> 1) &
> +	req->ptr = (unsigned char *) (((unsigned long)req->buf + pgsize
> - 1) &
>  		  (~(pgsize - 1)));
>  
>  	/* Successfully initialized, return the context. */
> +	ct->req = req;
>  	c->context = (void *) ct;
>  	return 0;
>  
>  out:
> -	if (ct->buf)
> -		free(ct->buf);
> -	io_destroy(ct->ioctx);
> +	if (req) {
> +		if (req->buf)
> +			free(req->buf);
> +		free(req);
> +	}
> +	if (ct->aio_grp)
> +		ct->aio_grp->holders--;
>  	free(ct);
>  	return 1;
>  }
> @@ -106,6 +259,7 @@ out:
>  void libcheck_free (struct checker * c)
>  {
>  	struct directio_context * ct = (struct directio_context *)c-
> >context;
> +	struct io_event event;
>  	long flags;
>  
>  	if (!ct)
> @@ -121,20 +275,73 @@ void libcheck_free (struct checker * c)
>  		}
>  	}
>  
> -	if (ct->buf)
> -		free(ct->buf);
> -	io_destroy(ct->ioctx);
> +	if (ct->running &&
> +	    (ct->req->state != PATH_PENDING ||
> +	     io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event) == 0))
> +		ct->running = 0;
> +	if (!ct->running) {
> +		free(ct->req->buf);
> +		free(ct->req);
> +		ct->aio_grp->holders--;
> +	} else {
> +		LOG(3, "io_cancel error while freeing: %i", errno);
> +		ct->req->state = PATH_REMOVED;
> +		list_add(&ct->req->node, &ct->aio_grp->orphans);
> +		check_orphaned_group(ct->aio_grp);
> +	}
> +
>  	free(ct);
> +	c->context = NULL;
> +}
> +
> +static int
> +get_events(struct aio_group *aio_grp, struct timespec *timeout)
> +{
> +	struct io_event events[128];
> +	int i, nr, got_events = 0;
> +	struct timespec zero_timeout = {0};
> +	struct timespec *timep = (timeout)? timeout : &zero_timeout;
> +
> +	do {
> +		errno = 0;
> +		nr = io_getevents(aio_grp->ioctx, 1, 128, events,
> timep);
> +		got_events |= (nr > 0);
> +
> +		for (i = 0; i < nr; i++) {
> +			struct async_req *req =
> container_of(events[i].obj, struct async_req, io);
> +
> +			LOG(3, "io finished %lu/%lu", events[i].res,
> +			    events[i].res2);
> +
> +			/* got an orphaned request */
> +			if (req->state == PATH_REMOVED) {
> +				list_del(&req->node);
> +				free(req->buf);
> +				free(req);
> +				aio_grp->holders--;
> +			} else
> +				req->state = (events[i].res == req-
> >blksize) ?
> +					      PATH_UP : PATH_DOWN;
> +		}
> +		timep = &zero_timeout;
> +	} while (nr == 128); /* assume there are more events and try
> again */
> +
> +	if (nr < 0)
> +		LOG(3, "async io getevents returned %i (errno=%s)",
> +		    nr, strerror(errno));
> +
> +	return got_events;
>  }
>  
>  static int
>  check_state(int fd, struct directio_context *ct, int sync, int
> timeout_secs)
>  {
>  	struct timespec	timeout = { .tv_nsec = 5 };
> -	struct io_event event;
>  	struct stat	sb;
> -	int		rc = PATH_UNCHECKED;
> +	int		rc;
>  	long		r;
> +	struct timespec currtime, endtime;
> +	struct timespec *timep = &timeout;
>  
>  	if (fstat(fd, &sb) == 0) {
>  		LOG(4, "called for %x", (unsigned) sb.st_rdev);
> @@ -145,50 +352,62 @@ check_state(int fd, struct directio_context
> *ct, int sync, int timeout_secs)
>  		timeout.tv_nsec = 0;
>  	}
>  
> -	if (!ct->running) {
> -		struct iocb *ios[1] = { &ct->io };
> +	if (ct->running) {
> +		if (ct->req->state != PATH_PENDING) {
> +			ct->running = 0;
> +			return ct->req->state;
> +		}
> +	} else {
> +		struct iocb *ios[1] = { &ct->req->io };
>  
>  		LOG(3, "starting new request");
> -		memset(&ct->io, 0, sizeof(struct iocb));
> -		io_prep_pread(&ct->io, fd, ct->ptr, ct->blksize, 0);
> -		if (io_submit(ct->ioctx, 1, ios) != 1) {
> +		memset(&ct->req->io, 0, sizeof(struct iocb));
> +		io_prep_pread(&ct->req->io, fd, ct->req->ptr,
> +			      ct->req->blksize, 0);
> +		ct->req->state = PATH_PENDING;
> +		if (io_submit(ct->aio_grp->ioctx, 1, ios) != 1) {
>  			LOG(3, "io_submit error %i", errno);
>  			return PATH_UNCHECKED;
>  		}
>  	}
>  	ct->running++;
>  
> -	errno = 0;
> -	r = io_getevents(ct->ioctx, 1L, 1L, &event, &timeout);
> +	get_monotonic_time(&endtime);
> +	endtime.tv_sec += timeout.tv_sec;
> +	endtime.tv_nsec += timeout.tv_nsec;
> +	normalize_timespec(&endtime);
> +	while(1) {
> +		r = get_events(ct->aio_grp, timep);
>  
> -	if (r < 0 ) {
> -		LOG(3, "async io getevents returned %li (errno=%s)", r,
> -		    strerror(errno));
> -		ct->running = 0;
> -		rc = PATH_UNCHECKED;
> -	} else if (r < 1L) {
> -		if (ct->running > timeout_secs || sync) {
> -			struct iocb *ios[1] = { &ct->io };
> -
> -			LOG(3, "abort check on timeout");
> -			r = io_cancel(ct->ioctx, ios[0], &event);
> -			/*
> -			 * Only reset ct->running if we really
> -			 * could abort the pending I/O
> -			 */
> -			if (r)
> -				LOG(3, "io_cancel error %i", errno);
> -			else
> -				ct->running = 0;
> -			rc = PATH_DOWN;
> -		} else {
> -			LOG(3, "async io pending");
> -			rc = PATH_PENDING;
> -		}
> +		if (ct->req->state != PATH_PENDING) {
> +			ct->running = 0;
> +			return ct->req->state;
> +		} else if (r == 0 || !timep)
> +			break;
> +
> +		get_monotonic_time(&currtime);
> +		timespecsub(&endtime, &currtime, &timeout);
> +		if (timeout.tv_sec < 0)
> +			timep = NULL;
> +	}
> +	if (ct->running > timeout_secs || sync) {
> +		struct io_event event;
> +
> +		LOG(3, "abort check on timeout");
> +
> +		r = io_cancel(ct->aio_grp->ioctx, &ct->req->io,
> &event);
> +		/*
> +		 * Only reset ct->running if we really
> +		 * could abort the pending I/O
> +		 */
> +		if (r)
> +			LOG(3, "io_cancel error %i", errno);
> +		else
> +			ct->running = 0;
> +		rc = PATH_DOWN;
>  	} else {
> -		LOG(3, "io finished %lu/%lu", event.res, event.res2);
> -		ct->running = 0;
> -		rc = (event.res == ct->blksize) ? PATH_UP : PATH_DOWN;
> +		LOG(3, "async io pending");
> +		rc = PATH_PENDING;
>  	}
>  
>  	return rc;

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 15/15] tests: add directio unit tests
  2020-01-17  2:18 ` [PATCH 15/15] tests: add directio unit tests Benjamin Marzinski
@ 2020-01-20  8:27   ` Martin Wilck
  2020-01-20 15:25     ` Benjamin Marzinski
  0 siblings, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-20  8:27 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  tests/Makefile   |   3 +-
>  tests/directio.c | 704
> +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 706 insertions(+), 1 deletion(-)
>  create mode 100644 tests/directio.c
> 
> diff --git a/tests/Makefile b/tests/Makefile
> index a5cdf390..275fdd7d 100644
> +
> +/* test async checkers pending and getting resovled by another
> checker
> + * as well as the loops for getting multiple events */
> +static void test_check_state_async(void **state)
> +{
> +	int i;
> +	struct checker c[257] = {0};
> +	struct async_req *reqs[257];
> +	int res[257] = {0};
> +
> +	do_libcheck_load();
> +	for (i = 0; i < 257; i++)
> +		do_libcheck_init(&c[i], 4096, &reqs[i]);
> +	for (i = 0; i < 256; i++) {
> +		return_io_getevents_none();
> +		do_check_state(&c[i], 0, 30, PATH_PENDING);
> +		assert_true(is_checker_running(&c[i]));
> +	}
> +	return_io_getevents_nr(&full_timeout, 256, reqs, res);
> +	return_io_getevents_nr(NULL, 1, &reqs[256], &res[256]);
> +	do_check_state(&c[256], 0, 30, PATH_UP);
> +	assert_false(is_checker_running(&c[256]));
> +	libcheck_free(&c[i]);

Writing &c[256] here would be better readable. Apart from that, this is
looking good. Some of the tests are implementation-specific, but that's
fine if we stick with this implementation.

I wish we could have some tests with real aio for this, but that would
be difficult to set up right. Perhaps scsi_debug could be leveraged for
it.

Regards
Martin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-18 17:43   ` John Stoffel
@ 2020-01-20 15:07     ` Benjamin Marzinski
  2020-01-20 20:20       ` John Stoffel
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-20 15:07 UTC (permalink / raw)
  To: John Stoffel; +Cc: device-mapper development, Martin Wilck

On Sat, Jan 18, 2020 at 12:43:49PM -0500, John Stoffel wrote:
> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
> 
> Benjamin> The way that the checkers async mode worked in multipathd didn't make
> Benjamin> much sense. When multipathd starts up, all checker classes are in the
> Benjamin> sync mode, and any pathinfo() calls on devices would run the checker in
> Benjamin> sync mode. However, the First time a checker class was used in
> Benjamin> checkerloop, it would set that checker class to async (assuming
> Benjamin> force_sync wasn't set).  After that, no matter when a checker from that
> Benjamin> class was called, it would always run in async mode.  Multipathd doesn't
> Benjamin> need to run checkers in sync mode at all, so don't.
> 
> Sorry, I had a hard time parsing this description, especially the last
> sentence.  Do you mean that checkers should default to async then,
> instead of sync mode?  And from looking at the code, don't you mean
> that you force sync mode?  I guess the question is whether checker
> classes should default sync, or async.  And if they move to async,
> should they stay there?
>     

Sorry. I mean that right now multipathd runs the checkers from pathinfo(),
wait_for_pending_paths() and check_path(). When multipathd starts, all
checkers are in sync mode. The first time a checker is run from
check_path(), it is switched to async mode, and stays there for the rest
of the time multipathd is runing.

There is no need for multipathd to run checkers in sync mode at all, so
we shouldn't be doing so for these first checks.

-Ben

> 
> 
> Benjamin> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> Benjamin> ---
> Benjamin>  libmpathpersist/mpath_persist.c |  2 +-
> Benjamin>  libmultipath/discovery.c        | 10 ++++------
> Benjamin>  multipath/main.c                |  1 +
> Benjamin>  3 files changed, 6 insertions(+), 7 deletions(-)
> 
> Benjamin> diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
> Benjamin> index 603cfc3b..b2238f00 100644
> Benjamin> --- a/libmpathpersist/mpath_persist.c
> Benjamin> +++ b/libmpathpersist/mpath_persist.c
> Benjamin> @@ -47,7 +47,7 @@ mpath_lib_init (void)
> Benjamin>  		condlog(0, "Failed to initialize multipath config.");
> Benjamin>  		return NULL;
> Benjamin>  	}
> Benjamin> -
> Benjamin> +	conf->force_sync = 1;
> Benjamin>  	set_max_fds(conf->max_fds);
>  
> Benjamin>  	return conf;
> Benjamin> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> Benjamin> index d2773c3a..1ab093f4 100644
> Benjamin> --- a/libmultipath/discovery.c
> Benjamin> +++ b/libmultipath/discovery.c
> Benjamin> @@ -1643,12 +1643,10 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
> Benjamin>  	if (pp->mpp && !c->mpcontext)
> Benjamin>  		checker_mp_init(c, &pp->mpp->mpcontext);
> Benjamin>  	checker_clear_message(c);
> Benjamin> -	if (daemon) {
> Benjamin> -		if (conf->force_sync == 0)
> Benjamin> -			checker_set_async(c);
> Benjamin> -		else
> Benjamin> -			checker_set_sync(c);
> Benjamin> -	}
> Benjamin> +	if (conf->force_sync == 0)
> Benjamin> +		checker_set_async(c);
> Benjamin> +	else
> Benjamin> +		checker_set_sync(c);
> Benjamin>  	if (!conf->checker_timeout &&
> Benjamin>  	    sysfs_get_timeout(pp, &(c->timeout)) <= 0)
> c-> timeout = DEF_TIMEOUT;
> Benjamin> diff --git a/multipath/main.c b/multipath/main.c
> Benjamin> index 4f4d8e89..aebabd9b 100644
> Benjamin> --- a/multipath/main.c
> Benjamin> +++ b/multipath/main.c
> Benjamin> @@ -905,6 +905,7 @@ main (int argc, char *argv[])
> Benjamin>  		exit(RTVL_FAIL);
> Benjamin>  	multipath_conf = conf;
> conf-> retrigger_tries = 0;
> Benjamin> +	conf->force_sync = 1;
> Benjamin>  	while ((arg = getopt(argc, argv, ":adcChl::FfM:v:p:b:BrR:itTquUwW")) != EOF ) {
> Benjamin>  		switch(arg) {
> Benjamin>  		case 1: printf("optarg : %s\n",optarg);
> Benjamin> -- 
> Benjamin> 2.17.2
> 
> Benjamin> --
> Benjamin> dm-devel mailing list
> Benjamin> dm-devel@redhat.com
> Benjamin> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 15/15] tests: add directio unit tests
  2020-01-20  8:27   ` Martin Wilck
@ 2020-01-20 15:25     ` Benjamin Marzinski
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-20 15:25 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Mon, Jan 20, 2020 at 09:27:25AM +0100, Martin Wilck wrote:
> On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  tests/Makefile   |   3 +-
> >  tests/directio.c | 704
> > +++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 706 insertions(+), 1 deletion(-)
> >  create mode 100644 tests/directio.c
> > 
> > diff --git a/tests/Makefile b/tests/Makefile
> > index a5cdf390..275fdd7d 100644
> > +
> > +/* test async checkers pending and getting resovled by another
> > checker
> > + * as well as the loops for getting multiple events */
> > +static void test_check_state_async(void **state)
> > +{
> > +	int i;
> > +	struct checker c[257] = {0};
> > +	struct async_req *reqs[257];
> > +	int res[257] = {0};
> > +
> > +	do_libcheck_load();
> > +	for (i = 0; i < 257; i++)
> > +		do_libcheck_init(&c[i], 4096, &reqs[i]);
> > +	for (i = 0; i < 256; i++) {
> > +		return_io_getevents_none();
> > +		do_check_state(&c[i], 0, 30, PATH_PENDING);
> > +		assert_true(is_checker_running(&c[i]));
> > +	}
> > +	return_io_getevents_nr(&full_timeout, 256, reqs, res);
> > +	return_io_getevents_nr(NULL, 1, &reqs[256], &res[256]);
> > +	do_check_state(&c[256], 0, 30, PATH_UP);
> > +	assert_false(is_checker_running(&c[256]));
> > +	libcheck_free(&c[i]);
> 
> Writing &c[256] here would be better readable. Apart from that, this is
> looking good. Some of the tests are implementation-specific, but that's
> fine if we stick with this implementation.
> 
> I wish we could have some tests with real aio for this, but that would
> be difficult to set up right. Perhaps scsi_debug could be leveraged for
> it.

I initially wrote this using a config file in tests, where if you
uncommented a line, and set it to a device, the tests would run on that
device.  Otherwise, it would skip the directio tests entirely. It was
easier to get the failures that I wanted with completely faking the
device, so I switched to this. I reasoned that the parts where you are
talking to the device, and it is responding like it should are easy
enough to test by actually running multipath/multipathd. Perhaps I could
add that back, but only use it for some of the tests, or just use the
wrap code to be able to only use the underlying functions (which talk to
the device) some of the time, and skip them if I need a certain result.
I'll take a look. 

-Ben

> 
> Regards
> Martin
> 
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 14/15] libmultipath: make directio checker share io contexts
  2020-01-20  8:09   ` Martin Wilck
@ 2020-01-20 16:08     ` Benjamin Marzinski
  2020-01-21  8:55       ` Martin Wilck
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-20 16:08 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Mon, Jan 20, 2020 at 09:09:37AM +0100, Martin Wilck wrote:
> Hi Ben,
> 
> On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > On systems with a large number of cores (>500), io_destroy() can take
> > tens to hundreds of milliseconds to complete, due to RCU
> > synchronization. If there are a large number of paths using the
> > directio
> > checker on such a system, this can lead to multipath taking almost a
> > minute to complete, with the vast majority of time taken up by
> > io_destroy().
> > 
> > To solve this, the directio checker now allocates one aio context for
> > every 1024 checkers. This reduces the io_destroy() delay to a
> > thousandth
> > of its previous level.
> 
> >  However, this means that muliple checkers are
> > sharing the same aio context, and must be able to handle getting
> > results
> > for other checkers.  Because only one checker is ever running at a
> > time, this doesn't require any locking.  However, locking could be
> > added
> > in the future if necessary, to allow multiple checkers to run at the
> > same time.
> > 
> > When checkers are freed, they usually no longer destroy the io
> > context.
> > Instead, they attempt to cancel any outstanding request. If that
> > fails,
> > they put the request on an orphan list, so that it can be freed by
> > other
> > checkers, once it has completed. IO contexts are only destroyed at
> > three
> > times, during reconfigure (to deal with the possibility that
> > multipathd
> > is holding more aio events than it needs to be, since there is a
> > single
> > limit for the whole system), when the checker class is unloaded, and
> > in a corner case when checkers are freed. If an aio_group (which
> > contains the aio context) is entirely full of orphaned requests, then
> > no checker can use it. Since no checker is using it, there is no
> > checker
> > to clear out the orphaned requests. In this (likely rare) case, the
> > last checker from that group to be freed and leave behind an orphaned
> > request will call io_destroy() and remove the group.
> 
> I haven't made up my mind about this patch yet. 
> 
> The general idea is of course good, I never liked the fact that we
> allocate one aio context per path. Also, technically, I haven't found
> any issues with your patch.
> 
> Wrt the design, rather than having the checkers mutually catch there
> completion events, have you considered simply creating a dedicated
> thread for each aio group, which would just call io_getevents() in a
> loop and update the checker status, which licbcheck_check() could then
> fetch at the next call? That approach would avoid the "no checker
> available for freeing orphans" complication. As an added bonus, this
> thread could also handle checker timeouts, and perhaps even
> io_destroy() and the cleanup. That way multipathd wouldn't have to wait
> for io_destroy() at all - it could drop all refs to the aio context to
> be deleted, detach the event thread, and let it handle the destruction
> of the remaining data structures without worrying further. Some locking
> and/or atomic operations would be required of course, but you remarked
> already that that might be necessary sooner or later anyway.

I actually went back and forth a while before settling on not using
another thread.  In the end, all of the additional locking work, and
extra thread stuff seemed worse than the orphan handling. You will only
ever need the data when the checker runs, and it will always want to
wait for the checker timeout, so I decided against adding more threads.
But like I said, it wasn't a clear cut choice, and I didn't think about
full aio_group corner case until I was actually writing the code. But,
right now the only io_destroy issue that exists is in that corner case.
I believe that you still want to wait for the old IO contexts to be
destroyed before you create new ones in reconfigure. 
 
> Second, I have experimented with this a bit, and AFAICS io_cancel()
> NEVER succeeds. Please don't add a level 3 logmessage there, as the
> failure of this call is expected. I observed that recent kernels always
> return -EINVAL from io_cancel(), and I'm going to report this on LKML
> soon. But whatever the error code is, as a matter of fact aio requests
> can't be cancelled with currentl Linux kernels. We have to wait for
> their completion no matter what.

Sure, I can change that message

> Finally, I wonder if the code could be simplified by just using a
> single aio context, possibly with configurable size. If slots fill up,
> we could either fall back to sync IO, or try to re-allocate the
> context, returning PATH_PENDING from checkers in the meantime.

The issue is that there is a single limit for async io events that is
shared by everything using aio on a system, so we don't want to take
up many more events than we need to. And in order to grow a single
context, we would either need to allocate the new one while still
holding the old one, or wait until we can destroy the old on. If we
wait, then every checker is stalled until the last IO is waited for. If
we don't wait, then we add back a significant chunk of the complexity,
and we also end up temporarily taking up a much larger chunk of aio
events than we need.

-Ben

> Maybe you have considered all this already and will explain why you
> made the design decisions the way you did. Looking forward to your
> comments.
> 
> Martin
> 
> 
> 
> 
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/checkers/directio.c | 337 +++++++++++++++++++++++++--
> > ----
> >  1 file changed, 278 insertions(+), 59 deletions(-)
> > 
> > diff --git a/libmultipath/checkers/directio.c
> > b/libmultipath/checkers/directio.c
> > index 1b00b775..0863db2a 100644
> > --- a/libmultipath/checkers/directio.c
> > +++ b/libmultipath/checkers/directio.c
> > @@ -17,6 +17,33 @@
> >  
> >  #include "checkers.h"
> >  #include "../libmultipath/debug.h"
> > +#include "../libmultipath/time-util.h"
> > +
> > +#define AIO_GROUP_SIZE 1024
> > +
> > +/* Note: This checker type relies on the fact that only one checker
> > can be run
> > + * at a time, since multiple checkers share the same aio_group, and
> > must be
> > + * able to modify other checker's async_reqs. If multple checkers
> > become able
> > + * to be run at the same time, this checker will need to add
> > locking, and
> > + * probably polling on event fds, to deal with that */
> > +
> > +struct aio_group {
> > +	struct list_head node;
> > +	int holders;
> > +	io_context_t ioctx;
> > +	struct list_head orphans;
> > +};
> > +
> > +struct async_req {
> > +	struct iocb io;
> > +	int blksize;
> > +	unsigned char *	buf;
> > +	unsigned char * ptr;
> > +	struct list_head node;
> > +	int state; /* PATH_REMOVED means this is an orphan */
> > +};
> > +
> > +static LIST_HEAD(aio_grp_list);
> >  
> >  enum {
> >  	MSG_DIRECTIO_UNKNOWN = CHECKER_FIRST_MSGID,
> > @@ -37,18 +64,134 @@ const char *libcheck_msgtable[] = {
> >  struct directio_context {
> >  	int		running;
> >  	int		reset_flags;
> > -	int		blksize;
> > -	unsigned char *	buf;
> > -	unsigned char * ptr;
> > -	io_context_t	ioctx;
> > -	struct iocb	io;
> > +	struct aio_group *aio_grp;
> > +	struct async_req *req;
> >  };
> >  
> > +static struct aio_group *
> > +add_aio_group(void)
> > +{
> > +	struct aio_group *aio_grp;
> > +
> > +	aio_grp = malloc(sizeof(struct aio_group));
> > +	if (!aio_grp)
> > +		return NULL;
> > +	memset(aio_grp, 0, sizeof(struct aio_group));
> > +	INIT_LIST_HEAD(&aio_grp->orphans);
> > +
> > +	if (io_setup(AIO_GROUP_SIZE, &aio_grp->ioctx) != 0) {
> > +		LOG(1, "io_setup failed");
> > +		free(aio_grp);
> > +		return NULL;
> > +	}
> > +	list_add(&aio_grp->node, &aio_grp_list);
> > +	return aio_grp;
> > +}
> > +
> > +static int
> > +set_aio_group(struct directio_context *ct)
> > +{
> > +	struct aio_group *aio_grp = NULL;
> > +
> > +	list_for_each_entry(aio_grp, &aio_grp_list, node)
> > +		if (aio_grp->holders < AIO_GROUP_SIZE)
> > +			goto found;
> > +	aio_grp = add_aio_group();
> > +	if (!aio_grp) {
> > +		ct->aio_grp = NULL;
> > +		return -1;
> > +	}
> > +found:
> > +	aio_grp->holders++;
> > +	ct->aio_grp = aio_grp;
> > +	return 0;
> > +}
> > +
> > +static void
> > +remove_aio_group(struct aio_group *aio_grp)
> > +{
> > +	struct async_req *req, *tmp;
> > +
> > +	io_destroy(aio_grp->ioctx);
> > +	list_for_each_entry_safe(req, tmp, &aio_grp->orphans, node) {
> > +		list_del(&req->node);
> > +		free(req->buf);
> > +		free(req);
> > +	}
> > +	list_del(&aio_grp->node);
> > +	free(aio_grp);
> > +}
> > +
> > +/* If an aio_group is completely full of orphans, then no checkers
> > can
> > + * use it, which means that no checkers can clear out the orphans.
> > To
> > + * avoid keeping the useless group around, simply remove remove the
> > + * group */
> > +static void
> > +check_orphaned_group(struct aio_group *aio_grp)
> > +{
> > +	int count = 0;
> > +	struct list_head *item;
> > +
> > +	if (aio_grp->holders < AIO_GROUP_SIZE)
> > +		return;
> > +	list_for_each(item, &aio_grp->orphans)
> > +		count++;
> > +	if (count >= AIO_GROUP_SIZE) {
> > +		remove_aio_group(aio_grp);
> > +		if (list_empty(&aio_grp_list))
> > +			add_aio_group();
> > +	}
> > +}
> > +
> > +int libcheck_load (void)
> > +{
> > +	if (add_aio_group() == NULL) {
> > +		LOG(1, "libcheck_load failed: %s", strerror(errno));
> > +		return 1;
> > +	}
> > +	return 0;
> > +}
> > +
> > +void libcheck_unload (void)
> > +{
> > +	struct aio_group *aio_grp, *tmp;
> > +
> > +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node)
> > +		remove_aio_group(aio_grp);
> > +}
> > +
> > +int libcheck_reset (void)
> > +{
> > +	struct aio_group *aio_grp, *tmp, *reset_grp = NULL;
> > +
> > +	/* If a clean existing aio_group exists, use that. Otherwise
> > add a
> > +	 * new one */
> > +	list_for_each_entry(aio_grp, &aio_grp_list, node) {
> > +		if (aio_grp->holders == 0 &&
> > +		    list_empty(&aio_grp->orphans)) {
> > +			reset_grp = aio_grp;
> > +			break;
> > +		}
> > +	}
> > +	if (!reset_grp)
> > +		reset_grp = add_aio_group();
> > +	if (!reset_grp) {
> > +		LOG(1, "checker reset failed");
> > +		return 1;
> > +	}
> > +
> > +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node) {
> > +		if (aio_grp != reset_grp)
> > +			remove_aio_group(aio_grp);
> > +	}
> > +	return 0;
> > +}
> >  
> >  int libcheck_init (struct checker * c)
> >  {
> >  	unsigned long pgsize = getpagesize();
> >  	struct directio_context * ct;
> > +	struct async_req *req = NULL;
> >  	long flags;
> >  
> >  	ct = malloc(sizeof(struct directio_context));
> > @@ -56,26 +199,31 @@ int libcheck_init (struct checker * c)
> >  		return 1;
> >  	memset(ct, 0, sizeof(struct directio_context));
> >  
> > -	if (io_setup(1, &ct->ioctx) != 0) {
> > -		condlog(1, "io_setup failed");
> > -		free(ct);
> > -		return 1;
> > +	if (set_aio_group(ct) < 0)
> > +		goto out;
> > +
> > +	req = malloc(sizeof(struct async_req));
> > +	if (!req) {
> > +		goto out;
> >  	}
> > +	memset(req, 0, sizeof(struct async_req));
> > +	INIT_LIST_HEAD(&req->node);
> >  
> > -	if (ioctl(c->fd, BLKBSZGET, &ct->blksize) < 0) {
> > +	if (ioctl(c->fd, BLKBSZGET, &req->blksize) < 0) {
> >  		c->msgid = MSG_DIRECTIO_BLOCKSIZE;
> > -		ct->blksize = 512;
> > +		req->blksize = 512;
> >  	}
> > -	if (ct->blksize > 4096) {
> > +	if (req->blksize > 4096) {
> >  		/*
> >  		 * Sanity check for DASD; BSZGET is broken
> >  		 */
> > -		ct->blksize = 4096;
> > +		req->blksize = 4096;
> >  	}
> > -	if (!ct->blksize)
> > +	if (!req->blksize)
> >  		goto out;
> > -	ct->buf = (unsigned char *)malloc(ct->blksize + pgsize);
> > -	if (!ct->buf)
> > +
> > +	req->buf = (unsigned char *)malloc(req->blksize + pgsize);
> > +	if (!req->buf)
> >  		goto out;
> >  
> >  	flags = fcntl(c->fd, F_GETFL);
> > @@ -88,17 +236,22 @@ int libcheck_init (struct checker * c)
> >  		ct->reset_flags = 1;
> >  	}
> >  
> > -	ct->ptr = (unsigned char *) (((unsigned long)ct->buf + pgsize -
> > 1) &
> > +	req->ptr = (unsigned char *) (((unsigned long)req->buf + pgsize
> > - 1) &
> >  		  (~(pgsize - 1)));
> >  
> >  	/* Successfully initialized, return the context. */
> > +	ct->req = req;
> >  	c->context = (void *) ct;
> >  	return 0;
> >  
> >  out:
> > -	if (ct->buf)
> > -		free(ct->buf);
> > -	io_destroy(ct->ioctx);
> > +	if (req) {
> > +		if (req->buf)
> > +			free(req->buf);
> > +		free(req);
> > +	}
> > +	if (ct->aio_grp)
> > +		ct->aio_grp->holders--;
> >  	free(ct);
> >  	return 1;
> >  }
> > @@ -106,6 +259,7 @@ out:
> >  void libcheck_free (struct checker * c)
> >  {
> >  	struct directio_context * ct = (struct directio_context *)c-
> > >context;
> > +	struct io_event event;
> >  	long flags;
> >  
> >  	if (!ct)
> > @@ -121,20 +275,73 @@ void libcheck_free (struct checker * c)
> >  		}
> >  	}
> >  
> > -	if (ct->buf)
> > -		free(ct->buf);
> > -	io_destroy(ct->ioctx);
> > +	if (ct->running &&
> > +	    (ct->req->state != PATH_PENDING ||
> > +	     io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event) == 0))
> > +		ct->running = 0;
> > +	if (!ct->running) {
> > +		free(ct->req->buf);
> > +		free(ct->req);
> > +		ct->aio_grp->holders--;
> > +	} else {
> > +		LOG(3, "io_cancel error while freeing: %i", errno);
> > +		ct->req->state = PATH_REMOVED;
> > +		list_add(&ct->req->node, &ct->aio_grp->orphans);
> > +		check_orphaned_group(ct->aio_grp);
> > +	}
> > +
> >  	free(ct);
> > +	c->context = NULL;
> > +}
> > +
> > +static int
> > +get_events(struct aio_group *aio_grp, struct timespec *timeout)
> > +{
> > +	struct io_event events[128];
> > +	int i, nr, got_events = 0;
> > +	struct timespec zero_timeout = {0};
> > +	struct timespec *timep = (timeout)? timeout : &zero_timeout;
> > +
> > +	do {
> > +		errno = 0;
> > +		nr = io_getevents(aio_grp->ioctx, 1, 128, events,
> > timep);
> > +		got_events |= (nr > 0);
> > +
> > +		for (i = 0; i < nr; i++) {
> > +			struct async_req *req =
> > container_of(events[i].obj, struct async_req, io);
> > +
> > +			LOG(3, "io finished %lu/%lu", events[i].res,
> > +			    events[i].res2);
> > +
> > +			/* got an orphaned request */
> > +			if (req->state == PATH_REMOVED) {
> > +				list_del(&req->node);
> > +				free(req->buf);
> > +				free(req);
> > +				aio_grp->holders--;
> > +			} else
> > +				req->state = (events[i].res == req-
> > >blksize) ?
> > +					      PATH_UP : PATH_DOWN;
> > +		}
> > +		timep = &zero_timeout;
> > +	} while (nr == 128); /* assume there are more events and try
> > again */
> > +
> > +	if (nr < 0)
> > +		LOG(3, "async io getevents returned %i (errno=%s)",
> > +		    nr, strerror(errno));
> > +
> > +	return got_events;
> >  }
> >  
> >  static int
> >  check_state(int fd, struct directio_context *ct, int sync, int
> > timeout_secs)
> >  {
> >  	struct timespec	timeout = { .tv_nsec = 5 };
> > -	struct io_event event;
> >  	struct stat	sb;
> > -	int		rc = PATH_UNCHECKED;
> > +	int		rc;
> >  	long		r;
> > +	struct timespec currtime, endtime;
> > +	struct timespec *timep = &timeout;
> >  
> >  	if (fstat(fd, &sb) == 0) {
> >  		LOG(4, "called for %x", (unsigned) sb.st_rdev);
> > @@ -145,50 +352,62 @@ check_state(int fd, struct directio_context
> > *ct, int sync, int timeout_secs)
> >  		timeout.tv_nsec = 0;
> >  	}
> >  
> > -	if (!ct->running) {
> > -		struct iocb *ios[1] = { &ct->io };
> > +	if (ct->running) {
> > +		if (ct->req->state != PATH_PENDING) {
> > +			ct->running = 0;
> > +			return ct->req->state;
> > +		}
> > +	} else {
> > +		struct iocb *ios[1] = { &ct->req->io };
> >  
> >  		LOG(3, "starting new request");
> > -		memset(&ct->io, 0, sizeof(struct iocb));
> > -		io_prep_pread(&ct->io, fd, ct->ptr, ct->blksize, 0);
> > -		if (io_submit(ct->ioctx, 1, ios) != 1) {
> > +		memset(&ct->req->io, 0, sizeof(struct iocb));
> > +		io_prep_pread(&ct->req->io, fd, ct->req->ptr,
> > +			      ct->req->blksize, 0);
> > +		ct->req->state = PATH_PENDING;
> > +		if (io_submit(ct->aio_grp->ioctx, 1, ios) != 1) {
> >  			LOG(3, "io_submit error %i", errno);
> >  			return PATH_UNCHECKED;
> >  		}
> >  	}
> >  	ct->running++;
> >  
> > -	errno = 0;
> > -	r = io_getevents(ct->ioctx, 1L, 1L, &event, &timeout);
> > +	get_monotonic_time(&endtime);
> > +	endtime.tv_sec += timeout.tv_sec;
> > +	endtime.tv_nsec += timeout.tv_nsec;
> > +	normalize_timespec(&endtime);
> > +	while(1) {
> > +		r = get_events(ct->aio_grp, timep);
> >  
> > -	if (r < 0 ) {
> > -		LOG(3, "async io getevents returned %li (errno=%s)", r,
> > -		    strerror(errno));
> > -		ct->running = 0;
> > -		rc = PATH_UNCHECKED;
> > -	} else if (r < 1L) {
> > -		if (ct->running > timeout_secs || sync) {
> > -			struct iocb *ios[1] = { &ct->io };
> > -
> > -			LOG(3, "abort check on timeout");
> > -			r = io_cancel(ct->ioctx, ios[0], &event);
> > -			/*
> > -			 * Only reset ct->running if we really
> > -			 * could abort the pending I/O
> > -			 */
> > -			if (r)
> > -				LOG(3, "io_cancel error %i", errno);
> > -			else
> > -				ct->running = 0;
> > -			rc = PATH_DOWN;
> > -		} else {
> > -			LOG(3, "async io pending");
> > -			rc = PATH_PENDING;
> > -		}
> > +		if (ct->req->state != PATH_PENDING) {
> > +			ct->running = 0;
> > +			return ct->req->state;
> > +		} else if (r == 0 || !timep)
> > +			break;
> > +
> > +		get_monotonic_time(&currtime);
> > +		timespecsub(&endtime, &currtime, &timeout);
> > +		if (timeout.tv_sec < 0)
> > +			timep = NULL;
> > +	}
> > +	if (ct->running > timeout_secs || sync) {
> > +		struct io_event event;
> > +
> > +		LOG(3, "abort check on timeout");
> > +
> > +		r = io_cancel(ct->aio_grp->ioctx, &ct->req->io,
> > &event);
> > +		/*
> > +		 * Only reset ct->running if we really
> > +		 * could abort the pending I/O
> > +		 */
> > +		if (r)
> > +			LOG(3, "io_cancel error %i", errno);
> > +		else
> > +			ct->running = 0;
> > +		rc = PATH_DOWN;
> >  	} else {
> > -		LOG(3, "io finished %lu/%lu", event.res, event.res2);
> > -		ct->running = 0;
> > -		rc = (event.res == ct->blksize) ? PATH_UP : PATH_DOWN;
> > +		LOG(3, "async io pending");
> > +		rc = PATH_PENDING;
> >  	}
> >  
> >  	return rc;
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 11/15] libmultipath: change failed path prio timeout
  2020-01-17 17:18   ` Martin Wilck
@ 2020-01-20 16:22     ` Benjamin Marzinski
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-20 16:22 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Fri, Jan 17, 2020 at 06:18:46PM +0100, Martin Wilck wrote:
> On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> This changes the unit of the first get_prio_timeout() argument from
> seconds to milliseconds. While that's a good thing (it was questionable
> design to have the same function take several "timeout" arguments in
> different units), we should rename the argument there to avoid
> confusion (checker_timeout's unit is seconds all around).

Sure.

-Ben

> 
> Apart from that, ACK.
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 07/15] libmultipath: fix sgio_get_vpd looping
  2020-01-17 16:27   ` Martin Wilck
@ 2020-01-20 16:59     ` Benjamin Marzinski
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-20 16:59 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Fri, Jan 17, 2020 at 05:27:00PM +0100, Martin Wilck wrote:
> On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> 
> > diff --git a/tests/vpd.c b/tests/vpd.c
> > index d9f80eaa..4dbce010 100644
> > --- a/tests/vpd.c
> > +++ b/tests/vpd.c
> > @@ -306,7 +306,7 @@ static int create_vpd83(unsigned char *buf,
> > size_t bufsiz, const char *id,
> >  	default:
> >  		break;
> >  	}
> > -	put_unaligned_be16(n, buf + 2);
> > +	put_unaligned_be16(bufsiz, buf + 2);
> >  	return n + 4;
> >  }
> 
> I can see that you are trying to create a VPD with a certain given
> length. But this way you intentionally create a VPD that doesn't
> conform to the spec (offset 2 should contain the real length of the
> designator list, not some arbitrary value). This is dangerous, in the
> future someone may copy this code thinking that it creates a valid
> VPD. At least you should add a big fat comment. Better even, you
> should leave out this hunk and override the length value in the
> actual test (make_test_vpd_eui) if (sml == 1) (and also add a comment).

Sure.

-Ben

> 
> Regards
> Martin
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-20 15:07     ` Benjamin Marzinski
@ 2020-01-20 20:20       ` John Stoffel
  2020-01-21 12:49         ` Benjamin Marzinski
  0 siblings, 1 reply; 49+ messages in thread
From: John Stoffel @ 2020-01-20 20:20 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: device-mapper development, John Stoffel, Martin Wilck

>>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:

Benjamin> On Sat, Jan 18, 2020 at 12:43:49PM -0500, John Stoffel wrote:
>> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
>> 
Benjamin> The way that the checkers async mode worked in multipathd didn't make
Benjamin> much sense. When multipathd starts up, all checker classes are in the
Benjamin> sync mode, and any pathinfo() calls on devices would run the checker in
Benjamin> sync mode. However, the First time a checker class was used in
Benjamin> checkerloop, it would set that checker class to async (assuming
Benjamin> force_sync wasn't set).  After that, no matter when a checker from that
Benjamin> class was called, it would always run in async mode.  Multipathd doesn't
Benjamin> need to run checkers in sync mode at all, so don't.
>> 
>> Sorry, I had a hard time parsing this description, especially the last
>> sentence.  Do you mean that checkers should default to async then,
>> instead of sync mode?  And from looking at the code, don't you mean
>> that you force sync mode?  I guess the question is whether checker
>> classes should default sync, or async.  And if they move to async,
>> should they stay there?
>> 

Benjamin> Sorry. I mean that right now multipathd runs the checkers from pathinfo(),
Benjamin> wait_for_pending_paths() and check_path(). When multipathd starts, all
Benjamin> checkers are in sync mode. The first time a checker is run from
Benjamin> check_path(), it is switched to async mode, and stays there for the rest
Benjamin> of the time multipathd is runing.

Benjamin> There is no need for multipathd to run checkers in sync mode at all, so
Benjamin> we shouldn't be doing so for these first checks.

Thanks, that makes the entire issue much more clear.  This raises the
question of why there is a sync mode at all then?  In any case, the
above makes the issue more understandable.

John

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 14/15] libmultipath: make directio checker share io contexts
  2020-01-20 16:08     ` Benjamin Marzinski
@ 2020-01-21  8:55       ` Martin Wilck
  0 siblings, 0 replies; 49+ messages in thread
From: Martin Wilck @ 2020-01-21  8:55 UTC (permalink / raw)
  To: Benjamin Marzinski, Martin Wilck; +Cc: device-mapper development

Hi Ben,

thanks for your clarifying remarks.

On Mon, 2020-01-20 at 10:08 -0600, Benjamin Marzinski wrote:
> On Mon, Jan 20, 2020 at 09:09:37AM +0100, Martin Wilck wrote:
> > Hi Ben,
> > 
> > On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > > On systems with a large number of cores (>500), io_destroy() can
> > > take
> > > tens to hundreds of milliseconds to complete, due to RCU
> > > synchronization. If there are a large number of paths using the
> > > directio
> > > checker on such a system, this can lead to multipath taking
> > > almost a
> > > minute to complete, with the vast majority of time taken up by
> > > io_destroy().
> > > 
> > > To solve this, the directio checker now allocates one aio context
> > > for
> > > every 1024 checkers. This reduces the io_destroy() delay to a
> > > thousandth
> > > of its previous level.
> > >  However, this means that muliple checkers are
> > > sharing the same aio context, and must be able to handle getting
> > > results
> > > for other checkers.  Because only one checker is ever running at
> > > a
> > > time, this doesn't require any locking.  However, locking could
> > > be
> > > added
> > > in the future if necessary, to allow multiple checkers to run at
> > > the
> > > same time.
> > > 
> > > When checkers are freed, they usually no longer destroy the io
> > > context.
> > > Instead, they attempt to cancel any outstanding request. If that
> > > fails,
> > > they put the request on an orphan list, so that it can be freed
> > > by
> > > other
> > > checkers, once it has completed. IO contexts are only destroyed
> > > at
> > > three
> > > times, during reconfigure (to deal with the possibility that
> > > multipathd
> > > is holding more aio events than it needs to be, since there is a
> > > single
> > > limit for the whole system), when the checker class is unloaded,
> > > and
> > > in a corner case when checkers are freed. If an aio_group (which
> > > contains the aio context) is entirely full of orphaned requests,
> > > then
> > > no checker can use it. Since no checker is using it, there is no
> > > checker
> > > to clear out the orphaned requests. In this (likely rare) case,
> > > the
> > > last checker from that group to be freed and leave behind an
> > > orphaned
> > > request will call io_destroy() and remove the group.
> > 
> > I haven't made up my mind about this patch yet. 
> > 
> > The general idea is of course good, I never liked the fact that we
> > allocate one aio context per path. Also, technically, I haven't
> > found
> > any issues with your patch.
> > 
> > Wrt the design, rather than having the checkers mutually catch
> > there
> > completion events, have you considered simply creating a dedicated
> > thread for each aio group, which would just call io_getevents() in
> > a
> > loop and update the checker status, which licbcheck_check() could
> > then
> > fetch at the next call? That approach would avoid the "no checker
> > available for freeing orphans" complication. As an added bonus,
> > this
> > thread could also handle checker timeouts, and perhaps even
> > io_destroy() and the cleanup. That way multipathd wouldn't have to
> > wait
> > for io_destroy() at all - it could drop all refs to the aio context
> > to
> > be deleted, detach the event thread, and let it handle the
> > destruction
> > of the remaining data structures without worrying further. Some
> > locking
> > and/or atomic operations would be required of course, but you
> > remarked
> > already that that might be necessary sooner or later anyway.
> 
> I actually went back and forth a while before settling on not using
> another thread.  In the end, all of the additional locking work, and
> extra thread stuff seemed worse than the orphan handling. You will
> only
> ever need the data when the checker runs, and it will always want to
> wait for the checker timeout, so I decided against adding more
> threads.
> But like I said, it wasn't a clear cut choice, and I didn't think
> about
> full aio_group corner case until I was actually writing the code. 

I guess it's a matter of taste. I thought the locking wouldn't be all-
too-complicated (the IO thread could be made the only entity actually
writing to the status bytes), and OTOH I found the additonal level of
complexity introduced by the aio_groups a bit confusing at first. 
But I'm definitely not in a position to reject your patch on these weak
grounds.

> But,
> right now the only io_destroy issue that exists is in that corner
> case.
> I believe that you still want to wait for the old IO contexts to be
> destroyed before you create new ones in reconfigure. 

Well, the kernel doesn't wait until all resources are released before
freeing the AIO slots (see 
https://elixir.bootlin.com/linux/v5.5-rc7/source/fs/aio.c#L836). But
that's undocumented behavior, so waiting for io_destroy() to return is
certainly safer.

> > Second, I have experimented with this a bit, and AFAICS io_cancel()
> > NEVER succeeds. Please don't add a level 3 logmessage there, as the
> > failure of this call is expected. I observed that recent kernels
> > always
> > return -EINVAL from io_cancel(), and I'm going to report this on
> > LKML
> > soon. But whatever the error code is, as a matter of fact aio
> > requests
> > can't be cancelled with currentl Linux kernels. We have to wait for
> > their completion no matter what.
> 
> Sure, I can change that message

In principle, you could almost skip the entire check, as the kernel
won't return 0 anyway. (See also my linux-aio patch, 
https://marc.info/?l=linux-aio&m=157953649003136&w=2).


> > Finally, I wonder if the code could be simplified by just using a
> > single aio context, possibly with configurable size. If slots fill
> > up,
> > we could either fall back to sync IO, or try to re-allocate the
> > context, returning PATH_PENDING from checkers in the meantime.
> 
> The issue is that there is a single limit for async io events that is
> shared by everything using aio on a system, so we don't want to take
> up many more events than we need to. And in order to grow a single
> context, we would either need to allocate the new one while still
> holding the old one, or wait until we can destroy the old on. If we
> wait, then every checker is stalled until the last IO is waited for.
> If
> we don't wait, then we add back a significant chunk of the
> complexity,
> and we also end up temporarily taking up a much larger chunk of aio
> events than we need.

I don't think that would be a big issue, in particular as it would be
temporary. We don't need to bother too much about depleting system
resources. On servers which run aio-intensive workloads, the limit will
be increased anyway. On other systems, the default of 64k is plenty for
multipath, except in corner cases.

What bothers me more is that we, ourselves, have no fallback to sync IO
in the case that io_setup() fails. If that happens and directio is
configured, multipathd will effectively run without path checker.
That's not an issue of your patch, I was just considering whether it
should be fixed while we're at it. And we should add a more meaningful
error message if io_setup() returns -EAGAIN, hinting at an increase of
for fs.aio-max-nr sysctl, and add a similar hint in the man page.

Bottom line: I suppose you'll submit a v2 of your series anyway to
address the minor points I raised, and I believe this part will be fine
then, too.

Thanks,
Martin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-20 20:20       ` John Stoffel
@ 2020-01-21 12:49         ` Benjamin Marzinski
  2020-01-21 20:13           ` John Stoffel
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-21 12:49 UTC (permalink / raw)
  To: John Stoffel; +Cc: device-mapper development, Martin Wilck

On Mon, Jan 20, 2020 at 03:20:14PM -0500, John Stoffel wrote:
> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
> 
> Benjamin> On Sat, Jan 18, 2020 at 12:43:49PM -0500, John Stoffel wrote:
> >> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
> >> 
> Benjamin> The way that the checkers async mode worked in multipathd didn't make
> Benjamin> much sense. When multipathd starts up, all checker classes are in the
> Benjamin> sync mode, and any pathinfo() calls on devices would run the checker in
> Benjamin> sync mode. However, the First time a checker class was used in
> Benjamin> checkerloop, it would set that checker class to async (assuming
> Benjamin> force_sync wasn't set).  After that, no matter when a checker from that
> Benjamin> class was called, it would always run in async mode.  Multipathd doesn't
> Benjamin> need to run checkers in sync mode at all, so don't.
> >> 
> >> Sorry, I had a hard time parsing this description, especially the last
> >> sentence.  Do you mean that checkers should default to async then,
> >> instead of sync mode?  And from looking at the code, don't you mean
> >> that you force sync mode?  I guess the question is whether checker
> >> classes should default sync, or async.  And if they move to async,
> >> should they stay there?
> >> 
> 
> Benjamin> Sorry. I mean that right now multipathd runs the checkers from pathinfo(),
> Benjamin> wait_for_pending_paths() and check_path(). When multipathd starts, all
> Benjamin> checkers are in sync mode. The first time a checker is run from
> Benjamin> check_path(), it is switched to async mode, and stays there for the rest
> Benjamin> of the time multipathd is runing.
> 
> Benjamin> There is no need for multipathd to run checkers in sync mode at all, so
> Benjamin> we shouldn't be doing so for these first checks.
> 
> Thanks, that makes the entire issue much more clear.  This raises the
> question of why there is a sync mode at all then?  In any case, the
> above makes the issue more understandable.

The multipath command, which uses the same checkers, runs in sync mode.

-Ben

> 
> John

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 10/15] libmultipath: change how the checker async is set
  2020-01-21 12:49         ` Benjamin Marzinski
@ 2020-01-21 20:13           ` John Stoffel
  0 siblings, 0 replies; 49+ messages in thread
From: John Stoffel @ 2020-01-21 20:13 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: device-mapper development, John Stoffel, Martin Wilck

>>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:

Benjamin> On Mon, Jan 20, 2020 at 03:20:14PM -0500, John Stoffel wrote:
>> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
>> 
Benjamin> On Sat, Jan 18, 2020 at 12:43:49PM -0500, John Stoffel wrote:
>> >> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
>> >> 
Benjamin> The way that the checkers async mode worked in multipathd didn't make
Benjamin> much sense. When multipathd starts up, all checker classes are in the
Benjamin> sync mode, and any pathinfo() calls on devices would run the checker in
Benjamin> sync mode. However, the First time a checker class was used in
Benjamin> checkerloop, it would set that checker class to async (assuming
Benjamin> force_sync wasn't set).  After that, no matter when a checker from that
Benjamin> class was called, it would always run in async mode.  Multipathd doesn't
Benjamin> need to run checkers in sync mode at all, so don't.
>> >> 
>> >> Sorry, I had a hard time parsing this description, especially the last
>> >> sentence.  Do you mean that checkers should default to async then,
>> >> instead of sync mode?  And from looking at the code, don't you mean
>> >> that you force sync mode?  I guess the question is whether checker
>> >> classes should default sync, or async.  And if they move to async,
>> >> should they stay there?
>> >> 
>> 
Benjamin> Sorry. I mean that right now multipathd runs the checkers from pathinfo(),
Benjamin> wait_for_pending_paths() and check_path(). When multipathd starts, all
Benjamin> checkers are in sync mode. The first time a checker is run from
Benjamin> check_path(), it is switched to async mode, and stays there for the rest
Benjamin> of the time multipathd is runing.
>> 
Benjamin> There is no need for multipathd to run checkers in sync mode at all, so
Benjamin> we shouldn't be doing so for these first checks.
>> 
>> Thanks, that makes the entire issue much more clear.  This raises the
>> question of why there is a sync mode at all then?  In any case, the
>> above makes the issue more understandable.

Benjamin> The multipath command, which uses the same checkers, runs in sync mode.

Crystal clean now.  Thanks for your patience!

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-17 16:56   ` Martin Wilck
@ 2020-01-22  8:51     ` Benjamin Marzinski
  2020-01-22 11:52       ` Martin Wilck
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-22  8:51 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Fri, Jan 17, 2020 at 05:56:40PM +0100, Martin Wilck wrote:
> On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > This adds the wildcard 'g' for multipath and path formatted printing,
> > which returns extra data from a device's vendor specific vpd
> > page.  The
> > specific vendor vpd page to use, and the vendor/product id to decode
> > it
> > can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It
> > can
> > be configured in the devices section of multipath.conf with the
> > vpd_vendor parameter. Currently, the only devices that use this are
> > HPE
> > 3PAR arrays, to return the Volume Name.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/config.c      |  4 ++++
> >  libmultipath/config.h      |  2 ++
> >  libmultipath/dict.c        | 34 ++++++++++++++++++++++++++++++++++
> >  libmultipath/discovery.c   | 34 +++++++++++++++++++++++++++++++++-
> >  libmultipath/hwtable.c     |  2 ++
> >  libmultipath/print.c       | 27 +++++++++++++++++++++++++++
> >  libmultipath/propsel.c     | 24 ++++++++++++++++++++++++
> >  libmultipath/propsel.h     |  2 ++
> >  libmultipath/structs.h     |  9 +++++++++
> >  multipath/multipath.conf.5 |  8 ++++++++
> >  10 files changed, 145 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > index 85626e96..72b8d37c 100644
> > --- a/libmultipath/config.c
> > +++ b/libmultipath/config.c
> > @@ -369,6 +369,8 @@ merge_hwe (struct hwentry * dst, struct hwentry *
> > src)
> >  	merge_num(max_sectors_kb);
> >  	merge_num(ghost_delay);
> >  	merge_num(all_tg_pt);
> > +	merge_num(vpd_vendor_pg);
> > +	merge_num(vpd_vendor_id);
> >  	merge_num(san_path_err_threshold);
> >  	merge_num(san_path_err_forget_rate);
> >  	merge_num(san_path_err_recovery_time);
> > @@ -517,6 +519,8 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
> >  	hwe->detect_prio = dhwe->detect_prio;
> >  	hwe->detect_checker = dhwe->detect_checker;
> >  	hwe->ghost_delay = dhwe->ghost_delay;
> > +	hwe->vpd_vendor_pg = dhwe->vpd_vendor_pg;
> > +	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
> >  
> >  	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe-
> > >bl_product)))
> >  		goto out;
> > diff --git a/libmultipath/config.h b/libmultipath/config.h
> > index e69aa07c..589146de 100644
> > --- a/libmultipath/config.h
> > +++ b/libmultipath/config.h
> > @@ -87,6 +87,8 @@ struct hwentry {
> >  	int max_sectors_kb;
> >  	int ghost_delay;
> >  	int all_tg_pt;
> > +	int vpd_vendor_pg;
> > +	int vpd_vendor_id;
> >  	char * bl_product;
> >  };
> >  
> > diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> > index 2b046e1d..d6d8b79b 100644
> > --- a/libmultipath/dict.c
> > +++ b/libmultipath/dict.c
> > @@ -1366,6 +1366,39 @@ def_uxsock_timeout_handler(struct config
> > *conf, vector strvec)
> >  	return 0;
> >  }
> >  
> > +static int
> > +hw_vpd_vendor_handler(struct config *conf, vector strvec)
> > +{
> > +	int rc = 0;
> > +	char *buff;
> > +
> > +	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
> > +	if (!hwe)
> > +		return 1;
> > +
> > +	buff = set_value(strvec);
> > +	if (!buff)
> > +		return 1;
> > +	if (strcmp(buff, "hp3par") == 0) {
> > +		hwe->vpd_vendor_pg = 0xc0;
> > +		hwe->vpd_vendor_id = VPD_VP_HP3PAR;
> > +	} else
> > +		rc = 1;
> > +	FREE(buff);
> > +	return rc;
> > +}
> > +
> > +static int
> > +snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
> > +		      const void * data)
> > +{
> > +	const struct hwentry * hwe = (const struct hwentry *)data;
> > +
> > +	if (hwe->vpd_vendor_pg == 0xc0 && hwe->vpd_vendor_id ==
> > VPD_VP_HP3PAR)
> > +		return snprintf(buff, len, "hp3par");
> > +	return 0;
> > +}
> 
> I don't understand this design. The way you set up the hwe, it would be
> logical to add "vpd_vendor_page" and "vpd_vendor_id" properties for
> device entries.

The vpd page abbreviations specify both the page to read, and the vendor
id to use to decode it, and they are more user readable. It seemed like
a much more foolproof way to have people specify this.
 
> BUT ok, that's overengineered with just one supported combination of
> page and vendor. I can understand that. But then, it seems also
> overengineered to carry around vpd_vendor_pg and vpd_vendor_id in the
> hwe. 
> 
> I'd suggest creating a hard-coded table with "vendor vpd schemes",
> 
> struct {
> 	int pg;
> 	int vendor_id;
> 	const char *name;
> } vendor_vpd_schemes[]  = { 
>     { 0xc0, VPD_VP_HP3PAR, "hp3par", }, 
>     { 0, }, 
> };
> 
> and in the hwentry, only carry around the index into this table,
> and look up the page and vendor ID there. Actually, with just that
> single user, we might as well not introduce a new device property at
> all, but simply use a separate hardcoded table with lookup values for
> vendor and product ID. I'm unsure if it's wise to make this
> configurable via multipath.conf.

Sure. I can make the hwe change. The reason why I added this to
multipath.conf is that I'm not sure what devices actually support these
vpd information pages.  So, if we miss one, or a new device comes out
that isn't in the built-in config, the user can still get this info.

> Regards,
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE
> Software Solutions Germany GmbH 
> HRB 36809 (AG Nürnberg) GF: Felix
> Imendörffer
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-18 17:34   ` John Stoffel
@ 2020-01-22  9:46     ` Benjamin Marzinski
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-22  9:46 UTC (permalink / raw)
  To: John Stoffel; +Cc: device-mapper development, Martin Wilck

On Sat, Jan 18, 2020 at 12:34:38PM -0500, John Stoffel wrote:
> >>>>> "Benjamin" == Benjamin Marzinski <bmarzins@redhat.com> writes:
> 
> Benjamin> This adds the wildcard 'g' for multipath and path formatted
> Benjamin> printing, which returns extra data from a device's vendor
> Benjamin> specific vpd page.  The specific vendor vpd page to use, and
> Benjamin> the vendor/product id to decode it can be set in the hwentry
> Benjamin> with vpd_vendor_pg and vpd_vendor_id. It can be configured
> Benjamin> in the devices section of multipath.conf with the vpd_vendor
> Benjamin> parameter. Currently, the only devices that use this are HPE
> Benjamin> 3PAR arrays, to return the Volume Name.
> 
> So I've looked over this patch and I see how you've hard-coded the
> 3par entry into the various .c files, but the multipathd.conf example
> really doesn't give a good example of how to use it, nor does the code
> give a good example, or maybe better yet a template set of functions
> for parsing custom VPD info pages.  This is really just a nit.  

Right now, the only thing that someone can do is add

vpd_vendor hp3par

to multipath.conf, and AFAIK, right now there is no reason why anyone
would do that.  I belive that all the devices that use this page already
have it in their built-in config.  When there are more options that can
be used, the man page can be revisited.

About the code, the issue is that there really isn't any standard that
I'm aware of for what's in the vendor specific pages, so I don't really
even know what those would look like.  I had to read the sg_vpd code and
ask HPE specifically for details of their format, to write this.  I
assume something like that will be necessary for each new one of these.

> Also, is the Vendor VPD page always at 0xco, or is that just a 3par
> setting?  I wonder if that constant should be pulled into a #define
> somewhere, so that it's simpler to update/change as needed?
> Especially if other vendors use different pages.

Vendors definitely use different pages. See the Vendor specific VPD
pages section of the "sg_vpd -e" output. I am planning on changing how
these are defined to match Martin Wilck's suggestion.

-Ben

> John
> 
> 
> 
> Benjamin> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> Benjamin> ---
> Benjamin>  libmultipath/config.c      |  4 ++++
> Benjamin>  libmultipath/config.h      |  2 ++
> Benjamin>  libmultipath/dict.c        | 34 ++++++++++++++++++++++++++++++++++
> Benjamin>  libmultipath/discovery.c   | 34 +++++++++++++++++++++++++++++++++-
> Benjamin>  libmultipath/hwtable.c     |  2 ++
> Benjamin>  libmultipath/print.c       | 27 +++++++++++++++++++++++++++
> Benjamin>  libmultipath/propsel.c     | 24 ++++++++++++++++++++++++
> Benjamin>  libmultipath/propsel.h     |  2 ++
> Benjamin>  libmultipath/structs.h     |  9 +++++++++
> Benjamin>  multipath/multipath.conf.5 |  8 ++++++++
> Benjamin>  10 files changed, 145 insertions(+), 1 deletion(-)
> 
> Benjamin> diff --git a/libmultipath/config.c b/libmultipath/config.c
> Benjamin> index 85626e96..72b8d37c 100644
> Benjamin> --- a/libmultipath/config.c
> Benjamin> +++ b/libmultipath/config.c
> Benjamin> @@ -369,6 +369,8 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
> Benjamin>  	merge_num(max_sectors_kb);
> Benjamin>  	merge_num(ghost_delay);
> Benjamin>  	merge_num(all_tg_pt);
> Benjamin> +	merge_num(vpd_vendor_pg);
> Benjamin> +	merge_num(vpd_vendor_id);
> Benjamin>  	merge_num(san_path_err_threshold);
> Benjamin>  	merge_num(san_path_err_forget_rate);
> Benjamin>  	merge_num(san_path_err_recovery_time);
> Benjamin> @@ -517,6 +519,8 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
> hwe-> detect_prio = dhwe->detect_prio;
> hwe-> detect_checker = dhwe->detect_checker;
> hwe-> ghost_delay = dhwe->ghost_delay;
> Benjamin> +	hwe->vpd_vendor_pg = dhwe->vpd_vendor_pg;
> Benjamin> +	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
>  
> Benjamin>  	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe->bl_product)))
> Benjamin>  		goto out;
> Benjamin> diff --git a/libmultipath/config.h b/libmultipath/config.h
> Benjamin> index e69aa07c..589146de 100644
> Benjamin> --- a/libmultipath/config.h
> Benjamin> +++ b/libmultipath/config.h
> Benjamin> @@ -87,6 +87,8 @@ struct hwentry {
> Benjamin>  	int max_sectors_kb;
> Benjamin>  	int ghost_delay;
> Benjamin>  	int all_tg_pt;
> Benjamin> +	int vpd_vendor_pg;
> Benjamin> +	int vpd_vendor_id;
> Benjamin>  	char * bl_product;
> Benjamin>  };
>  
> Benjamin> diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> Benjamin> index 2b046e1d..d6d8b79b 100644
> Benjamin> --- a/libmultipath/dict.c
> Benjamin> +++ b/libmultipath/dict.c
> Benjamin> @@ -1366,6 +1366,39 @@ def_uxsock_timeout_handler(struct config *conf, vector strvec)
> Benjamin>  	return 0;
> Benjamin>  }
>  
> Benjamin> +static int
> Benjamin> +hw_vpd_vendor_handler(struct config *conf, vector strvec)
> Benjamin> +{
> Benjamin> +	int rc = 0;
> Benjamin> +	char *buff;
> Benjamin> +
> Benjamin> +	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
> Benjamin> +	if (!hwe)
> Benjamin> +		return 1;
> Benjamin> +
> Benjamin> +	buff = set_value(strvec);
> Benjamin> +	if (!buff)
> Benjamin> +		return 1;
> Benjamin> +	if (strcmp(buff, "hp3par") == 0) {
> Benjamin> +		hwe->vpd_vendor_pg = 0xc0;
> Benjamin> +		hwe->vpd_vendor_id = VPD_VP_HP3PAR;
> Benjamin> +	} else
> Benjamin> +		rc = 1;
> Benjamin> +	FREE(buff);
> Benjamin> +	return rc;
> Benjamin> +}
> Benjamin> +
> Benjamin> +static int
> Benjamin> +snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
> Benjamin> +		      const void * data)
> Benjamin> +{
> Benjamin> +	const struct hwentry * hwe = (const struct hwentry *)data;
> Benjamin> +
> Benjamin> +	if (hwe->vpd_vendor_pg == 0xc0 && hwe->vpd_vendor_id == VPD_VP_HP3PAR)
> Benjamin> +		return snprintf(buff, len, "hp3par");
> Benjamin> +	return 0;
> Benjamin> +}
> Benjamin> +
> Benjamin>  /*
> Benjamin>   * blacklist block handlers
> Benjamin>   */
> Benjamin> @@ -1806,6 +1839,7 @@ init_keywords(vector keywords)
> Benjamin>  	install_keyword("max_sectors_kb", &hw_max_sectors_kb_handler, &snprint_hw_max_sectors_kb);
> Benjamin>  	install_keyword("ghost_delay", &hw_ghost_delay_handler, &snprint_hw_ghost_delay);
> Benjamin>  	install_keyword("all_tg_pt", &hw_all_tg_pt_handler, &snprint_hw_all_tg_pt);
> Benjamin> +	install_keyword("vpd_vendor", &hw_vpd_vendor_handler, &snprint_hw_vpd_vendor);
> Benjamin>  	install_sublevel_end();
>  
> Benjamin>  	install_keyword_root("overrides", &overrides_handler);
> Benjamin> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> Benjamin> index 1d79cbae..d2773c3a 100644
> Benjamin> --- a/libmultipath/discovery.c
> Benjamin> +++ b/libmultipath/discovery.c
> Benjamin> @@ -1103,6 +1103,30 @@ parse_vpd_pg83(const unsigned char *in, size_t in_len,
> Benjamin>  	return len;
> Benjamin>  }
>  
> Benjamin> +static int
> Benjamin> +parse_vpd_c0_hp3par(const unsigned char *in, size_t in_len,
> Benjamin> +		    char *out, size_t out_len)
> Benjamin> +{
> Benjamin> +	size_t len;
> Benjamin> +
> Benjamin> +	memset(out, 0x0, out_len);
> Benjamin> +	if (in_len <= 4 || (in[4] > 3 && in_len < 44)) {
> Benjamin> +		condlog(3, "HP/3PAR vendor specific VPD page length too short: %lu", in_len);
> Benjamin> +		return -EINVAL;
> Benjamin> +	}
> Benjamin> +	if (in[4] <= 3) /* revision must be > 3 to have Vomlume Name */
> Benjamin> +		return -ENODATA;
> Benjamin> +	len = get_unaligned_be32(&in[40]);
> Benjamin> +	if (len > out_len || len + 44 > in_len) {
> Benjamin> +		condlog(3, "HP/3PAR vendor specific Volume name too long: %lu",
> Benjamin> +			len);
> Benjamin> +		return -EINVAL;
> Benjamin> +	}
> Benjamin> +	memcpy(out, &in[44], len);
> Benjamin> +	out[out_len - 1] = '\0';
> Benjamin> +	return len;
> Benjamin> +}
> Benjamin> +
> Benjamin>  static int
> Benjamin>  get_vpd_sysfs (struct udev_device *parent, int pg, char * str, int maxlen)
> Benjamin>  {
> Benjamin> @@ -1170,7 +1194,9 @@ get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen)
> Benjamin>  			len = (buff_len <= maxlen)? buff_len : maxlen;
> Benjamin>  			memcpy (str, buff, len);
> Benjamin>  		}
> Benjamin> -	} else
> Benjamin> +	} else if (pg == 0xc0 && vend_id == VPD_VP_HP3PAR)
> Benjamin> +		len = parse_vpd_c0_hp3par(buff, buff_len, str, maxlen);
> Benjamin> +	else
> Benjamin>  		len = -ENOSYS;
>  
> Benjamin>  	return len;
> Benjamin> @@ -1544,6 +1570,12 @@ scsi_ioctl_pathinfo (struct path * pp, struct config *conf, int mask)
> Benjamin>  	if (!(mask & DI_SERIAL))
> Benjamin>  		return;
>  
> Benjamin> +	select_vpd_vendor_pg(conf, pp);
> Benjamin> +	select_vpd_vendor_id(conf, pp);
> Benjamin> +
> Benjamin> +	if (pp->vpd_vendor_pg != 0 && get_vpd_sgio(pp->fd, pp->vpd_vendor_pg, pp->vpd_vendor_id, pp->vpd_data, sizeof(pp->vpd_data)) < 0)
> Benjamin> +		condlog(3, "%s: failed to get extra vpd data", pp->dev);
> Benjamin> +
> Benjamin>  	parent = pp->udev;
> Benjamin>  	while (parent) {
> Benjamin>  		const char *subsys = udev_device_get_subsystem(parent);
> Benjamin> diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
> Benjamin> index 16627ec5..1f27450c 100644
> Benjamin> --- a/libmultipath/hwtable.c
> Benjamin> +++ b/libmultipath/hwtable.c
> Benjamin> @@ -117,6 +117,8 @@ static struct hwentry default_hw[] = {
> Benjamin>  		.no_path_retry = 18,
> Benjamin>  		.fast_io_fail  = 10,
> Benjamin>  		.dev_loss      = MAX_DEV_LOSS_TMO,
> Benjamin> +		.vpd_vendor_pg = 0xc0,
> Benjamin> +		.vpd_vendor_id = VPD_VP_HP3PAR,
> Benjamin>  	},
> Benjamin>  	{
> Benjamin>  		/* RA8000 / ESA12000 */
> Benjamin> diff --git a/libmultipath/print.c b/libmultipath/print.c
> Benjamin> index 907469ad..0aafe3cb 100644
> Benjamin> --- a/libmultipath/print.c
> Benjamin> +++ b/libmultipath/print.c
> Benjamin> @@ -358,6 +358,23 @@ snprint_action (char * buff, size_t len, const struct multipath * mpp)
> Benjamin>  	}
> Benjamin>  }
>  
> Benjamin> +static int
> Benjamin> +snprint_multipath_vpd_data(char * buff, size_t len,
> Benjamin> +			   const struct multipath * mpp)
> Benjamin> +{
> Benjamin> +	struct pathgroup * pgp;
> Benjamin> +	struct path * pp;
> Benjamin> +	int i, j;
> Benjamin> +
> Benjamin> +	vector_foreach_slot(mpp->pg, pgp, i) {
> Benjamin> +		vector_foreach_slot(pgp->paths, pp, j) {
> Benjamin> +			if (strlen(pp->vpd_data))
> Benjamin> +				return snprintf(buff, len, "%s", pp->vpd_data);
> Benjamin> +		}
> Benjamin> +	}
> Benjamin> +	return 0;
> Benjamin> +}
> Benjamin> +
> Benjamin>  /*
> Benjamin>   * path info printing functions
> Benjamin>   */
> Benjamin> @@ -688,6 +705,14 @@ snprint_path_marginal(char * buff, size_t len, const struct path * pp)
> Benjamin>  	return snprintf(buff, len, "normal");
> Benjamin>  }
>  
> Benjamin> +static int
> Benjamin> +snprint_path_vpd_data(char * buff, size_t len, const struct path * pp)
> Benjamin> +{
> Benjamin> +	if (strlen(pp->vpd_data) > 0)
> Benjamin> +		return snprintf(buff, len, "%s", pp->vpd_data);
> Benjamin> +	return 0;
> Benjamin> +}
> Benjamin> +
> Benjamin>  struct multipath_data mpd[] = {
> Benjamin>  	{'n', "name",          0, snprint_name},
> Benjamin>  	{'w', "uuid",          0, snprint_multipath_uuid},
> Benjamin> @@ -712,6 +737,7 @@ struct multipath_data mpd[] = {
> Benjamin>  	{'p', "prod",          0, snprint_multipath_prod},
> Benjamin>  	{'e', "rev",           0, snprint_multipath_rev},
> Benjamin>  	{'G', "foreign",       0, snprint_multipath_foreign},
> Benjamin> +	{'g', "vpd page data", 0, snprint_multipath_vpd_data},
> Benjamin>  	{0, NULL, 0 , NULL}
> Benjamin>  };
>  
> Benjamin> @@ -737,6 +763,7 @@ struct path_data pd[] = {
> Benjamin>  	{'r', "target WWPN",   0, snprint_tgt_wwpn},
> Benjamin>  	{'a', "host adapter",  0, snprint_host_adapter},
> Benjamin>  	{'G', "foreign",       0, snprint_path_foreign},
> Benjamin> +	{'g', "vpd page data", 0, snprint_path_vpd_data},
> Benjamin>  	{'0', "failures",      0, snprint_path_failures},
> Benjamin>  	{'P', "protocol",      0, snprint_path_protocol},
> Benjamin>  	{0, NULL, 0 , NULL}
> Benjamin> diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> Benjamin> index b5b5b89f..3c99f2d4 100644
> Benjamin> --- a/libmultipath/propsel.c
> Benjamin> +++ b/libmultipath/propsel.c
> Benjamin> @@ -1203,3 +1203,27 @@ out:
> Benjamin>  		origin);
> Benjamin>  	return 0;
> Benjamin>  }
> Benjamin> +
> Benjamin> +int select_vpd_vendor_pg (struct config *conf, struct path *pp)
> Benjamin> +{
> Benjamin> +	const char *origin;
> Benjamin> +
> Benjamin> +	pp_set_hwe(vpd_vendor_pg);
> Benjamin> +	pp_set_default(vpd_vendor_pg, 0);
> Benjamin> +out:
> Benjamin> +	condlog(3, "%s: vpd_vendor_pg = 0x%x %s", pp->dev, pp->vpd_vendor_pg,
> Benjamin> +		origin);
> Benjamin> +	return 0;
> Benjamin> +}
> Benjamin> +
> Benjamin> +int select_vpd_vendor_id (struct config *conf, struct path *pp)
> Benjamin> +{
> Benjamin> +	const char *origin;
> Benjamin> +
> Benjamin> +	pp_set_hwe(vpd_vendor_id);
> Benjamin> +	pp_set_default(vpd_vendor_id, 0);
> Benjamin> +out:
> Benjamin> +	condlog(3, "%s: vpd_vendor_id = 0x%x %s", pp->dev, pp->vpd_vendor_id,
> Benjamin> +		origin);
> Benjamin> +	return 0;
> Benjamin> +}
> Benjamin> diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
> Benjamin> index ddfd6262..3f6d319a 100644
> Benjamin> --- a/libmultipath/propsel.h
> Benjamin> +++ b/libmultipath/propsel.h
> Benjamin> @@ -37,3 +37,5 @@ void reconcile_features_with_options(const char *id, char **features,
> Benjamin>  				     int* no_path_retry,
> Benjamin>  				     int *retain_hwhandler);
> Benjamin>  int select_all_tg_pt (struct config *conf, struct multipath * mp);
> Benjamin> +int select_vpd_vendor_pg (struct config *conf, struct path *pp);
> Benjamin> +int select_vpd_vendor_id (struct config *conf, struct path *pp);
> Benjamin> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> Benjamin> index 1c32a799..1ad5f64a 100644
> Benjamin> --- a/libmultipath/structs.h
> Benjamin> +++ b/libmultipath/structs.h
> Benjamin> @@ -21,6 +21,7 @@
> Benjamin>  #define HOST_NAME_LEN		16
> Benjamin>  #define SLOT_NAME_SIZE		40
> Benjamin>  #define PRKEY_SIZE		19
> Benjamin> +#define VPD_DATA_SIZE		128
>  
> Benjamin>  #define SCSI_VENDOR_SIZE	9
> Benjamin>  #define SCSI_PRODUCT_SIZE	17
> Benjamin> @@ -243,6 +244,11 @@ struct hd_geometry {
> Benjamin>  };
> Benjamin>  #endif
>  
> Benjamin> +/*
> Benjamin> + * from sg_vpd_vendor.c
> Benjamin> + */
> Benjamin> +#define VPD_VP_HP3PAR 4
> Benjamin> +
> Benjamin>  struct path {
> Benjamin>  	char dev[FILE_NAME_SIZE];
> Benjamin>  	char dev_t[BLK_DEV_SIZE];
> Benjamin> @@ -255,6 +261,7 @@ struct path {
> Benjamin>  	char rev[PATH_REV_SIZE];
> Benjamin>  	char serial[SERIAL_SIZE];
> Benjamin>  	char tgt_node_name[NODE_NAME_SIZE];
> Benjamin> +	char vpd_data[VPD_DATA_SIZE];
> Benjamin>  	unsigned long long size;
> Benjamin>  	unsigned int checkint;
> Benjamin>  	unsigned int tick;
> Benjamin> @@ -287,6 +294,8 @@ struct path {
> Benjamin>  	int io_err_pathfail_starttime;
> Benjamin>  	int find_multipaths_timeout;
> Benjamin>  	int marginal;
> Benjamin> +	int vpd_vendor_pg;
> Benjamin> +	int vpd_vendor_id;
> Benjamin>  	/* configlet pointers */
> Benjamin>  	vector hwe;
> Benjamin>  	struct gen_path generic_path;
> Benjamin> diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
> Benjamin> index e866da23..dc103fd8 100644
> Benjamin> --- a/multipath/multipath.conf.5
> Benjamin> +++ b/multipath/multipath.conf.5
> Benjamin> @@ -1472,6 +1472,14 @@ the \fIproduct\fR attribute set to the value of \fIproduct_blacklist\fR.
> Benjamin>  The user_friendly_names prefix to use for this
> Benjamin>  device type, instead of the default "mpath".
> Benjamin>  .TP
> Benjamin> +.B vpd_vendor
> Benjamin> +The vendor specific vpd page information, using the vpd page abbreviation.
> Benjamin> +The vpd page abbreviation can be found by running \fIsg_vpd -e\fR. multipathd
> Benjamin> +will use this information to gather device specific information that can be
> Benjamin> +displayed with the \fI%g\fR wilcard for the \fImultipathd show maps format\fR
> Benjamin> +and \fImultipathd show paths format\fR commands. Currently only the
> Benjamin> +\fBhp3par\fR vpd page is supported.
> Benjamin> +.TP
> Benjamin>  .B hardware_handler
> Benjamin>  The hardware handler to use for this device type.
> Benjamin>  The following hardware handler are implemented:
> Benjamin> -- 
> Benjamin> 2.17.2
> 
> Benjamin> --
> Benjamin> dm-devel mailing list
> Benjamin> dm-devel@redhat.com
> Benjamin> https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-22  8:51     ` Benjamin Marzinski
@ 2020-01-22 11:52       ` Martin Wilck
  2020-01-22 12:20         ` Benjamin Marzinski
  0 siblings, 1 reply; 49+ messages in thread
From: Martin Wilck @ 2020-01-22 11:52 UTC (permalink / raw)
  To: dm-devel, Benjamin Marzinski

On Wed, 2020-01-22 at 02:51 -0600, Benjamin Marzinski wrote:
> On Fri, Jan 17, 2020 at 05:56:40PM +0100, Martin Wilck wrote:
> > On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > > This adds the wildcard 'g' for multipath and path formatted
> > > printing,
> > > which returns extra data from a device's vendor specific vpd
> > > page.  The
> > > specific vendor vpd page to use, and the vendor/product id to
> > > decode
> > > it
> > > can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id.
> > > It
> > > can
> > > be configured in the devices section of multipath.conf with the
> > > vpd_vendor parameter. Currently, the only devices that use this
> > > are
> > > HPE
> > > 3PAR arrays, to return the Volume Name.
> > > 
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > >  libmultipath/config.c      |  4 ++++
> > >  libmultipath/config.h      |  2 ++
> > >  libmultipath/dict.c        | 34
> > > ++++++++++++++++++++++++++++++++++
> > >  libmultipath/discovery.c   | 34
> > > +++++++++++++++++++++++++++++++++-
> > >  libmultipath/hwtable.c     |  2 ++
> > >  libmultipath/print.c       | 27 +++++++++++++++++++++++++++
> > >  libmultipath/propsel.c     | 24 ++++++++++++++++++++++++
> > >  libmultipath/propsel.h     |  2 ++
> > >  libmultipath/structs.h     |  9 +++++++++
> > >  multipath/multipath.conf.5 |  8 ++++++++
> > >  10 files changed, 145 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > > index 85626e96..72b8d37c 100644
> > > --- a/libmultipath/config.c
> > > +++ b/libmultipath/config.c
> > > @@ -369,6 +369,8 @@ merge_hwe (struct hwentry * dst, struct
> > > hwentry *
> > > src)
> > >  	merge_num(max_sectors_kb);
> > >  	merge_num(ghost_delay);
> > >  	merge_num(all_tg_pt);
> > > +	merge_num(vpd_vendor_pg);
> > > +	merge_num(vpd_vendor_id);
> > >  	merge_num(san_path_err_threshold);
> > >  	merge_num(san_path_err_forget_rate);
> > >  	merge_num(san_path_err_recovery_time);
> > > @@ -517,6 +519,8 @@ store_hwe (vector hwtable, struct hwentry *
> > > dhwe)
> > >  	hwe->detect_prio = dhwe->detect_prio;
> > >  	hwe->detect_checker = dhwe->detect_checker;
> > >  	hwe->ghost_delay = dhwe->ghost_delay;
> > > +	hwe->vpd_vendor_pg = dhwe->vpd_vendor_pg;
> > > +	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
> > >  
> > >  	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe-
> > > > bl_product)))
> > >  		goto out;
> > > diff --git a/libmultipath/config.h b/libmultipath/config.h
> > > index e69aa07c..589146de 100644
> > > --- a/libmultipath/config.h
> > > +++ b/libmultipath/config.h
> > > @@ -87,6 +87,8 @@ struct hwentry {
> > >  	int max_sectors_kb;
> > >  	int ghost_delay;
> > >  	int all_tg_pt;
> > > +	int vpd_vendor_pg;
> > > +	int vpd_vendor_id;
> > >  	char * bl_product;
> > >  };
> > >  
> > > diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> > > index 2b046e1d..d6d8b79b 100644
> > > --- a/libmultipath/dict.c
> > > +++ b/libmultipath/dict.c
> > > @@ -1366,6 +1366,39 @@ def_uxsock_timeout_handler(struct config
> > > *conf, vector strvec)
> > >  	return 0;
> > >  }
> > >  
> > > +static int
> > > +hw_vpd_vendor_handler(struct config *conf, vector strvec)
> > > +{
> > > +	int rc = 0;
> > > +	char *buff;
> > > +
> > > +	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
> > > +	if (!hwe)
> > > +		return 1;
> > > +
> > > +	buff = set_value(strvec);
> > > +	if (!buff)
> > > +		return 1;
> > > +	if (strcmp(buff, "hp3par") == 0) {
> > > +		hwe->vpd_vendor_pg = 0xc0;
> > > +		hwe->vpd_vendor_id = VPD_VP_HP3PAR;
> > > +	} else
> > > +		rc = 1;
> > > +	FREE(buff);
> > > +	return rc;
> > > +}
> > > +
> > > +static int
> > > +snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
> > > +		      const void * data)
> > > +{
> > > +	const struct hwentry * hwe = (const struct hwentry *)data;
> > > +
> > > +	if (hwe->vpd_vendor_pg == 0xc0 && hwe->vpd_vendor_id ==
> > > VPD_VP_HP3PAR)
> > > +		return snprintf(buff, len, "hp3par");
> > > +	return 0;
> > > +}
> > 
> > I don't understand this design. The way you set up the hwe, it
> > would be
> > logical to add "vpd_vendor_page" and "vpd_vendor_id" properties for
> > device entries.
> 
> The vpd page abbreviations specify both the page to read, and the
> vendor
> id to use to decode it, and they are more user readable. It seemed
> like
> a much more foolproof way to have people specify this.
>  
> > BUT ok, that's overengineered with just one supported combination
> > of
> > page and vendor. I can understand that. But then, it seems also
> > overengineered to carry around vpd_vendor_pg and vpd_vendor_id in
> > the
> > hwe. 
> > 
> > I'd suggest creating a hard-coded table with "vendor vpd schemes",
> > 
> > struct {
> > 	int pg;
> > 	int vendor_id;
> > 	const char *name;
> > } vendor_vpd_schemes[]  = { 
> >     { 0xc0, VPD_VP_HP3PAR, "hp3par", }, 
> >     { 0, }, 
> > };
> > 
> > and in the hwentry, only carry around the index into this table,
> > and look up the page and vendor ID there. Actually, with just that
> > single user, we might as well not introduce a new device property
> > at
> > all, but simply use a separate hardcoded table with lookup values
> > for
> > vendor and product ID. I'm unsure if it's wise to make this
> > configurable via multipath.conf.
> 
> Sure. I can make the hwe change. The reason why I added this to
> multipath.conf is that I'm not sure what devices actually support
> these
> vpd information pages.  So, if we miss one, or a new device comes out
> that isn't in the built-in config, the user can still get this info.
> 

My main issue was that you have 2 fields in struct hwentry, and only
one field in the hwtable. I think I made the point that I'd prefer to
have only one everywhere, and use some sort of lookup table for the
page and vendor ID parameters.

Another issue that I realized just now, the "vendor_id" is not
standardized - it's just an internal identifier used by sg3_utils and
has no specific meaning outside the sg3_utils code base. Even it's
numeric value has changed between sg3_utils versions. I think we should
cast it away, and simply stick with the symbolic identifier. Keeping
that symbolic name in line with sg3_utils makes a lot of sense, of
course.

Regards,
Martin

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 01/15] multipathd: warn when configuration has been changed.
  2020-01-17 15:45   ` Martin Wilck
@ 2020-01-22 11:59     ` Benjamin Marzinski
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-22 11:59 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Fri, Jan 17, 2020 at 03:45:34PM +0000, Martin Wilck wrote:
> On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > It would be helpful if multipathd could log a message when
> > multipath.conf or files in the config_dir have been written to, both
> > so
> > that it can be used to send a notification to users, and to help with
> > determining after the fact if multipathd was running with an older
> > config, when the logs of multipathd's behaviour don't match with the
> > current multipath.conf.
> > 
> > To do this, the multipathd uxlsnr thread now sets up inotify watches
> > on
> > both /etc/multipath.conf and the config_dir to watch if the files are
> > deleted or closed after being opened for writing.  In order to keep
> > uxlsnr from polling repeatedly if the multipath.conf or the
> > config_dir
> > aren't present, it will only set up the watches once per reconfigure.
> > However, since multipath.conf is far more likely to be replaced by a
> > text editor than modified in place, if it gets removed, multipathd
> > will
> > immediately try to restart the watch on it (which will succeed if the
> > file was simply replaced by a new copy).  This does mean that if
> > multipath.conf or the config_dir are actually removed and then later
> > re-added, multipathd won't log any more messages for changes until
> > the
> > next reconfigure. But that seems like a fair trade-off to avoid
> > repeatedly polling for files that aren't likely to appear.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/config.h |   1 +
> >  multipathd/main.c     |   1 +
> >  multipathd/uxlsnr.c   | 134
> > ++++++++++++++++++++++++++++++++++++++++--
> >  3 files changed, 130 insertions(+), 6 deletions(-)
> 
> I know I reviewed this already, but this time I have some more remarks,
> mostly style.

Sure. I can do all these.

-Ben

> 
> > 
> > diff --git a/libmultipath/config.h b/libmultipath/config.h
> > index ffec3103..e69aa07c 100644
> > --- a/libmultipath/config.h
> > +++ b/libmultipath/config.h
> > @@ -188,6 +188,7 @@ struct config {
> >  	int find_multipaths_timeout;
> >  	int marginal_pathgroups;
> >  	unsigned int version[3];
> > +	unsigned int sequence_nr;
> >  
> >  	char * multipath_dir;
> >  	char * selector;
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 34a57689..7b364cfe 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -2618,6 +2618,7 @@ reconfigure (struct vectors * vecs)
> >  	uxsock_timeout = conf->uxsock_timeout;
> >  
> >  	old = rcu_dereference(multipath_conf);
> > +	conf->sequence_nr = old->sequence_nr + 1;
> >  	rcu_assign_pointer(multipath_conf, conf);
> >  	call_rcu(&old->rcu, rcu_free_config);
> >  
> > diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> > index bc71679e..92d9a79a 100644
> > --- a/multipathd/uxlsnr.c
> > +++ b/multipathd/uxlsnr.c
> > @@ -23,6 +23,7 @@
> >  #include <sys/time.h>
> >  #include <signal.h>
> >  #include <stdbool.h>
> > +#include <sys/inotify.h>
> >  #include "checkers.h"
> >  #include "memory.h"
> >  #include "debug.h"
> > @@ -51,6 +52,8 @@ struct client {
> >  LIST_HEAD(clients);
> >  pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
> >  struct pollfd *polls;
> > +int notify_fd = -1;
> > +char *config_dir;
> 
> Nit: please declare these as static, as they are used only in this
> file. Also, naming the variable differently, e.g. conf_dir, would
> decrease the number of false hits when grepping for the variable name.
> 
> >  
> >  static bool _socket_client_is_root(int fd);
> >  
> > @@ -151,6 +154,8 @@ void uxsock_cleanup(void *arg)
> >  	long ux_sock = (long)arg;
> >  
> >  	close(ux_sock);
> > +	close(notify_fd);
> > +	free(config_dir);
> >  
> >  	pthread_mutex_lock(&client_lock);
> >  	list_for_each_entry_safe(client_loop, client_tmp, &clients,
> > node) {
> > @@ -162,6 +167,106 @@ void uxsock_cleanup(void *arg)
> >  	free_polls();
> >  }
> >  
> > +/* failing to set the watch descriptor is o.k. we just miss a
> > warning
> > + * message */
> > +void reset_watch(int notify_fd, int *wds, unsigned int *sequence_nr)
> 
> This function could also be static.
> 
> > +{
> > +	struct config *conf;
> > +	int dir_reset = 0;
> > +	int conf_reset = 0;
> > +
> > +	if (notify_fd == -1)
> > +		return;
> > +
> > +	conf = get_multipath_config();
> > +	/* instead of repeatedly try to reset the inotify watch if
> > +	 * the config directory or multipath.conf isn't there, just
> > +	 * do it once per reconfigure */
> > +	if (*sequence_nr != conf->sequence_nr) {
> > +		*sequence_nr = conf->sequence_nr;
> > +		if (wds[0] == -1)
> > +			conf_reset = 1;
> > +		if (!config_dir || !conf->config_dir ||
> > +		    strcmp(config_dir, conf->config_dir)) {
> > +			dir_reset = 1;
> > +			if (config_dir)
> > +				free(config_dir);
> > +			if (conf->config_dir)
> > +				config_dir = strdup(conf->config_dir);
> > +			else
> > +				config_dir = NULL;
> > +		} else if (wds[1] == -1)
> > +			dir_reset = 1;
> > +	}
> > +	put_multipath_config(conf);
> > +
> > +	if (dir_reset) {
> > +		if (wds[1] != -1) {
> > +			inotify_rm_watch(notify_fd, wds[1]);
> > +			wds[1] = -1;
> > +		}
> > +		if (config_dir) {
> > +			wds[1] = inotify_add_watch(notify_fd,
> > config_dir,
> > +						   IN_CLOSE_WRITE |
> > IN_DELETE |
> > +						   IN_ONLYDIR);
> > +			if (wds[1] == -1)
> > +				condlog(3, "didn't set up notifications
> > on %s: %s", config_dir, strerror(errno));
> 
> Another nitpick: IMO we should be using "%m" for this in new code.
> 
> > +		}
> > +	}
> > +	if (conf_reset) {
> > +		wds[0] = inotify_add_watch(notify_fd,
> > DEFAULT_CONFIGFILE,
> > +					   IN_CLOSE_WRITE);
> > +		if (wds[0] == -1)
> > +			condlog(3, "didn't set up notifications on
> > /etc/multipath.conf: %s", strerror(errno));
> > +	}
> > +	return;
> > +}
> > +
> > +void handle_inotify(int fd, int  *wds)
> 
> Again, static.
> 
> > +{
> > +	char buff[1024]
> > +		__attribute__ ((aligned(__alignof__(struct
> > inotify_event))));
> > +	const struct inotify_event *event;
> > +	ssize_t len;
> > +	char *ptr;
> > +	int i, got_notify = 0;
> > +
> > +	for (;;) {
> > +		len = read(fd, buff, sizeof(buff));
> > +		if (len <= 0) {
> > +			if (len < 0 && errno != EAGAIN) {
> > +				condlog(3, "error reading from
> > inotify_fd");
> > +				for (i = 0; i < 2; i++) {
> > +					if (wds[i] != -1) {
> > +						inotify_rm_watch(fd,
> > wds[i]);
> > +						wds[i] = -1;
> > +					}
> > +				}
> > +			}
> > +			break;
> > +		}
> > +
> > +		got_notify = 1;
> > +		for (ptr = buff; ptr < buff + len;
> > +		     ptr += sizeof(struct inotify_event) + event->len)
> > {
> > +			event = (const struct inotify_event *) ptr;
> > +
> > +			if (event->mask & IN_IGNORED) {
> > +				/* multipathd.conf may have been
> > overwritten.
> > +				 * Try once to reset the notification
> > */
> > +				if (wds[0] == event->wd)
> > +					wds[0] =
> > inotify_add_watch(notify_fd,
> > +							DEFAULT_CONFIGF
> > ILE,
> > +							IN_CLOSE_WRITE)
> > ;
> > +				else if (wds[1] == event->wd)
> > +					wds[1] = -1;
> > +			}
> > +		}
> > +	}
> > +	if (got_notify)
> > +		condlog(1, "Multipath configuration updated.\nReload
> > multipathd for changes to take effect");
> > +}
> > +
> >  /*
> >   * entry point
> >   */
> > @@ -173,13 +278,19 @@ void * uxsock_listen(uxsock_trigger_fn
> > uxsock_trigger, long ux_sock,
> >  	char *reply;
> >  	sigset_t mask;
> >  	int old_clients = MIN_POLLS;
> > +	/* conf->sequence_nr will be 1 when uxsock_listen is first
> > called */
> > +	unsigned int sequence_nr = 0;
> > +	int wds[2] = { -1, -1 };
> 
> Style issue: The code might be better readable if we used a struct for
> this,
> 
> struct watch_descriptors {
> 	int conf_wd;
> 	int dir_wd;
> };	
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 09/15] libmultipath: add code to get vendor specific vpd data
  2020-01-22 11:52       ` Martin Wilck
@ 2020-01-22 12:20         ` Benjamin Marzinski
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Marzinski @ 2020-01-22 12:20 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, Jan 22, 2020 at 12:52:10PM +0100, Martin Wilck wrote:
> On Wed, 2020-01-22 at 02:51 -0600, Benjamin Marzinski wrote:
> > On Fri, Jan 17, 2020 at 05:56:40PM +0100, Martin Wilck wrote:
> > > On Thu, 2020-01-16 at 20:18 -0600, Benjamin Marzinski wrote:
> > > BUT ok, that's overengineered with just one supported combination
> > > of
> > > page and vendor. I can understand that. But then, it seems also
> > > overengineered to carry around vpd_vendor_pg and vpd_vendor_id in
> > > the
> > > hwe. 
> > > 
> > > I'd suggest creating a hard-coded table with "vendor vpd schemes",
> > > 
> > > struct {
> > > 	int pg;
> > > 	int vendor_id;
> > > 	const char *name;
> > > } vendor_vpd_schemes[]  = { 
> > >     { 0xc0, VPD_VP_HP3PAR, "hp3par", }, 
> > >     { 0, }, 
> > > };
> > > 
> > > and in the hwentry, only carry around the index into this table,
> > > and look up the page and vendor ID there. Actually, with just that
> > > single user, we might as well not introduce a new device property
> > > at
> > > all, but simply use a separate hardcoded table with lookup values
> > > for
> > > vendor and product ID. I'm unsure if it's wise to make this
> > > configurable via multipath.conf.
> > 
> > Sure. I can make the hwe change. The reason why I added this to
> > multipath.conf is that I'm not sure what devices actually support
> > these
> > vpd information pages.  So, if we miss one, or a new device comes out
> > that isn't in the built-in config, the user can still get this info.
> > 
> 
> My main issue was that you have 2 fields in struct hwentry, and only
> one field in the hwtable. I think I made the point that I'd prefer to
> have only one everywhere, and use some sort of lookup table for the
> page and vendor ID parameters.

I meant I agreed with this approach.

> Another issue that I realized just now, the "vendor_id" is not
> standardized - it's just an internal identifier used by sg3_utils and
> has no specific meaning outside the sg3_utils code base. Even it's
> numeric value has changed between sg3_utils versions. I think we should
> cast it away, and simply stick with the symbolic identifier. Keeping
> that symbolic name in line with sg3_utils makes a lot of sense, of
> course.

Yeah, another reason I didn't go with vpd_vendor_pg and vpd_vendor_id
was that on earlier versions of sg_vpd, the vpd_vendor_id for
hp3par was different. We can easily remove it altogether.

-Ben

> Regards,
> Martin
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2020-01-22 12:20 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-17  2:18 [PATCH 00/15] Multipath patch dump Benjamin Marzinski
2020-01-17  2:18 ` [PATCH 01/15] multipathd: warn when configuration has been changed Benjamin Marzinski
2020-01-17 15:45   ` Martin Wilck
2020-01-22 11:59     ` Benjamin Marzinski
2020-01-17 15:51   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 02/15] libmultipath: fix leak in foreign code Benjamin Marzinski
2020-01-17 15:50   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 03/15] Fix leak in mpathpersist Benjamin Marzinski
2020-01-17 15:53   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 04/15] libmultipath: remove unused path->prio_args Benjamin Marzinski
2020-01-17 15:55   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 05/15] libmultipath: constify get_unaligned_be* Benjamin Marzinski
2020-01-17 15:57   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 06/15] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
2020-01-17 16:04   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 07/15] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
2020-01-17 16:27   ` Martin Wilck
2020-01-20 16:59     ` Benjamin Marzinski
2020-01-17  2:18 ` [PATCH 08/15] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
2020-01-17 16:58   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 09/15] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
2020-01-17 16:56   ` Martin Wilck
2020-01-22  8:51     ` Benjamin Marzinski
2020-01-22 11:52       ` Martin Wilck
2020-01-22 12:20         ` Benjamin Marzinski
2020-01-18 17:34   ` John Stoffel
2020-01-22  9:46     ` Benjamin Marzinski
2020-01-17  2:18 ` [PATCH 10/15] libmultipath: change how the checker async is set Benjamin Marzinski
2020-01-17 17:02   ` Martin Wilck
2020-01-17 17:04   ` Martin Wilck
2020-01-18 17:43   ` John Stoffel
2020-01-20 15:07     ` Benjamin Marzinski
2020-01-20 20:20       ` John Stoffel
2020-01-21 12:49         ` Benjamin Marzinski
2020-01-21 20:13           ` John Stoffel
2020-01-17  2:18 ` [PATCH 11/15] libmultipath: change failed path prio timeout Benjamin Marzinski
2020-01-17 17:18   ` Martin Wilck
2020-01-20 16:22     ` Benjamin Marzinski
2020-01-17  2:18 ` [PATCH 12/15] multipathd: add new paths under vecs lock Benjamin Marzinski
2020-01-17 17:39   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 13/15] libmultipath: add new checker class functions Benjamin Marzinski
2020-01-17 17:41   ` Martin Wilck
2020-01-17  2:18 ` [PATCH 14/15] libmultipath: make directio checker share io contexts Benjamin Marzinski
2020-01-20  8:09   ` Martin Wilck
2020-01-20 16:08     ` Benjamin Marzinski
2020-01-21  8:55       ` Martin Wilck
2020-01-17  2:18 ` [PATCH 15/15] tests: add directio unit tests Benjamin Marzinski
2020-01-20  8:27   ` Martin Wilck
2020-01-20 15:25     ` Benjamin Marzinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.