All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/17] Multipath patch dump
@ 2020-02-05 18:58 Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 01/17] multipathd: warn when configuration has been changed Benjamin Marzinski
                   ` (17 more replies)
  0 siblings, 18 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This patch set is has multiple parts.

patch 01 is just a resubmit of my previous patch, with Martin's
corrections added.

Patches 02 - 07 are miscellaneous fixes and cleanups

Patches 08 - 10 are related to adding a new format wildcard, at the
	request of HPE, that allows multipath to get and display
	information from the vendor specific VPD pages

Patches 11 - 17 are the part that needs the most review, patch 14
	especially. It turns out that on machines with a large number
	of cores, io_destroy(), which is used by the directio checker,
	can take a long time to complete. Talking to some kernel people
	at Red Hat, I was told that this isn't a bug. It's working as
	expected, and multipath shouldn't be issuing so many
	io_destroy() calls (1 per path, when multipath or multipathd
	stops). Cutting out the io_destroy calls involved a major change
	to the directio checker. It's pretty hard to test a lot of the
	corner cases on actual hardware, so I've written a bunch of
	unit tests for this (patches 16 & 17).

Changes is v2:
0001-multipathd-warn-when-configuration-has-been-changed.patch
	Minor changes, including using a structure instead of an
	array to hold the watch descriptors, as suggested by Martin
	Wilck.

0002-multipathd-staticify-uxlsnr-variables-functions.patch
	New patch

0008-libmultipath-fix-sgio_get_vpd-looping.patch
	Test has been changed to keep create_vpd83 the same, and
	overwrite the length in the actual test, as suggested by
	Martin Wilck

0010-libmultipath-add-code-to-get-vendor-specific-vpd-dat.patch
	The information to find the vpd page and how to decode it is
	now simply the index in a table of name -> page_nr mappings

0012-libmultipath-change-failed-path-prio-timeout.patch
	The timeout argument has been renamed to avoid confusion,
	as suggested by Martin Wilck

0015-libmultipath-make-directio-checker-share-io-contexts.patch
	Minor changes to print more useful messages, and avoid
	printing anything when we get a non-zero io_cancel()
	result, as suggested by Martin Wilck

0016-tests-add-directio-unit-tests.patch
	Minor change to improve readability, as suggested by
	Martin Wilck

0017-tests-make-directio-tests-able-to-work-on-a-real-dev.patch
	New patch. This adds the ability to set a testing device, so
	you can run the directio tests on an actual device

Benjamin Marzinski (17):
  multipathd: warn when configuration has been changed.
  multipathd: staticify uxlsnr variables/functions
  libmultipath: fix leak in foreign code
  Fix leak in mpathpersist
  libmultipath: remove unused path->prio_args
  libmultipath: constify get_unaligned_be*
  libmultipath: add missing hwe mpe variable merges
  libmultipath: fix sgio_get_vpd looping
  libmultipath: add vend_id to get_vpd_sgio
  libmultipath: add code to get vendor specific vpd data
  libmultipath: change how the checker async is set
  libmultipath: change failed path prio timeout
  multipathd: add new paths under vecs lock
  libmultipath: add new checker class functions
  libmultipath: make directio checker share io contexts
  tests: add directio unit tests
  tests: make directio tests able to work on a real device

 libmpathpersist/mpath_persist.c  |   2 +-
 libmultipath/checkers.c          |  29 +-
 libmultipath/checkers.h          |   1 +
 libmultipath/checkers/directio.c | 336 ++++++++++---
 libmultipath/config.c            |  10 +
 libmultipath/config.h            |   2 +
 libmultipath/dict.c              |  38 ++
 libmultipath/discovery.c         |  80 ++-
 libmultipath/discovery.h         |   2 +-
 libmultipath/foreign.c           |  11 +-
 libmultipath/hwtable.c           |   1 +
 libmultipath/print.c             |  25 +
 libmultipath/prio.c              |   6 +-
 libmultipath/propsel.c           |  20 +-
 libmultipath/propsel.h           |   1 +
 libmultipath/structs.h           |  16 +-
 libmultipath/unaligned.h         |   4 +-
 mpathpersist/main.c              |   1 +
 multipath/main.c                 |   1 +
 multipath/multipath.conf.5       |  15 +-
 multipathd/main.c                |  18 +-
 multipathd/uxlsnr.c              | 150 +++++-
 tests/Makefile                   |  19 +-
 tests/directio.c                 | 812 +++++++++++++++++++++++++++++++
 tests/directio_test_dev          |   4 +
 tests/vpd.c                      |  87 ++--
 26 files changed, 1534 insertions(+), 157 deletions(-)
 create mode 100644 tests/directio.c
 create mode 100644 tests/directio_test_dev

-- 
2.17.2

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 01/17] multipathd: warn when configuration has been changed.
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 14:22   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 02/17] multipathd: staticify uxlsnr variables/functions Benjamin Marzinski
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

It would be helpful if multipathd could log a message when
multipath.conf or files in the config_dir have been written to, both so
that it can be used to send a notification to users, and to help with
determining after the fact if multipathd was running with an older
config, when the logs of multipathd's behaviour don't match with the
current multipath.conf.

To do this, the multipathd uxlsnr thread now sets up inotify watches on
both /etc/multipath.conf and the config_dir to watch if the files are
deleted or closed after being opened for writing.  In order to keep
uxlsnr from polling repeatedly if the multipath.conf or the config_dir
aren't present, it will only set up the watches once per reconfigure.
However, since multipath.conf is far more likely to be replaced by a
text editor than modified in place, if it gets removed, multipathd will
immediately try to restart the watch on it (which will succeed if the
file was simply replaced by a new copy).  This does mean that if
multipath.conf or the config_dir are actually removed and then later
re-added, multipathd won't log any more messages for changes until the
next reconfigure. But that seems like a fair trade-off to avoid
repeatedly polling for files that aren't likely to appear.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.h |   1 +
 multipathd/main.c     |   1 +
 multipathd/uxlsnr.c   | 138 ++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 134 insertions(+), 6 deletions(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index ffec3103..e69aa07c 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -188,6 +188,7 @@ struct config {
 	int find_multipaths_timeout;
 	int marginal_pathgroups;
 	unsigned int version[3];
+	unsigned int sequence_nr;
 
 	char * multipath_dir;
 	char * selector;
diff --git a/multipathd/main.c b/multipathd/main.c
index 34a57689..7b364cfe 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2618,6 +2618,7 @@ reconfigure (struct vectors * vecs)
 	uxsock_timeout = conf->uxsock_timeout;
 
 	old = rcu_dereference(multipath_conf);
+	conf->sequence_nr = old->sequence_nr + 1;
 	rcu_assign_pointer(multipath_conf, conf);
 	call_rcu(&old->rcu, rcu_free_config);
 
diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index bc71679e..020d7a9b 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -23,6 +23,7 @@
 #include <sys/time.h>
 #include <signal.h>
 #include <stdbool.h>
+#include <sys/inotify.h>
 #include "checkers.h"
 #include "memory.h"
 #include "debug.h"
@@ -51,6 +52,8 @@ struct client {
 LIST_HEAD(clients);
 pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
 struct pollfd *polls;
+static int notify_fd = -1;
+static char *watch_config_dir;
 
 static bool _socket_client_is_root(int fd);
 
@@ -151,6 +154,8 @@ void uxsock_cleanup(void *arg)
 	long ux_sock = (long)arg;
 
 	close(ux_sock);
+	close(notify_fd);
+	free(watch_config_dir);
 
 	pthread_mutex_lock(&client_lock);
 	list_for_each_entry_safe(client_loop, client_tmp, &clients, node) {
@@ -162,6 +167,110 @@ void uxsock_cleanup(void *arg)
 	free_polls();
 }
 
+struct watch_descriptors {
+	int conf_wd;
+	int dir_wd;
+};
+
+/* failing to set the watch descriptor is o.k. we just miss a warning
+ * message */
+static void reset_watch(int notify_fd, struct watch_descriptors *wds,
+			unsigned int *sequence_nr)
+{
+	struct config *conf;
+	int dir_reset = 0;
+	int conf_reset = 0;
+
+	if (notify_fd == -1)
+		return;
+
+	conf = get_multipath_config();
+	/* instead of repeatedly try to reset the inotify watch if
+	 * the config directory or multipath.conf isn't there, just
+	 * do it once per reconfigure */
+	if (*sequence_nr != conf->sequence_nr) {
+		*sequence_nr = conf->sequence_nr;
+		if (wds->conf_wd == -1)
+			conf_reset = 1;
+		if (!watch_config_dir || !conf->config_dir ||
+		    strcmp(watch_config_dir, conf->config_dir)) {
+			dir_reset = 1;
+			if (watch_config_dir)
+				free(watch_config_dir);
+			if (conf->config_dir)
+				watch_config_dir = strdup(conf->config_dir);
+			else
+				watch_config_dir = NULL;
+		} else if (wds->dir_wd == -1)
+			dir_reset = 1;
+	}
+	put_multipath_config(conf);
+
+	if (dir_reset) {
+		if (wds->dir_wd != -1) {
+			inotify_rm_watch(notify_fd, wds->dir_wd);
+			wds->dir_wd = -1;
+		}
+		if (watch_config_dir) {
+			wds->dir_wd = inotify_add_watch(notify_fd,
+							watch_config_dir,
+							IN_CLOSE_WRITE |
+							IN_DELETE | IN_ONLYDIR);
+			if (wds->dir_wd == -1)
+				condlog(3, "didn't set up notifications on %s: %m", watch_config_dir);
+		}
+	}
+	if (conf_reset) {
+		wds->conf_wd = inotify_add_watch(notify_fd, DEFAULT_CONFIGFILE,
+						 IN_CLOSE_WRITE);
+		if (wds->conf_wd == -1)
+			condlog(3, "didn't set up notifications on /etc/multipath.conf: %m");
+	}
+	return;
+}
+
+static void handle_inotify(int fd, struct watch_descriptors *wds)
+{
+	char buff[1024]
+		__attribute__ ((aligned(__alignof__(struct inotify_event))));
+	const struct inotify_event *event;
+	ssize_t len;
+	char *ptr;
+	int got_notify = 0;
+
+	for (;;) {
+		len = read(fd, buff, sizeof(buff));
+		if (len <= 0) {
+			if (len < 0 && errno != EAGAIN) {
+				condlog(3, "error reading from inotify_fd");
+				if (wds->conf_wd != -1)
+					inotify_rm_watch(fd, wds->conf_wd);
+				if (wds->dir_wd != -1)
+					inotify_rm_watch(fd, wds->dir_wd);
+				wds->conf_wd = wds->dir_wd = -1;
+			}
+			break;
+		}
+
+		got_notify = 1;
+		for (ptr = buff; ptr < buff + len;
+		     ptr += sizeof(struct inotify_event) + event->len) {
+			event = (const struct inotify_event *) ptr;
+
+			if (event->mask & IN_IGNORED) {
+				/* multipathd.conf may have been overwritten.
+				 * Try once to reset the notification */
+				if (wds->conf_wd == event->wd)
+					wds->conf_wd = inotify_add_watch(notify_fd, DEFAULT_CONFIGFILE, IN_CLOSE_WRITE);
+				else if (wds->dir_wd == event->wd)
+					wds->dir_wd = -1;
+			}
+		}
+	}
+	if (got_notify)
+		condlog(1, "Multipath configuration updated.\nReload multipathd for changes to take effect");
+}
+
 /*
  * entry point
  */
@@ -173,13 +282,19 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 	char *reply;
 	sigset_t mask;
 	int old_clients = MIN_POLLS;
+	/* conf->sequence_nr will be 1 when uxsock_listen is first called */
+	unsigned int sequence_nr = 0;
+	struct watch_descriptors wds = { .conf_wd = -1, .dir_wd = -1 };
 
 	condlog(3, "uxsock: startup listener");
-	polls = (struct pollfd *)MALLOC((MIN_POLLS + 1) * sizeof(struct pollfd));
+	polls = (struct pollfd *)MALLOC((MIN_POLLS + 2) * sizeof(struct pollfd));
 	if (!polls) {
 		condlog(0, "uxsock: failed to allocate poll fds");
 		exit_daemon();
 	}
+	notify_fd = inotify_init1(IN_NONBLOCK);
+	if (notify_fd == -1) /* it's fine if notifications fail */
+		condlog(3, "failed to start up configuration notifications");
 	sigfillset(&mask);
 	sigdelset(&mask, SIGINT);
 	sigdelset(&mask, SIGTERM);
@@ -198,18 +313,18 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		if (num_clients != old_clients) {
 			struct pollfd *new;
 			if (num_clients <= MIN_POLLS && old_clients > MIN_POLLS) {
-				new = REALLOC(polls, (1 + MIN_POLLS) *
+				new = REALLOC(polls, (2 + MIN_POLLS) *
 						sizeof(struct pollfd));
 			} else if (num_clients <= MIN_POLLS && old_clients <= MIN_POLLS) {
 				new = polls;
 			} else {
-				new = REALLOC(polls, (1+num_clients) *
+				new = REALLOC(polls, (2 + num_clients) *
 						sizeof(struct pollfd));
 			}
 			if (!new) {
 				pthread_mutex_unlock(&client_lock);
 				condlog(0, "%s: failed to realloc %d poll fds",
-					"uxsock", 1 + num_clients);
+					"uxsock", 2 + num_clients);
 				sched_yield();
 				continue;
 			}
@@ -219,8 +334,15 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		polls[0].fd = ux_sock;
 		polls[0].events = POLLIN;
 
+		reset_watch(notify_fd, &wds, &sequence_nr);
+		if (notify_fd == -1 || (wds.conf_wd == -1 && wds.dir_wd == -1))
+			polls[1].fd = -1;
+		else
+			polls[1].fd = notify_fd;
+		polls[1].events = POLLIN;
+
 		/* setup the clients */
-		i = 1;
+		i = 2;
 		list_for_each_entry(c, &clients, node) {
 			polls[i].fd = c->fd;
 			polls[i].events = POLLIN;
@@ -262,7 +384,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		}
 
 		/* see if a client wants to speak to us */
-		for (i = 1; i < num_clients + 1; i++) {
+		for (i = 2; i < num_clients + 2; i++) {
 			if (polls[i].revents & POLLIN) {
 				struct timespec start_time;
 
@@ -321,6 +443,10 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 		if (polls[0].revents & POLLIN) {
 			new_client(ux_sock);
 		}
+
+		/* handle inotify events on config files */
+		if (polls[1].revents & POLLIN)
+			handle_inotify(notify_fd, &wds);
 	}
 
 	return NULL;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 02/17] multipathd: staticify uxlsnr variables/functions
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 01/17] multipathd: warn when configuration has been changed Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 14:24   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 03/17] libmultipath: fix leak in foreign code Benjamin Marzinski
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/uxlsnr.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index 020d7a9b..1c5ce9d2 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -40,7 +40,7 @@
 #include "cli.h"
 #include "uxlsnr.h"
 
-struct timespec sleep_time = {5, 0};
+static struct timespec sleep_time = {5, 0};
 
 struct client {
 	struct list_head node;
@@ -49,9 +49,9 @@ struct client {
 
 #define MIN_POLLS 1023
 
-LIST_HEAD(clients);
-pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
-struct pollfd *polls;
+static LIST_HEAD(clients);
+static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;
+static struct pollfd *polls;
 static int notify_fd = -1;
 static char *watch_config_dir;
 
@@ -122,13 +122,13 @@ static void dead_client(struct client *c)
 	pthread_cleanup_pop(1);
 }
 
-void free_polls (void)
+static void free_polls (void)
 {
 	if (polls)
 		FREE(polls);
 }
 
-void check_timeout(struct timespec start_time, char *inbuf,
+static void check_timeout(struct timespec start_time, char *inbuf,
 		   unsigned int timeout)
 {
 	struct timespec diff_time, end_time;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 03/17] libmultipath: fix leak in foreign code
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 01/17] multipathd: warn when configuration has been changed Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 02/17] multipathd: staticify uxlsnr variables/functions Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-11  9:36   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 04/17] Fix leak in mpathpersist Benjamin Marzinski
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If scandir fails or finds no foreign libraries, enable_re needs to be
freed before exitting.

Fixes: 8d03eda4 'multipath.conf: add "enable_foreign" parameter'
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/foreign.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/libmultipath/foreign.c b/libmultipath/foreign.c
index 4b34e141..68e9a9b8 100644
--- a/libmultipath/foreign.c
+++ b/libmultipath/foreign.c
@@ -129,7 +129,7 @@ static int _init_foreign(const char *multipath_dir, const char *enable)
 	char pathbuf[PATH_MAX];
 	struct dirent **di;
 	struct scandir_result sr;
-	int r, i;
+	int r, i, ret = 0;
 	regex_t *enable_re = NULL;
 
 	foreigns = vector_alloc();
@@ -157,13 +157,15 @@ static int _init_foreign(const char *multipath_dir, const char *enable)
 	if (r == 0) {
 		condlog(3, "%s: no foreign multipath libraries found",
 			__func__);
-		return 0;
+		ret = 0;
+		goto out;
 	} else if (r < 0) {
 		r = errno;
 		condlog(1, "%s: error %d scanning foreign multipath libraries",
 			__func__, r);
 		_cleanup_foreign();
-		return -r;
+		ret = -r;
+		goto out;
 	}
 
 	sr.di = di;
@@ -250,8 +252,9 @@ static int _init_foreign(const char *multipath_dir, const char *enable)
 		free_foreign(fgn);
 	}
 	pthread_cleanup_pop(1); /* free_scandir_result */
+out:
 	pthread_cleanup_pop(1); /* free_pre */
-	return 0;
+	return ret;
 }
 
 int init_foreign(const char *multipath_dir, const char *enable)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 04/17] Fix leak in mpathpersist
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 03/17] libmultipath: fix leak in foreign code Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 05/17] libmultipath: remove unused path->prio_args Benjamin Marzinski
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If the persistent in command fails, the response buffer must be freed.
Found by Coverity

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 mpathpersist/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mpathpersist/main.c b/mpathpersist/main.c
index 278b8d51..920e686c 100644
--- a/mpathpersist/main.c
+++ b/mpathpersist/main.c
@@ -499,6 +499,7 @@ static int handle_args(int argc, char * argv[], int nline)
 		if (ret != MPATH_PR_SUCCESS )
 		{
 			fprintf (stderr, "Persistent Reserve IN command failed\n");
+			free(resp);
 			goto out_fd;
 		}
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 05/17] libmultipath: remove unused path->prio_args
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 04/17] Fix leak in mpathpersist Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 06/17] libmultipath: constify get_unaligned_be* Benjamin Marzinski
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/structs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index a3adf906..1c32a799 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -272,7 +272,6 @@ struct path {
 	char * uid_attribute;
 	char * getuid;
 	struct prio prio;
-	char * prio_args;
 	struct checker checker;
 	struct multipath * mpp;
 	int fd;
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 06/17] libmultipath: constify get_unaligned_be*
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (4 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 05/17] libmultipath: remove unused path->prio_args Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 07/17] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/unaligned.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libmultipath/unaligned.h b/libmultipath/unaligned.h
index 68c07742..b9eaa7cb 100644
--- a/libmultipath/unaligned.h
+++ b/libmultipath/unaligned.h
@@ -10,14 +10,14 @@ static inline uint16_t get_unaligned_be16(const void *ptr)
 	return p[0] << 8 | p[1];
 }
 
-static inline uint32_t get_unaligned_be32(void *ptr)
+static inline uint32_t get_unaligned_be32(const void *ptr)
 {
 	const uint8_t *p = ptr;
 
 	return p[0] << 24 | p[1] << 16 | p[2] << 8 | p[3];
 }
 
-static inline uint64_t get_unaligned_be64(void *ptr)
+static inline uint64_t get_unaligned_be64(const void *ptr)
 {
 	uint32_t low = get_unaligned_be32(ptr + 4);
 	uint64_t high = get_unaligned_be32(ptr);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 07/17] libmultipath: add missing hwe mpe variable merges
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (5 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 06/17] libmultipath: constify get_unaligned_be* Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 08/17] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

There were some variables in the hwe and mpe structs that weren't being
merged by merge_hwe() and merge_mpe().

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 20e3b8bf..85626e96 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -372,6 +372,10 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
 	merge_num(san_path_err_threshold);
 	merge_num(san_path_err_forget_rate);
 	merge_num(san_path_err_recovery_time);
+	merge_num(marginal_path_err_sample_time);
+	merge_num(marginal_path_err_rate_threshold);
+	merge_num(marginal_path_err_recheck_gap_time);
+	merge_num(marginal_path_double_failed_time);
 
 	snprintf(id, sizeof(id), "%s/%s", dst->vendor, dst->product);
 	reconcile_features_with_options(id, &dst->features,
@@ -397,6 +401,7 @@ merge_mpe(struct mpentry *dst, struct mpentry *src)
 	if (dst->prkey_source == PRKEY_SOURCE_NONE &&
 	    src->prkey_source != PRKEY_SOURCE_NONE) {
 		dst->prkey_source = src->prkey_source;
+		dst->sa_flags = src->sa_flags;
 		memcpy(&dst->reservation_key, &src->reservation_key,
 		       sizeof(dst->reservation_key));
 	}
@@ -413,6 +418,9 @@ merge_mpe(struct mpentry *dst, struct mpentry *src)
 	merge_num(deferred_remove);
 	merge_num(delay_watch_checks);
 	merge_num(delay_wait_checks);
+	merge_num(san_path_err_threshold);
+	merge_num(san_path_err_forget_rate);
+	merge_num(san_path_err_recovery_time);
 	merge_num(marginal_path_err_sample_time);
 	merge_num(marginal_path_err_rate_threshold);
 	merge_num(marginal_path_err_recheck_gap_time);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 08/17] libmultipath: fix sgio_get_vpd looping
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (6 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 07/17] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 14:35   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 09/17] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If do_inq returns a page with a length that is less than maxlen, but
larger than DEFAULT_SGIO_LEN, this function will loop forever. Also
if do_inq returns with a length equal to or greater than maxlen,
sgio_get_vpd will exit immediately, even if it hasn't read the entire
page.  Fix these issues, modify the tests to verify the new behavior.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 12 +++----
 tests/vpd.c              | 77 ++++++++++++++++++++++++----------------
 2 files changed, 52 insertions(+), 37 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 72f455e8..3c72a80a 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -870,6 +870,7 @@ static int
 sgio_get_vpd (unsigned char * buff, int maxlen, int fd, int pg)
 {
 	int len = DEFAULT_SGIO_LEN;
+	int rlen;
 
 	if (fd < 0) {
 		errno = EBADF;
@@ -877,12 +878,11 @@ sgio_get_vpd (unsigned char * buff, int maxlen, int fd, int pg)
 	}
 retry:
 	if (0 == do_inq(fd, 0, 1, pg, buff, len)) {
-		len = get_unaligned_be16(&buff[2]) + 4;
-		if (len >= maxlen)
-			return len;
-		if (len > DEFAULT_SGIO_LEN)
-			goto retry;
-		return len;
+		rlen = get_unaligned_be16(&buff[2]) + 4;
+		if (rlen <= len || len >= maxlen)
+			return rlen;
+		len = (rlen < maxlen)? rlen : maxlen;
+		goto retry;
 	}
 	return -1;
 }
diff --git a/tests/vpd.c b/tests/vpd.c
index d9f80eaa..1e653ed4 100644
--- a/tests/vpd.c
+++ b/tests/vpd.c
@@ -506,9 +506,10 @@ static void test_vpd_naa_ ## naa ## _ ## wlen(void **state)             \
  * test_vpd_eui_LEN_WLEN() - test code for VPD 83, EUI64
  * @LEN:	EUI64 length (8, 12, or 16)
  * @WLEN:	WWID buffer size
+ * @SML:	Use small VPD page size
  */
-#define make_test_vpd_eui(len, wlen)					\
-static void test_vpd_eui_ ## len ## _ ## wlen(void **state)             \
+#define make_test_vpd_eui(len, wlen, sml)				\
+static void test_vpd_eui_ ## len ## _ ## wlen ## _ ## sml(void **state)	\
 {									\
 	struct vpdtest *vt = *state;                                    \
 	int n, ret;							\
@@ -518,10 +519,16 @@ static void test_vpd_eui_ ## len ## _ ## wlen(void **state)             \
 									\
 	n = create_vpd83(vt->vpdbuf, sizeof(vt->vpdbuf), test_id,	\
 			 2, 0, len);					\
+	if (sml) {							\
+		/* overwrite the page side to DEFAULT_SGIO_LEN + 1 */	\
+		put_unaligned_be16(255, vt->vpdbuf + 2);		\
+		will_return(__wrap_ioctl, n);				\
+		will_return(__wrap_ioctl, vt->vpdbuf);			\
+	}								\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
 	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
-	assert_correct_wwid("test_vpd_eui_" #len "_" #wlen,		\
+	assert_correct_wwid("test_vpd_eui_" #len "_" #wlen "_" #sml,	\
 			    exp_len, ret, '2', 0, true,			\
 			    test_id, vt->wwid);				\
 }
@@ -603,25 +610,30 @@ make_test_vpd_vnd(20, 10);
 make_test_vpd_vnd(10, 10);
 
 /* EUI64 tests */
+/* small vpd page test */
+make_test_vpd_eui(8, 32, 1);
+make_test_vpd_eui(12, 32, 1);
+make_test_vpd_eui(16, 40, 1);
+
 /* 64bit, WWID size: 18 */
-make_test_vpd_eui(8, 32);
-make_test_vpd_eui(8, 18);
-make_test_vpd_eui(8, 17);
-make_test_vpd_eui(8, 16);
-make_test_vpd_eui(8, 10);
+make_test_vpd_eui(8, 32, 0);
+make_test_vpd_eui(8, 18, 0);
+make_test_vpd_eui(8, 17, 0);
+make_test_vpd_eui(8, 16, 0);
+make_test_vpd_eui(8, 10, 0);
 
 /* 96 bit, WWID size: 26 */
-make_test_vpd_eui(12, 32);
-make_test_vpd_eui(12, 26);
-make_test_vpd_eui(12, 25);
-make_test_vpd_eui(12, 20);
-make_test_vpd_eui(12, 10);
+make_test_vpd_eui(12, 32, 0);
+make_test_vpd_eui(12, 26, 0);
+make_test_vpd_eui(12, 25, 0);
+make_test_vpd_eui(12, 20, 0);
+make_test_vpd_eui(12, 10, 0);
 
 /* 128 bit, WWID size: 34 */
-make_test_vpd_eui(16, 40);
-make_test_vpd_eui(16, 34);
-make_test_vpd_eui(16, 33);
-make_test_vpd_eui(16, 20);
+make_test_vpd_eui(16, 40, 0);
+make_test_vpd_eui(16, 34, 0);
+make_test_vpd_eui(16, 33, 0);
+make_test_vpd_eui(16, 20, 0);
 
 /* NAA IEEE registered extended (36), WWID size: 34 */
 make_test_vpd_naa(6, 40);
@@ -722,20 +734,23 @@ static int test_vpd(void)
 		cmocka_unit_test(test_vpd_vnd_19_20),
 		cmocka_unit_test(test_vpd_vnd_20_10),
 		cmocka_unit_test(test_vpd_vnd_10_10),
-		cmocka_unit_test(test_vpd_eui_8_32),
-		cmocka_unit_test(test_vpd_eui_8_18),
-		cmocka_unit_test(test_vpd_eui_8_17),
-		cmocka_unit_test(test_vpd_eui_8_16),
-		cmocka_unit_test(test_vpd_eui_8_10),
-		cmocka_unit_test(test_vpd_eui_12_32),
-		cmocka_unit_test(test_vpd_eui_12_26),
-		cmocka_unit_test(test_vpd_eui_12_25),
-		cmocka_unit_test(test_vpd_eui_12_20),
-		cmocka_unit_test(test_vpd_eui_12_10),
-		cmocka_unit_test(test_vpd_eui_16_40),
-		cmocka_unit_test(test_vpd_eui_16_34),
-		cmocka_unit_test(test_vpd_eui_16_33),
-		cmocka_unit_test(test_vpd_eui_16_20),
+		cmocka_unit_test(test_vpd_eui_8_32_1),
+		cmocka_unit_test(test_vpd_eui_12_32_1),
+		cmocka_unit_test(test_vpd_eui_16_40_1),
+		cmocka_unit_test(test_vpd_eui_8_32_0),
+		cmocka_unit_test(test_vpd_eui_8_18_0),
+		cmocka_unit_test(test_vpd_eui_8_17_0),
+		cmocka_unit_test(test_vpd_eui_8_16_0),
+		cmocka_unit_test(test_vpd_eui_8_10_0),
+		cmocka_unit_test(test_vpd_eui_12_32_0),
+		cmocka_unit_test(test_vpd_eui_12_26_0),
+		cmocka_unit_test(test_vpd_eui_12_25_0),
+		cmocka_unit_test(test_vpd_eui_12_20_0),
+		cmocka_unit_test(test_vpd_eui_12_10_0),
+		cmocka_unit_test(test_vpd_eui_16_40_0),
+		cmocka_unit_test(test_vpd_eui_16_34_0),
+		cmocka_unit_test(test_vpd_eui_16_33_0),
+		cmocka_unit_test(test_vpd_eui_16_20_0),
 		cmocka_unit_test(test_vpd_naa_6_40),
 		cmocka_unit_test(test_vpd_naa_6_34),
 		cmocka_unit_test(test_vpd_naa_6_33),
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 09/17] libmultipath: add vend_id to get_vpd_sgio
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (7 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 08/17] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This tells multipath how it should decode vendor specific pages. It will
be used by a future patch.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c |  4 ++--
 libmultipath/discovery.h |  2 +-
 libmultipath/propsel.c   |  2 +-
 tests/vpd.c              | 10 +++++-----
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 3c72a80a..1d79cbae 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1135,7 +1135,7 @@ get_vpd_sysfs (struct udev_device *parent, int pg, char * str, int maxlen)
 }
 
 int
-get_vpd_sgio (int fd, int pg, char * str, int maxlen)
+get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen)
 {
 	int len, buff_len;
 	unsigned char buff[4096];
@@ -1810,7 +1810,7 @@ static ssize_t uid_fallback(struct path *pp, int path_state,
 		if (len < 0 && path_state == PATH_UP) {
 			condlog(1, "%s: failed to get sysfs uid: %s",
 				pp->dev, strerror(-len));
-			len = get_vpd_sgio(pp->fd, 0x83, pp->wwid,
+			len = get_vpd_sgio(pp->fd, 0x83, 0, pp->wwid,
 					   WWID_SIZE);
 			*origin = "sgio";
 		}
diff --git a/libmultipath/discovery.h b/libmultipath/discovery.h
index 8d04c2af..2f2fd9eb 100644
--- a/libmultipath/discovery.h
+++ b/libmultipath/discovery.h
@@ -35,7 +35,7 @@ int path_get_tpgs(struct path *pp); /* This function never returns TPGS_UNDEF */
 int do_tur (char *);
 int path_offline (struct path *);
 int get_state (struct path * pp, struct config * conf, int daemon, int state);
-int get_vpd_sgio (int fd, int pg, char * str, int maxlen);
+int get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen);
 int pathinfo (struct path * pp, struct config * conf, int mask);
 int alloc_path_with_pathinfo (struct config *conf, struct udev_device *udevice,
 			      const char *wwid, int flag, struct path **pp_ptr);
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 27e8d68a..b5b5b89f 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -490,7 +490,7 @@ check_rdac(struct path * pp)
 	if (__do_set_from_hwe(checker_name, pp, checker_name) &&
 	    strcmp(checker_name, RDAC))
 		return 0;
-	len = get_vpd_sgio(pp->fd, 0xC9, buff, 44);
+	len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);
 	if (len <= 0)
 		return 0;
 	return !(memcmp(buff + 4, "vac1", 4));
diff --git a/tests/vpd.c b/tests/vpd.c
index 1e653ed4..0331c487 100644
--- a/tests/vpd.c
+++ b/tests/vpd.c
@@ -429,7 +429,7 @@ static void test_vpd_vnd_ ## len ## _ ## wlen(void **state)             \
 	free(exp_wwid);							\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_vnd_" #len "_" #wlen,		\
 			    exp_len, ret, '1', 0, false,		\
 			    exp_subst, vt->wwid);			\
@@ -459,7 +459,7 @@ static void test_vpd_str_ ## typ ## _ ## len ## _ ## wlen(void **state) \
 		exp_len = wlen - 1;					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_str_" #typ "_" #len "_" #wlen,	\
 			    exp_len, ret, byte0[type], 0,		\
 			    type != STR_IQN,				\
@@ -496,7 +496,7 @@ static void test_vpd_naa_ ## naa ## _ ## wlen(void **state)             \
 			 3, naa, 0);					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_naa_" #naa "_" #wlen,		\
 			    exp_len, ret, '3', '0' + naa, true,		\
 			    test_id, vt->wwid);				\
@@ -527,7 +527,7 @@ static void test_vpd_eui_ ## len ## _ ## wlen ## _ ## sml(void **state)	\
 	}								\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x83, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x83, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd_eui_" #len "_" #wlen "_" #sml,	\
 			    exp_len, ret, '2', 0, true,			\
 			    test_id, vt->wwid);				\
@@ -554,7 +554,7 @@ static void test_vpd80_ ## size ## _ ## len ## _ ## wlen(void **state)  \
 			 size, len);					\
 	will_return(__wrap_ioctl, n);					\
 	will_return(__wrap_ioctl, vt->vpdbuf);				\
-	ret = get_vpd_sgio(10, 0x80, vt->wwid, wlen);			\
+	ret = get_vpd_sgio(10, 0x80, 0, vt->wwid, wlen);		\
 	assert_correct_wwid("test_vpd80_" #size "_" #len "_" #wlen,	\
 			    exp_len, ret, 0, 0, false,			\
 			    input, vt->wwid);				\
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (8 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 09/17] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 14:49   ` Martin Wilck
  2020-02-11  8:39   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 11/17] libmultipath: change how the checker async is set Benjamin Marzinski
                   ` (7 subsequent siblings)
  17 siblings, 2 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This adds the wildcard 'g' for multipath and path formatted printing,
which returns extra data from a device's vendor specific vpd page.  The
specific vendor vpd page to use, and the vendor/product id to decode it
can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It can
be configured in the devices section of multipath.conf with the
vpd_vendor parameter. Currently, the only devices that use this are HPE
3PAR arrays, to return the Volume Name.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c      |  2 ++
 libmultipath/config.h      |  1 +
 libmultipath/dict.c        | 38 ++++++++++++++++++++++++++++++++++++
 libmultipath/discovery.c   | 40 +++++++++++++++++++++++++++++++++++++-
 libmultipath/hwtable.c     |  1 +
 libmultipath/print.c       | 25 ++++++++++++++++++++++++
 libmultipath/propsel.c     | 18 +++++++++++++++++
 libmultipath/propsel.h     |  1 +
 libmultipath/structs.h     | 15 ++++++++++++++
 multipath/multipath.conf.5 |  8 ++++++++
 10 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 85626e96..8c207d21 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -369,6 +369,7 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
 	merge_num(max_sectors_kb);
 	merge_num(ghost_delay);
 	merge_num(all_tg_pt);
+	merge_num(vpd_vendor_id);
 	merge_num(san_path_err_threshold);
 	merge_num(san_path_err_forget_rate);
 	merge_num(san_path_err_recovery_time);
@@ -517,6 +518,7 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
 	hwe->detect_prio = dhwe->detect_prio;
 	hwe->detect_checker = dhwe->detect_checker;
 	hwe->ghost_delay = dhwe->ghost_delay;
+	hwe->vpd_vendor_id = dhwe->vpd_vendor_id;
 
 	if (dhwe->bl_product && !(hwe->bl_product = set_param_str(dhwe->bl_product)))
 		goto out;
diff --git a/libmultipath/config.h b/libmultipath/config.h
index e69aa07c..9d6be6e2 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -87,6 +87,7 @@ struct hwentry {
 	int max_sectors_kb;
 	int ghost_delay;
 	int all_tg_pt;
+	int vpd_vendor_id;
 	char * bl_product;
 };
 
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index 2b046e1d..dd21396b 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -1366,6 +1366,43 @@ def_uxsock_timeout_handler(struct config *conf, vector strvec)
 	return 0;
 }
 
+static int
+hw_vpd_vendor_handler(struct config *conf, vector strvec)
+{
+	int i;
+	char *buff;
+
+	struct hwentry * hwe = VECTOR_LAST_SLOT(conf->hwtable);
+	if (!hwe)
+		return 1;
+
+	buff = set_value(strvec);
+	if (!buff)
+		return 1;
+	for (i = 0; i < VPD_VP_ARRAY_SIZE; i++) {
+		if (strcmp(buff, vpd_vendor_pages[i].name) == 0) {
+			hwe->vpd_vendor_id = i;
+			goto out;
+		}
+	}
+	hwe->vpd_vendor_id = 0;
+out:
+	FREE(buff);
+	return 0;
+}
+
+static int
+snprint_hw_vpd_vendor(struct config *conf, char * buff, int len,
+		      const void * data)
+{
+	const struct hwentry * hwe = (const struct hwentry *)data;
+
+	if (hwe->vpd_vendor_id > 0 && hwe->vpd_vendor_id < VPD_VP_ARRAY_SIZE)
+		return snprintf(buff, len, "%s",
+				vpd_vendor_pages[hwe->vpd_vendor_id].name);
+	return 0;
+}
+
 /*
  * blacklist block handlers
  */
@@ -1806,6 +1843,7 @@ init_keywords(vector keywords)
 	install_keyword("max_sectors_kb", &hw_max_sectors_kb_handler, &snprint_hw_max_sectors_kb);
 	install_keyword("ghost_delay", &hw_ghost_delay_handler, &snprint_hw_ghost_delay);
 	install_keyword("all_tg_pt", &hw_all_tg_pt_handler, &snprint_hw_all_tg_pt);
+	install_keyword("vpd_vendor", &hw_vpd_vendor_handler, &snprint_hw_vpd_vendor);
 	install_sublevel_end();
 
 	install_keyword_root("overrides", &overrides_handler);
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 1d79cbae..c2311b66 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -34,6 +34,11 @@
 #include "prioritizers/alua_rtpg.h"
 #include "foreign.h"
 
+struct vpd_vendor_page vpd_vendor_pages[VPD_VP_ARRAY_SIZE] = {
+	[VPD_VP_UNDEF]	= { 0x00, "undef" },
+	[VPD_VP_HP3PAR]	= { 0xc0, "hp3par" },
+};
+
 int
 alloc_path_with_pathinfo (struct config *conf, struct udev_device *udevice,
 			  const char *wwid, int flag, struct path **pp_ptr)
@@ -1103,6 +1108,30 @@ parse_vpd_pg83(const unsigned char *in, size_t in_len,
 	return len;
 }
 
+static int
+parse_vpd_c0_hp3par(const unsigned char *in, size_t in_len,
+		    char *out, size_t out_len)
+{
+	size_t len;
+
+	memset(out, 0x0, out_len);
+	if (in_len <= 4 || (in[4] > 3 && in_len < 44)) {
+		condlog(3, "HP/3PAR vendor specific VPD page length too short: %lu", in_len);
+		return -EINVAL;
+	}
+	if (in[4] <= 3) /* revision must be > 3 to have Vomlume Name */
+		return -ENODATA;
+	len = get_unaligned_be32(&in[40]);
+	if (len > out_len || len + 44 > in_len) {
+		condlog(3, "HP/3PAR vendor specific Volume name too long: %lu",
+			len);
+		return -EINVAL;
+	}
+	memcpy(out, &in[44], len);
+	out[out_len - 1] = '\0';
+	return len;
+}
+
 static int
 get_vpd_sysfs (struct udev_device *parent, int pg, char * str, int maxlen)
 {
@@ -1170,7 +1199,9 @@ get_vpd_sgio (int fd, int pg, int vend_id, char * str, int maxlen)
 			len = (buff_len <= maxlen)? buff_len : maxlen;
 			memcpy (str, buff, len);
 		}
-	} else
+	} else if (pg == 0xc0 && vend_id == VPD_VP_HP3PAR)
+		len = parse_vpd_c0_hp3par(buff, buff_len, str, maxlen);
+	else
 		len = -ENOSYS;
 
 	return len;
@@ -1540,10 +1571,17 @@ scsi_ioctl_pathinfo (struct path * pp, struct config *conf, int mask)
 {
 	struct udev_device *parent;
 	const char *attr_path = NULL;
+	int vpd_id;
 
 	if (!(mask & DI_SERIAL))
 		return;
 
+	select_vpd_vendor_id(conf, pp);
+	vpd_id = pp->vpd_vendor_id;
+
+	if (vpd_id != VPD_VP_UNDEF && get_vpd_sgio(pp->fd, vpd_vendor_pages[vpd_id].pg, vpd_id, pp->vpd_data, sizeof(pp->vpd_data)) < 0)
+		condlog(3, "%s: failed to get extra vpd data", pp->dev);
+
 	parent = pp->udev;
 	while (parent) {
 		const char *subsys = udev_device_get_subsystem(parent);
diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 16627ec5..dd6a17d4 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -117,6 +117,7 @@ static struct hwentry default_hw[] = {
 		.no_path_retry = 18,
 		.fast_io_fail  = 10,
 		.dev_loss      = MAX_DEV_LOSS_TMO,
+		.vpd_vendor_id = VPD_VP_HP3PAR,
 	},
 	{
 		/* RA8000 / ESA12000 */
diff --git a/libmultipath/print.c b/libmultipath/print.c
index 907469ad..f3b27592 100644
--- a/libmultipath/print.c
+++ b/libmultipath/print.c
@@ -358,6 +358,21 @@ snprint_action (char * buff, size_t len, const struct multipath * mpp)
 	}
 }
 
+static int
+snprint_multipath_vpd_data(char * buff, size_t len,
+			   const struct multipath * mpp)
+{
+	struct pathgroup * pgp;
+	struct path * pp;
+	int i, j;
+
+	vector_foreach_slot(mpp->pg, pgp, i)
+		vector_foreach_slot(pgp->paths, pp, j)
+			if (strlen(pp->vpd_data))
+				return snprintf(buff, len, "%s", pp->vpd_data);
+	return snprintf(buff, len, "[undef]");
+}
+
 /*
  * path info printing functions
  */
@@ -688,6 +703,14 @@ snprint_path_marginal(char * buff, size_t len, const struct path * pp)
 	return snprintf(buff, len, "normal");
 }
 
+static int
+snprint_path_vpd_data(char * buff, size_t len, const struct path * pp)
+{
+	if (strlen(pp->vpd_data) > 0)
+		return snprintf(buff, len, "%s", pp->vpd_data);
+	return snprintf(buff, len, "[undef]");
+}
+
 struct multipath_data mpd[] = {
 	{'n', "name",          0, snprint_name},
 	{'w', "uuid",          0, snprint_multipath_uuid},
@@ -712,6 +735,7 @@ struct multipath_data mpd[] = {
 	{'p', "prod",          0, snprint_multipath_prod},
 	{'e', "rev",           0, snprint_multipath_rev},
 	{'G', "foreign",       0, snprint_multipath_foreign},
+	{'g', "vpd page data", 0, snprint_multipath_vpd_data},
 	{0, NULL, 0 , NULL}
 };
 
@@ -737,6 +761,7 @@ struct path_data pd[] = {
 	{'r', "target WWPN",   0, snprint_tgt_wwpn},
 	{'a', "host adapter",  0, snprint_host_adapter},
 	{'G', "foreign",       0, snprint_path_foreign},
+	{'g', "vpd page data", 0, snprint_path_vpd_data},
 	{'0', "failures",      0, snprint_path_failures},
 	{'P', "protocol",      0, snprint_path_protocol},
 	{0, NULL, 0 , NULL}
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index b5b5b89f..216b09a0 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -1203,3 +1203,21 @@ out:
 		origin);
 	return 0;
 }
+
+int select_vpd_vendor_id (struct config *conf, struct path *pp)
+{
+	const char *origin;
+
+	pp_set_hwe(vpd_vendor_id);
+	pp_set_default(vpd_vendor_id, 0);
+out:
+	if (pp->vpd_vendor_id < 0 || pp->vpd_vendor_id >= VPD_VP_ARRAY_SIZE) {
+		condlog(3, "%s: vpd_vendor_id = %d (invalid, setting to 0)",
+			pp->dev, pp->vpd_vendor_id);
+		pp->vpd_vendor_id = 0;
+	}
+	condlog(3, "%s: vpd_vendor_id = %d \"%s\" %s", pp->dev,
+		pp->vpd_vendor_id, vpd_vendor_pages[pp->vpd_vendor_id].name,
+		origin);
+	return 0;
+}
diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
index ddfd6262..4fa08e1f 100644
--- a/libmultipath/propsel.h
+++ b/libmultipath/propsel.h
@@ -37,3 +37,4 @@ void reconcile_features_with_options(const char *id, char **features,
 				     int* no_path_retry,
 				     int *retain_hwhandler);
 int select_all_tg_pt (struct config *conf, struct multipath * mp);
+int select_vpd_vendor_id (struct config *conf, struct path *pp);
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index 1c32a799..6c03ee5d 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -21,6 +21,7 @@
 #define HOST_NAME_LEN		16
 #define SLOT_NAME_SIZE		40
 #define PRKEY_SIZE		19
+#define VPD_DATA_SIZE		128
 
 #define SCSI_VENDOR_SIZE	9
 #define SCSI_PRODUCT_SIZE	17
@@ -221,6 +222,18 @@ enum all_tg_pt_states {
 	ALL_TG_PT_ON = YNU_YES,
 };
 
+enum vpd_vendor_ids {
+	VPD_VP_UNDEF,
+	VPD_VP_HP3PAR,
+	VPD_VP_ARRAY_SIZE, /* This must remain the last entry */
+};
+
+struct vpd_vendor_page {
+	int pg;
+	const char *name;
+};
+extern struct vpd_vendor_page vpd_vendor_pages[VPD_VP_ARRAY_SIZE];
+
 struct sg_id {
 	int host_no;
 	int channel;
@@ -255,6 +268,7 @@ struct path {
 	char rev[PATH_REV_SIZE];
 	char serial[SERIAL_SIZE];
 	char tgt_node_name[NODE_NAME_SIZE];
+	char vpd_data[VPD_DATA_SIZE];
 	unsigned long long size;
 	unsigned int checkint;
 	unsigned int tick;
@@ -287,6 +301,7 @@ struct path {
 	int io_err_pathfail_starttime;
 	int find_multipaths_timeout;
 	int marginal;
+	int vpd_vendor_id;
 	/* configlet pointers */
 	vector hwe;
 	struct gen_path generic_path;
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index e866da23..dc103fd8 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -1472,6 +1472,14 @@ the \fIproduct\fR attribute set to the value of \fIproduct_blacklist\fR.
 The user_friendly_names prefix to use for this
 device type, instead of the default "mpath".
 .TP
+.B vpd_vendor
+The vendor specific vpd page information, using the vpd page abbreviation.
+The vpd page abbreviation can be found by running \fIsg_vpd -e\fR. multipathd
+will use this information to gather device specific information that can be
+displayed with the \fI%g\fR wilcard for the \fImultipathd show maps format\fR
+and \fImultipathd show paths format\fR commands. Currently only the
+\fBhp3par\fR vpd page is supported.
+.TP
 .B hardware_handler
 The hardware handler to use for this device type.
 The following hardware handler are implemented:
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 11/17] libmultipath: change how the checker async is set
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (9 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 12/17] libmultipath: change failed path prio timeout Benjamin Marzinski
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The way that the checkers async mode worked in multipathd didn't make
much sense. When multipathd starts up, all checker classes are in the
sync mode, and any pathinfo() calls on devices would run the checker in
sync mode. However, the First time a checker class was used in
checkerloop, it would set that checker class to async (assuming
force_sync wasn't set).  After that, no matter when a checker from that
class was called, it would always run in async mode.  Multipathd doesn't
need to run checkers in sync mode at all, so don't.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmpathpersist/mpath_persist.c |  2 +-
 libmultipath/discovery.c        | 10 ++++------
 multipath/main.c                |  1 +
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
index 603cfc3b..b2238f00 100644
--- a/libmpathpersist/mpath_persist.c
+++ b/libmpathpersist/mpath_persist.c
@@ -47,7 +47,7 @@ mpath_lib_init (void)
 		condlog(0, "Failed to initialize multipath config.");
 		return NULL;
 	}
-
+	conf->force_sync = 1;
 	set_max_fds(conf->max_fds);
 
 	return conf;
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index c2311b66..bee5b77b 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1649,12 +1649,10 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
 	if (pp->mpp && !c->mpcontext)
 		checker_mp_init(c, &pp->mpp->mpcontext);
 	checker_clear_message(c);
-	if (daemon) {
-		if (conf->force_sync == 0)
-			checker_set_async(c);
-		else
-			checker_set_sync(c);
-	}
+	if (conf->force_sync == 0)
+		checker_set_async(c);
+	else
+		checker_set_sync(c);
 	if (!conf->checker_timeout &&
 	    sysfs_get_timeout(pp, &(c->timeout)) <= 0)
 		c->timeout = DEF_TIMEOUT;
diff --git a/multipath/main.c b/multipath/main.c
index 4f4d8e89..aebabd9b 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -905,6 +905,7 @@ main (int argc, char *argv[])
 		exit(RTVL_FAIL);
 	multipath_conf = conf;
 	conf->retrigger_tries = 0;
+	conf->force_sync = 1;
 	while ((arg = getopt(argc, argv, ":adcChl::FfM:v:p:b:BrR:itTquUwW")) != EOF ) {
 		switch(arg) {
 		case 1: printf("optarg : %s\n",optarg);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 12/17] libmultipath: change failed path prio timeout
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (10 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 11/17] libmultipath: change how the checker async is set Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 14:51   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 13/17] multipathd: add new paths under vecs lock Benjamin Marzinski
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

multipath will try to get the priority from a PATH_DOWN path, if the
path doesn't currently have a valid priority. However, if the priority
code needs to contact the device to get the priority, this is likely to
fail for PATH_DOWN paths.  This code dates back to when multipathd could
not easily reload device tables with failed paths, so getting the
correct priority was important to have a correctly configured device.
Now multipathd can simply reload the device to move the path to the
correct pathgroup when the path comes back up.  Since there are a number
of prioritizers that don't require talking to the device, multipath
shouldn't completely skip attempting to get the priority of these paths,
but it should set a small timeout, so that it isn't hanging in the
case where it needs to contact a device through a failed path.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/discovery.c | 14 ++++++--------
 libmultipath/prio.c      |  6 +++---
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index bee5b77b..5e88d980 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1667,11 +1667,10 @@ get_state (struct path * pp, struct config *conf, int daemon, int oldstate)
 }
 
 static int
-get_prio (struct path * pp)
+get_prio (struct path * pp, int timeout)
 {
 	struct prio * p;
 	struct config *conf;
-	int checker_timeout;
 	int old_prio;
 
 	if (!pp)
@@ -1690,11 +1689,8 @@ get_prio (struct path * pp)
 			return 1;
 		}
 	}
-	conf = get_multipath_config();
-	checker_timeout = conf->checker_timeout;
-	put_multipath_config(conf);
 	old_prio = pp->priority;
-	pp->priority = prio_getprio(p, pp, checker_timeout);
+	pp->priority = prio_getprio(p, pp, timeout);
 	if (pp->priority < 0) {
 		/* this changes pp->offline, but why not */
 		int state = path_offline(pp);
@@ -2101,11 +2097,13 @@ int pathinfo(struct path *pp, struct config *conf, int mask)
 
 	 /*
 	  * Retrieve path priority, even for PATH_DOWN paths if it has never
-	  * been successfully obtained before.
+	  * been successfully obtained before. If path is down don't try
+	  * for too long.
 	  */
 	if ((mask & DI_PRIO) && path_state == PATH_UP && strlen(pp->wwid)) {
 		if (pp->state != PATH_DOWN || pp->priority == PRIO_UNDEF) {
-			get_prio(pp);
+			get_prio(pp, (pp->state != PATH_DOWN)?
+				     (conf->checker_timeout * 1000) : 10);
 		}
 	}
 
diff --git a/libmultipath/prio.c b/libmultipath/prio.c
index 87de1f97..194563c4 100644
--- a/libmultipath/prio.c
+++ b/libmultipath/prio.c
@@ -10,11 +10,11 @@
 
 static LIST_HEAD(prioritizers);
 
-unsigned int get_prio_timeout(unsigned int checker_timeout,
+unsigned int get_prio_timeout(unsigned int timeout_ms,
 			      unsigned int default_timeout)
 {
-	if (checker_timeout)
-		return checker_timeout * 1000;
+	if (timeout_ms)
+		return timeout_ms;
 	return default_timeout;
 }
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 13/17] multipathd: add new paths under vecs lock
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (11 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 12/17] libmultipath: change failed path prio timeout Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 14/17] libmultipath: add new checker class functions Benjamin Marzinski
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Right now, multipathd only ever calls pathinfo on a path outside of the
vecs lock in one circumstance, when it's adding a new path from a
uevent. Doing this can cause weirdness or crash multipathd.

First off, the path checker code is written assuming that only one
checker instance will be active at a time.  If this isn't the case, two
processes can modify the checker's refcounts at the same time,
potentially causing a checker to get removed while still in use (if two
threads try to increment the refcount at the same time, causing it to
only increment once).

Another issue is that since the vecs lock isn't grabbed until after all
of the pathinfo is gathered, and the vecs lock is the only thing keeping
the uevent servicing thread from running at the same time as a
reconfigure is happening, it is possible to have multipathd add a path
using the old config after reconfigure has already added that path using
the new config. Leading to two versions of the path on the pathvec.
There are possibly other issues with running pathinfo outside of the
vecs lock that I haven't noticed, as well.

The easiest solution is to call pathinfo on the path while holding the
vecs lock, to keep other threads from modifying the checker structure or
the config while setting up the path. This is what happens every other
time multipathd calls pathinfo, including when a path is added through
the multipathd command interface.

This does mean that any time spent calling pathinfo on new paths will
block other threads that need the vecs lock, but since the checker is
now always called in async mode, and the prio function will use a
short timeout if called on a failed path, this shouldn't be any more
noticeable than the calls to check_path() for already existing paths.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 7b364cfe..9e01cfaa 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -906,9 +906,8 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 			}
 		}
 	}
-	lock_cleanup_pop(vecs->lock);
 	if (pp)
-		return ret;
+		goto out;
 
 	/*
 	 * get path vital state
@@ -920,13 +919,13 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 	pthread_cleanup_pop(1);
 	if (!pp) {
 		if (ret == PATHINFO_SKIPPED)
-			return 0;
-		condlog(3, "%s: failed to get path info", uev->kernel);
-		return 1;
+			ret = 0;
+		else {
+			condlog(3, "%s: failed to get path info", uev->kernel);
+			ret = 1;
+		}
+		goto out;
 	}
-	pthread_cleanup_push(cleanup_lock, &vecs->lock);
-	lock(&vecs->lock);
-	pthread_testcancel();
 	ret = store_path(vecs->pathvec, pp);
 	if (!ret) {
 		conf = get_multipath_config();
@@ -940,6 +939,7 @@ uev_add_path (struct uevent *uev, struct vectors * vecs, int need_do_map)
 		free_path(pp);
 		ret = 1;
 	}
+out:
 	lock_cleanup_pop(vecs->lock);
 	return ret;
 }
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 14/17] libmultipath: add new checker class functions
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (12 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 13/17] multipathd: add new paths under vecs lock Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-05 18:58 ` [PATCH v2 15/17] libmultipath: make directio checker share io contexts Benjamin Marzinski
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This commit adds the optional functions libcheck_load, libcheck_unload,
and libcheck_reset. libcheck_load is called when a checker is first
loaded, libcheck_unload is called when it is unloaded, and
libcheck_reset is called during reconfigure, after all the current
paths have been removed. They are designed to setup, reset, and destroy
global state that all checkers of this class use. They will be used
in future commits.

Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/checkers.c | 29 ++++++++++++++++++++++++++++-
 libmultipath/checkers.h |  1 +
 multipathd/main.c       |  1 +
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index a08bf418..5c7d3004 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -18,6 +18,9 @@ struct checker_class {
 	int (*init)(struct checker *);       /* to allocate the context */
 	int (*mp_init)(struct checker *);    /* to allocate the mpcontext */
 	void (*free)(struct checker *);      /* to free the context */
+	int (*load)(void);                   /* to allocate global variables */
+	void (*unload)(void);                /* to free global variables */
+	int (*reset)(void);		     /* to reset the global variables */
 	const char **msgtable;
 	short msgtable_size;
 };
@@ -66,6 +69,8 @@ void free_checker_class(struct checker_class *c)
 	}
 	condlog(3, "unloading %s checker", c->name);
 	list_del(&c->node);
+	if (c->unload)
+		c->unload();
 	if (c->handle) {
 		if (dlclose(c->handle) != 0) {
 			condlog(0, "Cannot unload checker %s: %s",
@@ -98,6 +103,18 @@ static struct checker_class *checker_class_lookup(const char *name)
 	return NULL;
 }
 
+int reset_checker_classes(void)
+{
+	int ret = 0;
+	struct checker_class *c;
+
+	list_for_each_entry(c, &checkers, node) {
+		if (c->reset)
+			ret = ret? ret : c->reset();
+	}
+	return ret;
+}
+
 static struct checker_class *add_checker_class(const char *multipath_dir,
 					       const char *name)
 {
@@ -142,7 +159,11 @@ static struct checker_class *add_checker_class(const char *multipath_dir,
 		goto out;
 
 	c->mp_init = (int (*)(struct checker *)) dlsym(c->handle, "libcheck_mp_init");
-	/* NULL mp_init is o.k. call dlerror() to clear out any error string */
+	c->load = (int (*)(void)) dlsym(c->handle, "libcheck_load");
+	c->unload = (void (*)(void)) dlsym(c->handle, "libcheck_unload");
+	c->reset = (int (*)(void)) dlsym(c->handle, "libcheck_reset");
+	/* These 4 functions can be NULL. call dlerror() to clear out any
+	 * error string */
 	dlerror();
 
 	c->free = (void (*)(struct checker *)) dlsym(c->handle, "libcheck_free");
@@ -168,6 +189,12 @@ static struct checker_class *add_checker_class(const char *multipath_dir,
 	condlog(3, "checker %s: message table size = %d",
 		c->name, c->msgtable_size);
 
+	if (c->load && c->load() != 0) {
+		condlog(0, "%s: failed to load checker", c->name);
+		c->unload = NULL;
+		goto out;
+	}
+
 done:
 	c->sync = 1;
 	list_add(&c->node, &checkers);
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index 5237e7ec..d9930467 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -150,6 +150,7 @@ void checker_disable (struct checker *);
 int checker_check (struct checker *, int);
 int checker_is_sync(const struct checker *);
 const char *checker_name (const struct checker *);
+int reset_checker_classes(void);
 /*
  * This returns a string that's best prepended with "$NAME checker",
  * where $NAME is the return value of checker_name().
diff --git a/multipathd/main.c b/multipathd/main.c
index 9e01cfaa..8ee8a77d 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2607,6 +2607,7 @@ reconfigure (struct vectors * vecs)
 	vecs->pathvec = NULL;
 	delete_all_foreign();
 
+	reset_checker_classes();
 	/* Re-read any timezone changes */
 	tzset();
 
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 15/17] libmultipath: make directio checker share io contexts
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (13 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 14/17] libmultipath: add new checker class functions Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 17:08   ` Martin Wilck
  2020-02-11  8:42   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 16/17] tests: add directio unit tests Benjamin Marzinski
                   ` (2 subsequent siblings)
  17 siblings, 2 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

On systems with a large number of cores (>500), io_destroy() can take
tens to hundreds of milliseconds to complete, due to RCU
synchronization. If there are a large number of paths using the directio
checker on such a system, this can lead to multipath taking almost a
minute to complete, with the vast majority of time taken up by
io_destroy().

To solve this, the directio checker now allocates one aio context for
every 1024 checkers. This reduces the io_destroy() delay to a thousandth
of its previous level. However, this means that muliple checkers are
sharing the same aio context, and must be able to handle getting results
for other checkers.  Because only one checker is ever running at a
time, this doesn't require any locking.  However, locking could be added
in the future if necessary, to allow multiple checkers to run at the
same time.

When checkers are freed, they usually no longer destroy the io context.
Instead, they attempt to cancel any outstanding request. If that fails,
they put the request on an orphan list, so that it can be freed by other
checkers, once it has completed. IO contexts are only destroyed at three
times, during reconfigure (to deal with the possibility that multipathd
is holding more aio events than it needs to be, since there is a single
limit for the whole system), when the checker class is unloaded, and
in a corner case when checkers are freed. If an aio_group (which
contains the aio context) is entirely full of orphaned requests, then
no checker can use it. Since no checker is using it, there is no checker
to clear out the orphaned requests. In this (likely rare) case, the
last checker from that group to be freed and leave behind an orphaned
request will call io_destroy() and remove the group.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/checkers/directio.c | 336 +++++++++++++++++++++++++------
 multipath/multipath.conf.5       |   7 +-
 2 files changed, 281 insertions(+), 62 deletions(-)

diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
index 1b00b775..740c68e5 100644
--- a/libmultipath/checkers/directio.c
+++ b/libmultipath/checkers/directio.c
@@ -17,6 +17,33 @@
 
 #include "checkers.h"
 #include "../libmultipath/debug.h"
+#include "../libmultipath/time-util.h"
+
+#define AIO_GROUP_SIZE 1024
+
+/* Note: This checker type relies on the fact that only one checker can be run
+ * at a time, since multiple checkers share the same aio_group, and must be
+ * able to modify other checker's async_reqs. If multple checkers become able
+ * to be run at the same time, this checker will need to add locking, and
+ * probably polling on event fds, to deal with that */
+
+struct aio_group {
+	struct list_head node;
+	int holders;
+	io_context_t ioctx;
+	struct list_head orphans;
+};
+
+struct async_req {
+	struct iocb io;
+	int blksize;
+	unsigned char *	buf;
+	unsigned char * ptr;
+	struct list_head node;
+	int state; /* PATH_REMOVED means this is an orphan */
+};
+
+static LIST_HEAD(aio_grp_list);
 
 enum {
 	MSG_DIRECTIO_UNKNOWN = CHECKER_FIRST_MSGID,
@@ -37,18 +64,136 @@ const char *libcheck_msgtable[] = {
 struct directio_context {
 	int		running;
 	int		reset_flags;
-	int		blksize;
-	unsigned char *	buf;
-	unsigned char * ptr;
-	io_context_t	ioctx;
-	struct iocb	io;
+	struct aio_group *aio_grp;
+	struct async_req *req;
 };
 
+static struct aio_group *
+add_aio_group(void)
+{
+	struct aio_group *aio_grp;
+
+	aio_grp = malloc(sizeof(struct aio_group));
+	if (!aio_grp)
+		return NULL;
+	memset(aio_grp, 0, sizeof(struct aio_group));
+	INIT_LIST_HEAD(&aio_grp->orphans);
+
+	if (io_setup(AIO_GROUP_SIZE, &aio_grp->ioctx) != 0) {
+		LOG(1, "io_setup failed");
+		if (errno == EAGAIN)
+			LOG(1, "global number of io events too small. Increase fs.aio-max-nr with sysctl");
+		free(aio_grp);
+		return NULL;
+	}
+	list_add(&aio_grp->node, &aio_grp_list);
+	return aio_grp;
+}
+
+static int
+set_aio_group(struct directio_context *ct)
+{
+	struct aio_group *aio_grp = NULL;
+
+	list_for_each_entry(aio_grp, &aio_grp_list, node)
+		if (aio_grp->holders < AIO_GROUP_SIZE)
+			goto found;
+	aio_grp = add_aio_group();
+	if (!aio_grp) {
+		ct->aio_grp = NULL;
+		return -1;
+	}
+found:
+	aio_grp->holders++;
+	ct->aio_grp = aio_grp;
+	return 0;
+}
+
+static void
+remove_aio_group(struct aio_group *aio_grp)
+{
+	struct async_req *req, *tmp;
+
+	io_destroy(aio_grp->ioctx);
+	list_for_each_entry_safe(req, tmp, &aio_grp->orphans, node) {
+		list_del(&req->node);
+		free(req->buf);
+		free(req);
+	}
+	list_del(&aio_grp->node);
+	free(aio_grp);
+}
+
+/* If an aio_group is completely full of orphans, then no checkers can
+ * use it, which means that no checkers can clear out the orphans. To
+ * avoid keeping the useless group around, simply remove remove the
+ * group */
+static void
+check_orphaned_group(struct aio_group *aio_grp)
+{
+	int count = 0;
+	struct list_head *item;
+
+	if (aio_grp->holders < AIO_GROUP_SIZE)
+		return;
+	list_for_each(item, &aio_grp->orphans)
+		count++;
+	if (count >= AIO_GROUP_SIZE) {
+		remove_aio_group(aio_grp);
+		if (list_empty(&aio_grp_list))
+			add_aio_group();
+	}
+}
+
+int libcheck_load (void)
+{
+	if (add_aio_group() == NULL) {
+		LOG(1, "libcheck_load failed: %s", strerror(errno));
+		return 1;
+	}
+	return 0;
+}
+
+void libcheck_unload (void)
+{
+	struct aio_group *aio_grp, *tmp;
+
+	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node)
+		remove_aio_group(aio_grp);
+}
+
+int libcheck_reset (void)
+{
+	struct aio_group *aio_grp, *tmp, *reset_grp = NULL;
+
+	/* If a clean existing aio_group exists, use that. Otherwise add a
+	 * new one */
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		if (aio_grp->holders == 0 &&
+		    list_empty(&aio_grp->orphans)) {
+			reset_grp = aio_grp;
+			break;
+		}
+	}
+	if (!reset_grp)
+		reset_grp = add_aio_group();
+	if (!reset_grp) {
+		LOG(1, "checker reset failed");
+		return 1;
+	}
+
+	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node) {
+		if (aio_grp != reset_grp)
+			remove_aio_group(aio_grp);
+	}
+	return 0;
+}
 
 int libcheck_init (struct checker * c)
 {
 	unsigned long pgsize = getpagesize();
 	struct directio_context * ct;
+	struct async_req *req = NULL;
 	long flags;
 
 	ct = malloc(sizeof(struct directio_context));
@@ -56,26 +201,31 @@ int libcheck_init (struct checker * c)
 		return 1;
 	memset(ct, 0, sizeof(struct directio_context));
 
-	if (io_setup(1, &ct->ioctx) != 0) {
-		condlog(1, "io_setup failed");
-		free(ct);
-		return 1;
+	if (set_aio_group(ct) < 0)
+		goto out;
+
+	req = malloc(sizeof(struct async_req));
+	if (!req) {
+		goto out;
 	}
+	memset(req, 0, sizeof(struct async_req));
+	INIT_LIST_HEAD(&req->node);
 
-	if (ioctl(c->fd, BLKBSZGET, &ct->blksize) < 0) {
+	if (ioctl(c->fd, BLKBSZGET, &req->blksize) < 0) {
 		c->msgid = MSG_DIRECTIO_BLOCKSIZE;
-		ct->blksize = 512;
+		req->blksize = 512;
 	}
-	if (ct->blksize > 4096) {
+	if (req->blksize > 4096) {
 		/*
 		 * Sanity check for DASD; BSZGET is broken
 		 */
-		ct->blksize = 4096;
+		req->blksize = 4096;
 	}
-	if (!ct->blksize)
+	if (!req->blksize)
 		goto out;
-	ct->buf = (unsigned char *)malloc(ct->blksize + pgsize);
-	if (!ct->buf)
+
+	req->buf = (unsigned char *)malloc(req->blksize + pgsize);
+	if (!req->buf)
 		goto out;
 
 	flags = fcntl(c->fd, F_GETFL);
@@ -88,17 +238,22 @@ int libcheck_init (struct checker * c)
 		ct->reset_flags = 1;
 	}
 
-	ct->ptr = (unsigned char *) (((unsigned long)ct->buf + pgsize - 1) &
+	req->ptr = (unsigned char *) (((unsigned long)req->buf + pgsize - 1) &
 		  (~(pgsize - 1)));
 
 	/* Successfully initialized, return the context. */
+	ct->req = req;
 	c->context = (void *) ct;
 	return 0;
 
 out:
-	if (ct->buf)
-		free(ct->buf);
-	io_destroy(ct->ioctx);
+	if (req) {
+		if (req->buf)
+			free(req->buf);
+		free(req);
+	}
+	if (ct->aio_grp)
+		ct->aio_grp->holders--;
 	free(ct);
 	return 1;
 }
@@ -106,6 +261,7 @@ out:
 void libcheck_free (struct checker * c)
 {
 	struct directio_context * ct = (struct directio_context *)c->context;
+	struct io_event event;
 	long flags;
 
 	if (!ct)
@@ -121,20 +277,72 @@ void libcheck_free (struct checker * c)
 		}
 	}
 
-	if (ct->buf)
-		free(ct->buf);
-	io_destroy(ct->ioctx);
+	if (ct->running &&
+	    (ct->req->state != PATH_PENDING ||
+	     io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event) == 0))
+		ct->running = 0;
+	if (!ct->running) {
+		free(ct->req->buf);
+		free(ct->req);
+		ct->aio_grp->holders--;
+	} else {
+		ct->req->state = PATH_REMOVED;
+		list_add(&ct->req->node, &ct->aio_grp->orphans);
+		check_orphaned_group(ct->aio_grp);
+	}
+
 	free(ct);
+	c->context = NULL;
+}
+
+static int
+get_events(struct aio_group *aio_grp, struct timespec *timeout)
+{
+	struct io_event events[128];
+	int i, nr, got_events = 0;
+	struct timespec zero_timeout = {0};
+	struct timespec *timep = (timeout)? timeout : &zero_timeout;
+
+	do {
+		errno = 0;
+		nr = io_getevents(aio_grp->ioctx, 1, 128, events, timep);
+		got_events |= (nr > 0);
+
+		for (i = 0; i < nr; i++) {
+			struct async_req *req = container_of(events[i].obj, struct async_req, io);
+
+			LOG(3, "io finished %lu/%lu", events[i].res,
+			    events[i].res2);
+
+			/* got an orphaned request */
+			if (req->state == PATH_REMOVED) {
+				list_del(&req->node);
+				free(req->buf);
+				free(req);
+				aio_grp->holders--;
+			} else
+				req->state = (events[i].res == req->blksize) ?
+					      PATH_UP : PATH_DOWN;
+		}
+		timep = &zero_timeout;
+	} while (nr == 128); /* assume there are more events and try again */
+
+	if (nr < 0)
+		LOG(3, "async io getevents returned %i (errno=%s)",
+		    nr, strerror(errno));
+
+	return got_events;
 }
 
 static int
 check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
 {
 	struct timespec	timeout = { .tv_nsec = 5 };
-	struct io_event event;
 	struct stat	sb;
-	int		rc = PATH_UNCHECKED;
+	int		rc;
 	long		r;
+	struct timespec currtime, endtime;
+	struct timespec *timep = &timeout;
 
 	if (fstat(fd, &sb) == 0) {
 		LOG(4, "called for %x", (unsigned) sb.st_rdev);
@@ -145,50 +353,60 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
 		timeout.tv_nsec = 0;
 	}
 
-	if (!ct->running) {
-		struct iocb *ios[1] = { &ct->io };
+	if (ct->running) {
+		if (ct->req->state != PATH_PENDING) {
+			ct->running = 0;
+			return ct->req->state;
+		}
+	} else {
+		struct iocb *ios[1] = { &ct->req->io };
 
 		LOG(3, "starting new request");
-		memset(&ct->io, 0, sizeof(struct iocb));
-		io_prep_pread(&ct->io, fd, ct->ptr, ct->blksize, 0);
-		if (io_submit(ct->ioctx, 1, ios) != 1) {
+		memset(&ct->req->io, 0, sizeof(struct iocb));
+		io_prep_pread(&ct->req->io, fd, ct->req->ptr,
+			      ct->req->blksize, 0);
+		ct->req->state = PATH_PENDING;
+		if (io_submit(ct->aio_grp->ioctx, 1, ios) != 1) {
 			LOG(3, "io_submit error %i", errno);
 			return PATH_UNCHECKED;
 		}
 	}
 	ct->running++;
 
-	errno = 0;
-	r = io_getevents(ct->ioctx, 1L, 1L, &event, &timeout);
+	get_monotonic_time(&endtime);
+	endtime.tv_sec += timeout.tv_sec;
+	endtime.tv_nsec += timeout.tv_nsec;
+	normalize_timespec(&endtime);
+	while(1) {
+		r = get_events(ct->aio_grp, timep);
 
-	if (r < 0 ) {
-		LOG(3, "async io getevents returned %li (errno=%s)", r,
-		    strerror(errno));
-		ct->running = 0;
-		rc = PATH_UNCHECKED;
-	} else if (r < 1L) {
-		if (ct->running > timeout_secs || sync) {
-			struct iocb *ios[1] = { &ct->io };
-
-			LOG(3, "abort check on timeout");
-			r = io_cancel(ct->ioctx, ios[0], &event);
-			/*
-			 * Only reset ct->running if we really
-			 * could abort the pending I/O
-			 */
-			if (r)
-				LOG(3, "io_cancel error %i", errno);
-			else
-				ct->running = 0;
-			rc = PATH_DOWN;
-		} else {
-			LOG(3, "async io pending");
-			rc = PATH_PENDING;
-		}
+		if (ct->req->state != PATH_PENDING) {
+			ct->running = 0;
+			return ct->req->state;
+		} else if (r == 0 || !timep)
+			break;
+
+		get_monotonic_time(&currtime);
+		timespecsub(&endtime, &currtime, &timeout);
+		if (timeout.tv_sec < 0)
+			timep = NULL;
+	}
+	if (ct->running > timeout_secs || sync) {
+		struct io_event event;
+
+		LOG(3, "abort check on timeout");
+
+		r = io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
+		/*
+		 * Only reset ct->running if we really
+		 * could abort the pending I/O
+		 */
+		if (!r)
+			ct->running = 0;
+		rc = PATH_DOWN;
 	} else {
-		LOG(3, "io finished %lu/%lu", event.res, event.res2);
-		ct->running = 0;
-		rc = (event.res == ct->blksize) ? PATH_UP : PATH_DOWN;
+		LOG(3, "async io pending");
+		rc = PATH_PENDING;
 	}
 
 	return rc;
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index dc103fd8..05a5e8ff 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -494,9 +494,10 @@ Check the path state for LSI/Engenio/NetApp RDAC class as NetApp SANtricity E/EF
 Series, and OEM arrays from IBM DELL SGI STK and SUN.
 .TP
 .I directio
-(Deprecated) Read the first sector with direct I/O. This checker is being
-deprecated, it could cause spurious path failures under high load.
-Please use \fItur\fR instead.
+(Deprecated) Read the first sector with direct I/O. If you have a large number
+of paths, or many AIO users on a system, you may need to use sysctl to
+increase fs.aio-max-nr. This checker is being deprecated, it could cause
+spurious path failures under high load. Please use \fItur\fR instead.
 .TP
 .I cciss_tur
 (Hardware-dependent)
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 16/17] tests: add directio unit tests
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (14 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 15/17] libmultipath: make directio checker share io contexts Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 17:23   ` Martin Wilck
  2020-02-05 18:58 ` [PATCH v2 17/17] tests: make directio tests able to work on a real device Benjamin Marzinski
  2020-02-11  8:45 ` [PATCH v2 00/17] Multipath patch dump Martin Wilck
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 tests/Makefile   |   3 +-
 tests/directio.c | 704 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 706 insertions(+), 1 deletion(-)
 create mode 100644 tests/directio.c

diff --git a/tests/Makefile b/tests/Makefile
index a5cdf390..275fdd7d 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -3,7 +3,8 @@ include ../Makefile.inc
 CFLAGS += $(BIN_CFLAGS) -I$(multipathdir) -I$(mpathcmddir)
 LIBDEPS += -L$(multipathdir) -lmultipath -lcmocka
 
-TESTS := uevent parser util dmevents hwtable blacklist unaligned vpd pgpolicy
+TESTS := uevent parser util dmevents hwtable blacklist unaligned vpd pgpolicy \
+	 directio
 
 .SILENT: $(TESTS:%=%.o)
 .PRECIOUS: $(TESTS:%=%-test)
diff --git a/tests/directio.c b/tests/directio.c
new file mode 100644
index 00000000..5f067e24
--- /dev/null
+++ b/tests/directio.c
@@ -0,0 +1,704 @@
+/*
+ * Copyright (c) 2018 Benjamin Marzinski, Redhat
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <https://www.gnu.org/licenses/>.
+ *
+ */
+
+#define _GNU_SOURCE
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdarg.h>
+#include <stddef.h>
+#include <setjmp.h>
+#include <stdlib.h>
+#define UNIT_TESTING /* enable memory testing in directio.c */
+#include <cmocka.h>
+#include "globals.c"
+#include "../libmultipath/checkers/directio.c"
+
+int test_fd = 111;
+int ioctx_count = 0;
+struct io_event mock_events[AIO_GROUP_SIZE]; /* same as the checker max */
+int ev_off = 0;
+struct timespec zero_timeout = {0};
+struct timespec full_timeout = { .tv_sec = -1 };
+
+int __wrap_ioctl(int fd, unsigned long request, void *argp)
+{
+	int *blocksize = (int *)argp;
+
+	assert_int_equal(fd, test_fd);
+	assert_int_equal(request, BLKBSZGET);
+	assert_non_null(blocksize);
+	*blocksize = mock_type(int);
+	return 0;
+}
+
+int __wrap_fcntl(int fd, int cmd, long arg)
+{
+	assert_int_equal(fd, test_fd);
+	assert_int_equal(cmd, F_GETFL);
+	return O_DIRECT;
+}
+
+int __wrap___fxstat(int ver, int fd, struct stat *statbuf)
+{
+	assert_int_equal(fd, test_fd);
+	assert_non_null(statbuf);
+	memset(statbuf, 0, sizeof(struct stat));
+	return 0;
+}
+
+int __wrap_io_setup(int maxevents, io_context_t *ctxp)
+{
+	ioctx_count++;
+	return mock_type(int);
+}
+
+int __wrap_io_destroy(io_context_t ctx)
+{
+	ioctx_count--;
+	return mock_type(int);
+}
+
+int __wrap_io_submit(io_context_t ctx, long nr, struct iocb *ios[])
+{
+	return mock_type(int);
+}
+
+int __wrap_io_cancel(io_context_t ctx, struct iocb *iocb, struct io_event *evt)
+{
+	return mock_type(int);
+}
+
+int __wrap_io_getevents(io_context_t ctx, long min_nr, long nr,
+			struct io_event *events, struct timespec *timeout)
+{
+	int i, nr_evs;
+	struct timespec *sleep_tmo;
+	struct io_event *evs;
+
+	assert_non_null(timeout);
+	nr_evs = mock_type(int);
+	assert_true(nr_evs <= nr);
+	if (!nr_evs)
+		return 0;
+	sleep_tmo = mock_ptr_type(struct timespec *);
+	if (sleep_tmo) {
+		if (sleep_tmo->tv_sec < 0)
+			nanosleep(timeout, NULL);
+		else
+			nanosleep(sleep_tmo, NULL);
+	}
+	if (nr_evs < 0) {
+		errno = -nr_evs;
+		return -1;
+	}
+	evs = mock_ptr_type(struct io_event *);
+	for (i = 0; i < nr_evs; i++)
+		events[i] = evs[i];
+	ev_off -= nr_evs;
+	return nr_evs;
+}
+
+static void return_io_getevents_none(void)
+{
+	will_return(__wrap_io_getevents, 0);
+}
+
+static void return_io_getevents_nr(struct timespec *ts, int nr,
+				   struct async_req **reqs, int *res)
+{
+	int i, off = 0;
+
+	for(i = 0; i < nr; i++) {
+		mock_events[i + ev_off].obj = &reqs[i]->io;
+		if (res[i] == 0)
+			mock_events[i + ev_off].res = reqs[i]->blksize;
+	}
+	while (nr > 0) {
+		will_return(__wrap_io_getevents, (nr > 128)? 128 : nr);
+		will_return(__wrap_io_getevents, ts);
+		will_return(__wrap_io_getevents, &mock_events[off + ev_off]);
+		ts = NULL;
+		off += 128;
+		nr -= 128;
+	}
+	if (nr == 0)
+		will_return(__wrap_io_getevents, 0);
+	ev_off += i;
+}
+
+void do_check_state(struct checker *c, int sync, int timeout, int chk_state)
+{
+	struct directio_context * ct = (struct directio_context *)c->context;
+
+	if (!ct->running)
+		will_return(__wrap_io_submit, 1);
+	assert_int_equal(check_state(test_fd, ct, sync, timeout), chk_state);
+	assert_int_equal(ev_off, 0);
+	memset(mock_events, 0, sizeof(mock_events));
+}
+
+void do_libcheck_unload(int nr_aio_grps)
+{
+	int count = 0;
+	struct aio_group *aio_grp;
+
+	list_for_each_entry(aio_grp, &aio_grp_list, node)
+		count++;
+	assert_int_equal(count, nr_aio_grps);
+	for (count = 0; count < nr_aio_grps; count++)
+		will_return(__wrap_io_destroy, 0);
+	libcheck_unload();
+	assert_true(list_empty(&aio_grp_list));
+	assert_int_equal(ioctx_count, 0);
+}
+
+static void do_libcheck_init(struct checker *c, int blocksize,
+			     struct async_req **req)
+{
+	struct directio_context * ct;
+
+	c->fd = test_fd;
+	will_return(__wrap_ioctl, blocksize);
+	assert_int_equal(libcheck_init(c), 0);
+	ct = (struct directio_context *)c->context;
+	assert_non_null(ct);
+	assert_non_null(ct->aio_grp);
+	assert_non_null(ct->req);
+	if (req)
+		*req = ct->req;
+	assert_int_equal(ct->req->blksize, blocksize);
+}
+
+static int is_checker_running(struct checker *c)
+{
+	struct directio_context * ct = (struct directio_context *)c->context;
+	return ct->running;
+}
+
+static struct aio_group *get_aio_grp(struct checker *c)
+{
+	struct directio_context * ct = (struct directio_context *)c->context;
+
+	assert_non_null(ct);
+	return ct->aio_grp;
+}
+
+static void check_aio_grp(struct aio_group *aio_grp, int holders,
+			  int orphans)
+{
+	int count = 0;
+	struct list_head *item;
+
+	list_for_each(item, &aio_grp->orphans)
+		count++;
+	assert_int_equal(holders, aio_grp->holders);
+	assert_int_equal(orphans, count);
+}
+
+static void do_libcheck_load(void)
+{
+	int count = 0;
+	struct aio_group *aio_grp;
+
+	assert_true(list_empty(&aio_grp_list));
+	will_return(__wrap_io_setup, 0);
+	assert_int_equal(libcheck_load(), 0);
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		assert_int_equal(aio_grp->holders, 0);
+		count++;
+	}
+	assert_int_equal(count, 1);
+}
+
+/* simple loading resetting and unloading test */
+static void test_load_reset_unload(void **state)
+{
+	struct list_head *item;
+	do_libcheck_load();
+	item = aio_grp_list.next;
+	assert_int_equal(libcheck_reset(), 0);
+	assert_false(list_empty(&aio_grp_list));
+	assert_true(item == aio_grp_list.next);
+	do_libcheck_unload(1);
+}
+
+/* test initializing and then freeing 4096 checkers */
+static void test_init_free(void **state)
+{
+	int i, count = 0;
+	struct checker c[4096] = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	aio_grp = list_entry(aio_grp_list.next, typeof(*aio_grp), node);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	for (i = 0; i < 4096; i++) {
+		struct directio_context * ct;
+
+		if (i % 3 == 0)
+			do_libcheck_init(&c[i], 512, NULL);
+		else if (i % 3 == 1)
+			do_libcheck_init(&c[i], 1024, NULL);
+		else
+			do_libcheck_init(&c[i], 4096, NULL);
+		ct = (struct directio_context *)c[i].context;
+		assert_non_null(ct->aio_grp);
+		if (i && (i & 1023) == 0)
+			aio_grp = ct->aio_grp;
+		else {
+			assert_ptr_equal(ct->aio_grp, aio_grp);
+			assert_int_equal(aio_grp->holders, (i & 1023) + 1);
+		}
+	}
+	count = 0;
+	list_for_each_entry(aio_grp, &aio_grp_list, node)
+		count++;
+	assert_int_equal(count, 4);
+	for (i = 0; i < 4096; i++) {
+		struct directio_context * ct = (struct directio_context *)c[i].context;
+
+		aio_grp = ct->aio_grp;
+		libcheck_free(&c[i]);
+		assert_int_equal(aio_grp->holders, 1023 - (i & 1023));
+	}
+	count = 0;
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		assert_int_equal(aio_grp->holders, 0);
+		count++;
+	}
+	assert_int_equal(count, 4);
+	will_return(__wrap_io_destroy, 0);
+	will_return(__wrap_io_destroy, 0);
+	will_return(__wrap_io_destroy, 0);
+	assert_int_equal(libcheck_reset(), 0);
+	do_libcheck_unload(1);
+}
+
+/* check mixed initializing and freeing 4096 checkers */
+static void test_multi_init_free(void **state)
+{
+	int i, count;
+	struct checker c[4096] = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_setup, 0);
+	for (count = 0, i = 0; i < 4096; count++) {
+		/* usually init, but occasionally free checkers */
+		if (count == 0 || (count % 5 != 0 && count % 7 != 0)) {
+			do_libcheck_init(&c[i], 4096, NULL);
+			i++;
+		} else {
+			i--;
+			libcheck_free(&c[i]);
+		}
+	}
+	count = 0;
+	list_for_each_entry(aio_grp, &aio_grp_list, node) {
+		assert_int_equal(aio_grp->holders, 1024);
+		count++;
+	}
+	assert_int_equal(count, 4);
+	for (count = 0, i = 4096; i > 0; count++) {
+		/* usually free, but occasionally init checkers */
+		if (count == 0 || (count % 5 != 0 && count % 7 != 0)) {
+			i--;
+			libcheck_free(&c[i]);
+		} else {
+			do_libcheck_init(&c[i], 4096, NULL);
+			i++;
+		}
+	}
+	do_libcheck_unload(4);
+}
+
+/* simple single checker sync test */
+static void test_check_state_simple(void **state)
+{
+	struct checker c = {0};
+	struct async_req *req;
+	int res = 0;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, &req);
+	return_io_getevents_nr(NULL, 1, &req, &res);
+	do_check_state(&c, 1, 30, PATH_UP);
+	libcheck_free(&c);
+	do_libcheck_unload(1);
+}
+
+/* test sync timeout */
+static void test_check_state_timeout(void **state)
+{
+	struct checker c = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, 0);
+	do_check_state(&c, 1, 30, PATH_DOWN);
+	check_aio_grp(aio_grp, 1, 0);
+	libcheck_free(&c);
+	do_libcheck_unload(1);
+}
+
+/* test async timeout */
+static void test_check_state_async_timeout(void **state)
+{
+	struct checker c = {0};
+	struct aio_group *aio_grp;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 3, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 3, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 3, PATH_PENDING);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, 0);
+	do_check_state(&c, 0, 3, PATH_DOWN);
+	check_aio_grp(aio_grp, 1, 0);
+	libcheck_free(&c);
+	do_libcheck_unload(1);
+}
+
+/* test freeing checkers with outstanding requests */
+static void test_free_with_pending(void **state)
+{
+        struct checker c[2] = {0};
+        struct aio_group *aio_grp;
+	struct async_req *req;
+	int res = 0;
+
+        do_libcheck_load();
+        do_libcheck_init(&c[0], 4096, &req);
+	do_libcheck_init(&c[1], 4096, NULL);
+        aio_grp = get_aio_grp(c);
+        return_io_getevents_none();
+        do_check_state(&c[0], 0, 30, PATH_PENDING);
+	return_io_getevents_nr(NULL, 1, &req, &res);
+	return_io_getevents_none();
+	do_check_state(&c[1], 0, 30, PATH_PENDING);
+	assert_true(is_checker_running(&c[0]));
+	assert_true(is_checker_running(&c[1]));
+	check_aio_grp(aio_grp, 2, 0);
+        libcheck_free(&c[0]);
+	check_aio_grp(aio_grp, 1, 0);
+        will_return(__wrap_io_cancel, 0);
+        libcheck_free(&c[1]);
+        check_aio_grp(aio_grp, 0, 0);
+        do_libcheck_unload(1);
+}
+
+/* test removing orpahed aio_group on free */
+static void test_orphaned_aio_group(void **state)
+{
+	struct checker c[AIO_GROUP_SIZE] = {0};
+	struct aio_group *aio_grp, *tmp_grp;
+	int i;
+
+        do_libcheck_load();
+	for (i = 0; i < AIO_GROUP_SIZE; i++) {
+		do_libcheck_init(&c[i], 4096, NULL);
+		return_io_getevents_none();
+		do_check_state(&c[i], 0, 30, PATH_PENDING);
+	}
+	aio_grp = get_aio_grp(c);
+	check_aio_grp(aio_grp, AIO_GROUP_SIZE, 0);
+	i = 0;
+	list_for_each_entry(tmp_grp, &aio_grp_list, node)
+		i++;
+	assert_int_equal(i, 1);
+	for (i = 0; i < AIO_GROUP_SIZE; i++) {
+		assert_true(is_checker_running(&c[i]));
+		will_return(__wrap_io_cancel, -1);
+		if (i == AIO_GROUP_SIZE - 1) {
+			/* remove the orphaned group and create a new one */
+			will_return(__wrap_io_destroy, 0);
+			will_return(__wrap_io_setup, 0);
+		}
+		libcheck_free(&c[i]);
+	}
+	i = 0;
+	list_for_each_entry(tmp_grp, &aio_grp_list, node) {
+		i++;
+		check_aio_grp(aio_grp, 0, 0);
+	}
+	assert_int_equal(i, 1);
+        do_libcheck_unload(1);
+}
+
+/* test sync timeout with failed cancel and cleanup by another
+ * checker */
+static void test_timeout_cancel_failed(void **state)
+{
+	struct checker c[2] = {0};
+	struct aio_group *aio_grp;
+	struct async_req *reqs[2];
+	int res[] = {0,0};
+	int i;
+
+	do_libcheck_load();
+	for (i = 0; i < 2; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	aio_grp = get_aio_grp(c);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 1, 30, PATH_DOWN);
+	assert_true(is_checker_running(&c[0]));
+	check_aio_grp(aio_grp, 2, 0);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 1, 30, PATH_DOWN);
+	assert_true(is_checker_running(&c[0]));
+	return_io_getevents_nr(NULL, 2, reqs, res);
+	do_check_state(&c[1], 1, 30, PATH_UP);
+	do_check_state(&c[0], 1, 30, PATH_UP);
+	for (i = 0; i < 2; i++) {
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+/* test async timeout with failed cancel and cleanup by another
+ * checker */
+static void test_async_timeout_cancel_failed(void **state)
+{
+	struct checker c[2] = {0};
+	struct async_req *reqs[2];
+	int res[] = {0,0};
+	int i;
+
+	do_libcheck_load();
+	for (i = 0; i < 2; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	return_io_getevents_none();
+	do_check_state(&c[0], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c[1], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c[0], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	do_check_state(&c[1], 0, 2, PATH_PENDING);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 0, 2, PATH_DOWN);
+	return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
+	do_check_state(&c[1], 0, 2, PATH_UP);
+	return_io_getevents_none();
+	will_return(__wrap_io_cancel, -1);
+	do_check_state(&c[0], 0, 2, PATH_DOWN);
+	assert_true(is_checker_running(&c[0]));
+	return_io_getevents_nr(NULL, 2, reqs, res);
+	do_check_state(&c[1], 0, 2, PATH_UP);
+	do_check_state(&c[0], 0, 2, PATH_UP);
+	for (i = 0; i < 2; i++) {
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+/* test orphaning a request, and having another checker clean it up */
+static void test_orphan_checker_cleanup(void **state)
+{
+	struct checker c[2] = {0};
+	struct async_req *reqs[2];
+	int res[] = {0,0};
+	struct aio_group *aio_grp;
+	int i;
+
+	do_libcheck_load();
+	for (i = 0; i < 2; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	aio_grp = get_aio_grp(c);
+	return_io_getevents_none();
+	do_check_state(&c[0], 0, 30, PATH_PENDING);
+	will_return(__wrap_io_cancel, -1);
+	check_aio_grp(aio_grp, 2, 0);
+	libcheck_free(&c[0]);
+	check_aio_grp(aio_grp, 2, 1);
+	return_io_getevents_nr(NULL, 2, reqs, res);
+	do_check_state(&c[1], 0, 2, PATH_UP);
+	check_aio_grp(aio_grp, 1, 0);
+	libcheck_free(&c[1]);
+	check_aio_grp(aio_grp, 0, 0);
+	do_libcheck_unload(1);
+}
+
+/* test orphaning a request, and having reset clean it up */
+static void test_orphan_reset_cleanup(void **state)
+{
+	struct checker c;
+	struct aio_group *orphan_aio_grp, *tmp_aio_grp;
+	int found, count;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	orphan_aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 30, PATH_PENDING);
+	will_return(__wrap_io_cancel, -1);
+	check_aio_grp(orphan_aio_grp, 1, 0);
+	libcheck_free(&c);
+	check_aio_grp(orphan_aio_grp, 1, 1);
+	found = count = 0;
+	list_for_each_entry(tmp_aio_grp, &aio_grp_list, node) {
+		count++;
+		if (tmp_aio_grp == orphan_aio_grp)
+			found = 1;
+	}
+	assert_int_equal(count, 1);
+	assert_int_equal(found, 1);
+	will_return(__wrap_io_setup, 0);
+	will_return(__wrap_io_destroy, 0);
+	assert_int_equal(libcheck_reset(), 0);
+	found = count = 0;
+	list_for_each_entry(tmp_aio_grp, &aio_grp_list, node) {
+		count++;
+		check_aio_grp(tmp_aio_grp, 0, 0);
+		if (tmp_aio_grp == orphan_aio_grp)
+			found = 1;
+	}
+	assert_int_equal(count, 1);
+	assert_int_equal(found, 0);
+	do_libcheck_unload(1);
+}
+
+/* test orphaning a request, and having unload clean it up */
+static void test_orphan_unload_cleanup(void **state)
+{
+	struct checker c;
+	struct aio_group *orphan_aio_grp, *tmp_aio_grp;
+	int found, count;
+
+	do_libcheck_load();
+	do_libcheck_init(&c, 4096, NULL);
+	orphan_aio_grp = get_aio_grp(&c);
+	return_io_getevents_none();
+	do_check_state(&c, 0, 30, PATH_PENDING);
+	will_return(__wrap_io_cancel, -1);
+	check_aio_grp(orphan_aio_grp, 1, 0);
+	libcheck_free(&c);
+	check_aio_grp(orphan_aio_grp, 1, 1);
+	found = count = 0;
+	list_for_each_entry(tmp_aio_grp, &aio_grp_list, node) {
+		count++;
+		if (tmp_aio_grp == orphan_aio_grp)
+			found = 1;
+	}
+	assert_int_equal(count, 1);
+	assert_int_equal(found, 1);
+	do_libcheck_unload(1);
+}
+
+/* test checkers with different blocksizes */
+static void test_check_state_blksize(void **state)
+{
+	int i;
+	struct checker c[3] = {0};
+	int blksize[] = {4096, 1024, 512};
+	struct async_req *reqs[3];
+	int res[] = {0,1,0};
+	int chk_state[] = {PATH_UP, PATH_DOWN, PATH_UP};
+
+	do_libcheck_load();
+	for (i = 0; i < 3; i++)
+		do_libcheck_init(&c[i], blksize[i], &reqs[i]);
+	for (i = 0; i < 3; i++) {
+		return_io_getevents_nr(NULL, 1, &reqs[i], &res[i]);
+		do_check_state(&c[i], 1, 30, chk_state[i]);
+	}
+	for (i = 0; i < 3; i++) {
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+/* test async checkers pending and getting resovled by another checker
+ * as well as the loops for getting multiple events */
+static void test_check_state_async(void **state)
+{
+	int i;
+	struct checker c[257] = {0};
+	struct async_req *reqs[257];
+	int res[257] = {0};
+
+	do_libcheck_load();
+	for (i = 0; i < 257; i++)
+		do_libcheck_init(&c[i], 4096, &reqs[i]);
+	for (i = 0; i < 256; i++) {
+		return_io_getevents_none();
+		do_check_state(&c[i], 0, 30, PATH_PENDING);
+		assert_true(is_checker_running(&c[i]));
+	}
+	return_io_getevents_nr(&full_timeout, 256, reqs, res);
+	return_io_getevents_nr(NULL, 1, &reqs[256], &res[256]);
+	do_check_state(&c[256], 0, 30, PATH_UP);
+	assert_false(is_checker_running(&c[256]));
+	libcheck_free(&c[256]);
+	for (i = 0; i < 256; i++) {
+		do_check_state(&c[i], 0, 30, PATH_UP);
+		assert_false(is_checker_running(&c[i]));
+		libcheck_free(&c[i]);
+	}
+	do_libcheck_unload(1);
+}
+
+int test_directio(void)
+{
+	const struct CMUnitTest tests[] = {
+		cmocka_unit_test(test_load_reset_unload),
+		cmocka_unit_test(test_init_free),
+		cmocka_unit_test(test_multi_init_free),
+		cmocka_unit_test(test_check_state_simple),
+		cmocka_unit_test(test_check_state_timeout),
+		cmocka_unit_test(test_check_state_async_timeout),
+		cmocka_unit_test(test_free_with_pending),
+		cmocka_unit_test(test_timeout_cancel_failed),
+		cmocka_unit_test(test_async_timeout_cancel_failed),
+		cmocka_unit_test(test_orphan_checker_cleanup),
+		cmocka_unit_test(test_orphan_reset_cleanup),
+		cmocka_unit_test(test_orphan_unload_cleanup),
+		cmocka_unit_test(test_check_state_blksize),
+		cmocka_unit_test(test_check_state_async),
+		cmocka_unit_test(test_orphaned_aio_group),
+	};
+
+	return cmocka_run_group_tests(tests, NULL, NULL);
+}
+
+int main(void)
+{
+	int ret = 0;
+
+	conf.verbosity = 2;
+	ret += test_directio();
+	return ret;
+}
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 17/17] tests: make directio tests able to work on a real device
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (15 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 16/17] tests: add directio unit tests Benjamin Marzinski
@ 2020-02-05 18:58 ` Benjamin Marzinski
  2020-02-10 17:27   ` Martin Wilck
  2020-02-11  8:45 ` [PATCH v2 00/17] Multipath patch dump Martin Wilck
  17 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-05 18:58 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

There is now a file in tests called directio_test_dev. If the commented
out test device line is uncommented and set to a device, it can be used
to test the directio checker on that device, instead of faking the
device.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 tests/Makefile          |  16 +++++-
 tests/directio.c        | 114 ++++++++++++++++++++++++++++++++++++++--
 tests/directio_test_dev |   4 ++
 3 files changed, 130 insertions(+), 4 deletions(-)
 create mode 100644 tests/directio_test_dev

diff --git a/tests/Makefile b/tests/Makefile
index 275fdd7d..0003d778 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -9,10 +9,18 @@ TESTS := uevent parser util dmevents hwtable blacklist unaligned vpd pgpolicy \
 .SILENT: $(TESTS:%=%.o)
 .PRECIOUS: $(TESTS:%=%-test)
 
+DIO_TEST_DEV = $(shell sed -n -e 's/^[[:space:]]*DIO_TEST_DEV[[:space:]]*=[[:space:]]*\([^[:space:]\#]\+\).*/\1/p' < directio_test_dev)
+
 all:	$(TESTS:%=%.out)
 
+# test-specific compiler flags
+# XYZ-test_FLAGS: Additional compiler flags for this test
+ifneq ($(DIO_TEST_DEV),)
+directio-test_FLAGS := -DDIO_TEST_DEV=\"$(DIO_TEST_DEV)\"
+endif
+
 # test-specific linker flags
-# XYZ-test-TESTDEPS: test libraries containing __wrap_xyz functions
+# XYZ-test_TESTDEPS: test libraries containing __wrap_xyz functions
 # XYZ-test_OBJDEPS: object files from libraries to link in explicitly
 #    That may be necessary if functions called from the object file are wrapped
 #    (wrapping works only for symbols which are undefined after processing a
@@ -28,6 +36,12 @@ blacklist-test_OBJDEPS := ../libmultipath/blacklist.o
 blacklist-test_LIBDEPS := -ludev
 vpd-test_OBJDEPS :=  ../libmultipath/discovery.o
 vpd-test_LIBDEPS := -ludev -lpthread -ldl
+ifneq ($(DIO_TEST_DEV),)
+directio-test_LIBDEPS := -laio
+endif
+
+%.o: %.c
+	$(CC) $(CFLAGS) $($*-test_FLAGS) -c -o $@ $<
 
 lib/libchecktur.so:
 	mkdir lib
diff --git a/tests/directio.c b/tests/directio.c
index 5f067e24..236c514b 100644
--- a/tests/directio.c
+++ b/tests/directio.c
@@ -35,8 +35,14 @@ int ev_off = 0;
 struct timespec zero_timeout = {0};
 struct timespec full_timeout = { .tv_sec = -1 };
 
+int __real_ioctl(int fd, unsigned long request, void *argp);
+
 int __wrap_ioctl(int fd, unsigned long request, void *argp)
 {
+#ifdef DIO_TEST_DEV
+	mock_type(int);
+	return __real_ioctl(fd, request, argp);
+#else
 	int *blocksize = (int *)argp;
 
 	assert_int_equal(fd, test_fd);
@@ -44,57 +50,115 @@ int __wrap_ioctl(int fd, unsigned long request, void *argp)
 	assert_non_null(blocksize);
 	*blocksize = mock_type(int);
 	return 0;
+#endif
 }
 
+int __real_fcntl(int fd, int cmd, long arg);
+
 int __wrap_fcntl(int fd, int cmd, long arg)
 {
+#ifdef DIO_TEST_DEV
+	return __real_fcntl(fd, cmd, arg);
+#else
 	assert_int_equal(fd, test_fd);
 	assert_int_equal(cmd, F_GETFL);
 	return O_DIRECT;
+#endif
 }
 
+int __real___fxstat(int ver, int fd, struct stat *statbuf);
+
 int __wrap___fxstat(int ver, int fd, struct stat *statbuf)
 {
+#ifdef DIO_TEST_DEV
+	return __real___fxstat(ver, fd, statbuf);
+#else
 	assert_int_equal(fd, test_fd);
 	assert_non_null(statbuf);
 	memset(statbuf, 0, sizeof(struct stat));
 	return 0;
+#endif
 }
 
+int __real_io_setup(int maxevents, io_context_t *ctxp);
+
 int __wrap_io_setup(int maxevents, io_context_t *ctxp)
 {
 	ioctx_count++;
+#ifdef DIO_TEST_DEV
+	int ret = mock_type(int);
+	assert_int_equal(ret, __real_io_setup(maxevents, ctxp));
+	return ret;
+#else
 	return mock_type(int);
+#endif
 }
 
+int __real_io_destroy(io_context_t ctx);
+
 int __wrap_io_destroy(io_context_t ctx)
 {
 	ioctx_count--;
+#ifdef DIO_TEST_DEV
+	int ret = mock_type(int);
+	assert_int_equal(ret, __real_io_destroy(ctx));
+	return ret;
+#else
 	return mock_type(int);
+#endif
 }
 
+int __real_io_submit(io_context_t ctx, long nr, struct iocb *ios[]);
+
 int __wrap_io_submit(io_context_t ctx, long nr, struct iocb *ios[])
 {
+#ifdef DIO_TEST_DEV
+	struct timespec dev_delay = { .tv_nsec = 100000 };
+	int ret = mock_type(int);
+	assert_int_equal(ret, __real_io_submit(ctx, nr, ios));
+	nanosleep(&dev_delay, NULL);
+	return ret;
+#else
 	return mock_type(int);
+#endif
 }
 
+int __real_io_cancel(io_context_t ctx, struct iocb *iocb, struct io_event *evt);
+
 int __wrap_io_cancel(io_context_t ctx, struct iocb *iocb, struct io_event *evt)
 {
+#ifdef DIO_TEST_DEV
+	mock_type(int);
+	return __real_io_cancel(ctx, iocb, evt);
+#else
 	return mock_type(int);
+#endif
 }
 
+int __real_io_getevents(io_context_t ctx, long min_nr, long nr,
+			struct io_event *events, struct timespec *timeout);
+
 int __wrap_io_getevents(io_context_t ctx, long min_nr, long nr,
 			struct io_event *events, struct timespec *timeout)
 {
-	int i, nr_evs;
+	int nr_evs;
+#ifndef DIO_TEST_DEV
 	struct timespec *sleep_tmo;
+	int i;
 	struct io_event *evs;
+#endif
 
 	assert_non_null(timeout);
 	nr_evs = mock_type(int);
 	assert_true(nr_evs <= nr);
 	if (!nr_evs)
 		return 0;
+#ifdef DIO_TEST_DEV
+	mock_ptr_type(struct timespec *);
+	mock_ptr_type(struct io_event *);
+	assert_int_equal(nr_evs, __real_io_getevents(ctx, min_nr, nr_evs,
+						     events, timeout));
+#else
 	sleep_tmo = mock_ptr_type(struct timespec *);
 	if (sleep_tmo) {
 		if (sleep_tmo->tv_sec < 0)
@@ -109,6 +173,7 @@ int __wrap_io_getevents(io_context_t ctx, long min_nr, long nr,
 	evs = mock_ptr_type(struct io_event *);
 	for (i = 0; i < nr_evs; i++)
 		events[i] = evs[i];
+#endif
 	ev_off -= nr_evs;
 	return nr_evs;
 }
@@ -181,7 +246,10 @@ static void do_libcheck_init(struct checker *c, int blocksize,
 	assert_non_null(ct->req);
 	if (req)
 		*req = ct->req;
+#ifndef DIO_TEST_DEV
+	/* don't check fake blocksize on real devices */
 	assert_int_equal(ct->req->blksize, blocksize);
+#endif
 }
 
 static int is_checker_running(struct checker *c)
@@ -359,6 +427,11 @@ static void test_check_state_timeout(void **state)
 	will_return(__wrap_io_cancel, 0);
 	do_check_state(&c, 1, 30, PATH_DOWN);
 	check_aio_grp(aio_grp, 1, 0);
+#ifdef DIO_TEST_DEV
+	/* io_cancel will return negative value on timeout, so it happens again
+	 * when freeing the checker */
+	will_return(__wrap_io_cancel, 0);
+#endif
 	libcheck_free(&c);
 	do_libcheck_unload(1);
 }
@@ -382,6 +455,9 @@ static void test_check_state_async_timeout(void **state)
 	will_return(__wrap_io_cancel, 0);
 	do_check_state(&c, 0, 3, PATH_DOWN);
 	check_aio_grp(aio_grp, 1, 0);
+#ifdef DIO_TEST_DEV
+	will_return(__wrap_io_cancel, 0);
+#endif
 	libcheck_free(&c);
 	do_libcheck_unload(1);
 }
@@ -410,7 +486,11 @@ static void test_free_with_pending(void **state)
 	check_aio_grp(aio_grp, 1, 0);
         will_return(__wrap_io_cancel, 0);
         libcheck_free(&c[1]);
+#ifdef DIO_TEST_DEV
+	check_aio_grp(aio_grp, 1, 1); /* real cancel doesn't remove request */
+#else
         check_aio_grp(aio_grp, 0, 0);
+#endif
         do_libcheck_unload(1);
 }
 
@@ -475,7 +555,8 @@ static void test_timeout_cancel_failed(void **state)
 	will_return(__wrap_io_cancel, -1);
 	do_check_state(&c[0], 1, 30, PATH_DOWN);
 	assert_true(is_checker_running(&c[0]));
-	return_io_getevents_nr(NULL, 2, reqs, res);
+	return_io_getevents_nr(NULL, 1, &reqs[0], &res[0]);
+	return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
 	do_check_state(&c[1], 1, 30, PATH_UP);
 	do_check_state(&c[0], 1, 30, PATH_UP);
 	for (i = 0; i < 2; i++) {
@@ -508,8 +589,11 @@ static void test_async_timeout_cancel_failed(void **state)
 	return_io_getevents_none();
 	will_return(__wrap_io_cancel, -1);
 	do_check_state(&c[0], 0, 2, PATH_DOWN);
+#ifndef DIO_TEST_DEV
+	/* can't pick which even gets returned on real devices */
 	return_io_getevents_nr(NULL, 1, &reqs[1], &res[1]);
 	do_check_state(&c[1], 0, 2, PATH_UP);
+#endif
 	return_io_getevents_none();
 	will_return(__wrap_io_cancel, -1);
 	do_check_state(&c[0], 0, 2, PATH_DOWN);
@@ -625,7 +709,12 @@ static void test_check_state_blksize(void **state)
 	int blksize[] = {4096, 1024, 512};
 	struct async_req *reqs[3];
 	int res[] = {0,1,0};
+#ifdef DIO_TEST_DEV
+	/* can't pick event return state on real devices */
+	int chk_state[] = {PATH_UP, PATH_UP, PATH_UP};
+#else
 	int chk_state[] = {PATH_UP, PATH_DOWN, PATH_UP};
+#endif
 
 	do_libcheck_load();
 	for (i = 0; i < 3; i++)
@@ -671,6 +760,25 @@ static void test_check_state_async(void **state)
 	do_libcheck_unload(1);
 }
 
+static int setup(void **state)
+{
+#ifdef DIO_TEST_DEV
+	test_fd = open(DIO_TEST_DEV, O_RDONLY);
+	if (test_fd < 0)
+		fail_msg("cannot open %s: %m", DIO_TEST_DEV);
+#endif
+	return 0;
+}
+
+static int teardown(void **state)
+{
+#ifdef DIO_TEST_DEV
+	assert_true(test_fd > 0);
+	assert_int_equal(close(test_fd), 0);
+#endif
+	return 0;
+}
+
 int test_directio(void)
 {
 	const struct CMUnitTest tests[] = {
@@ -691,7 +799,7 @@ int test_directio(void)
 		cmocka_unit_test(test_orphaned_aio_group),
 	};
 
-	return cmocka_run_group_tests(tests, NULL, NULL);
+	return cmocka_run_group_tests(tests, setup, teardown);
 }
 
 int main(void)
diff --git a/tests/directio_test_dev b/tests/directio_test_dev
new file mode 100644
index 00000000..d64e6238
--- /dev/null
+++ b/tests/directio_test_dev
@@ -0,0 +1,4 @@
+# To run the directio tests on an actual block device, uncomment the line
+# starting with DIO_TES_DEV, and set it to the appropriate device
+
+# DIO_TEST_DEV=/dev/sdb
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/17] multipathd: warn when configuration has been changed.
  2020-02-05 18:58 ` [PATCH v2 01/17] multipathd: warn when configuration has been changed Benjamin Marzinski
@ 2020-02-10 14:22   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 14:22 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> It would be helpful if multipathd could log a message when
> multipath.conf or files in the config_dir have been written to, both
> so
> that it can be used to send a notification to users, and to help with
> determining after the fact if multipathd was running with an older
> config, when the logs of multipathd's behaviour don't match with the
> current multipath.conf.
> 
> To do this, the multipathd uxlsnr thread now sets up inotify watches
> on
> both /etc/multipath.conf and the config_dir to watch if the files are
> deleted or closed after being opened for writing.  In order to keep
> uxlsnr from polling repeatedly if the multipath.conf or the
> config_dir
> aren't present, it will only set up the watches once per reconfigure.
> However, since multipath.conf is far more likely to be replaced by a
> text editor than modified in place, if it gets removed, multipathd
> will
> immediately try to restart the watch on it (which will succeed if the
> file was simply replaced by a new copy).  This does mean that if
> multipath.conf or the config_dir are actually removed and then later
> re-added, multipathd won't log any more messages for changes until
> the
> next reconfigure. But that seems like a fair trade-off to avoid
> repeatedly polling for files that aren't likely to appear.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.h |   1 +
>  multipathd/main.c     |   1 +
>  multipathd/uxlsnr.c   | 138
> ++++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 134 insertions(+), 6 deletions(-)

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 02/17] multipathd: staticify uxlsnr variables/functions
  2020-02-05 18:58 ` [PATCH v2 02/17] multipathd: staticify uxlsnr variables/functions Benjamin Marzinski
@ 2020-02-10 14:24   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 14:24 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  multipathd/uxlsnr.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 

Reviewed-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 08/17] libmultipath: fix sgio_get_vpd looping
  2020-02-05 18:58 ` [PATCH v2 08/17] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
@ 2020-02-10 14:35   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 14:35 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> If do_inq returns a page with a length that is less than maxlen, but
> larger than DEFAULT_SGIO_LEN, this function will loop forever. Also
> if do_inq returns with a length equal to or greater than maxlen,
> sgio_get_vpd will exit immediately, even if it hasn't read the entire
> page.  Fix these issues, modify the tests to verify the new behavior.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c | 12 +++----
>  tests/vpd.c              | 77 ++++++++++++++++++++++++------------
> ----
>  2 files changed, 52 insertions(+), 37 deletions(-)
> 
> 
> diff --git a/tests/vpd.c b/tests/vpd.c
> index d9f80eaa..1e653ed4 100644
> --- a/tests/vpd.c
> +++ b/tests/vpd.c
> \
> @@ -518,10 +519,16 @@ static void test_vpd_eui_ ## len ## _ ##
> wlen(void **state)             \
> 	\
>  	n = create_vpd83(vt->vpdbuf, sizeof(vt->vpdbuf), test_id,	\
>  			 2, 0, len);					\
> +	if (sml) {							\
> +		/* overwrite the page side to DEFAULT_SGIO_LEN + 1 */	\
> +		put_unaligned_be16(255, vt->vpdbuf + 2);		\
> +		will_return(__wrap_ioctl, n);				\
> +		will_return(__wrap_ioctl, vt->vpdbuf);			\
> +	}								\
>  	will_return(__wrap_ioctl, n);					\
>  	will_return(__wrap_ioctl, vt->vpdbuf);				

Nitpick: 
1. "side" -> "size"
2. This looks like a missing "else" clause, even though it is not
(because get_vpd_sgio will make 2nd ioctl call). Perhaps add a comment
explaining that.

Anyway, that can be improved later, no need to resend the series again.
So:

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data
  2020-02-05 18:58 ` [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
@ 2020-02-10 14:49   ` Martin Wilck
  2020-02-10 20:49     ` Benjamin Marzinski
  2020-02-11  8:39   ` Martin Wilck
  1 sibling, 1 reply; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 14:49 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> This adds the wildcard 'g' for multipath and path formatted printing,
> which returns extra data from a device's vendor specific vpd
> page.  The
> specific vendor vpd page to use, and the vendor/product id to decode
> it
> can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It
> can
> be configured in the devices section of multipath.conf with the
> vpd_vendor parameter. Currently, the only devices that use this are
> HPE
> 3PAR arrays, to return the Volume Name.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.c      |  2 ++
>  libmultipath/config.h      |  1 +
>  libmultipath/dict.c        | 38 ++++++++++++++++++++++++++++++++++++
>  libmultipath/discovery.c   | 40
> +++++++++++++++++++++++++++++++++++++-
>  libmultipath/hwtable.c     |  1 +
>  libmultipath/print.c       | 25 ++++++++++++++++++++++++
>  libmultipath/propsel.c     | 18 +++++++++++++++++
>  libmultipath/propsel.h     |  1 +
>  libmultipath/structs.h     | 15 ++++++++++++++
>  multipath/multipath.conf.5 |  8 ++++++++
>  10 files changed, 148 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index 1c32a799..6c03ee5d 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -21,6 +21,7 @@
>  #define HOST_NAME_LEN		16
>  #define SLOT_NAME_SIZE		40
>  #define PRKEY_SIZE		19
> +#define VPD_DATA_SIZE		128
>  
>  #define SCSI_VENDOR_SIZE	9
>  #define SCSI_PRODUCT_SIZE	17
> @@ -221,6 +222,18 @@ enum all_tg_pt_states {
>  	ALL_TG_PT_ON = YNU_YES,
>  };
>  
> +enum vpd_vendor_ids {
> +	VPD_VP_UNDEF,
> +	VPD_VP_HP3PAR,
> +	VPD_VP_ARRAY_SIZE, /* This must remain the last entry */
> +};
> +
> +struct vpd_vendor_page {
> +	int pg;
> +	const char *name;
> +};
> +extern struct vpd_vendor_page vpd_vendor_pages[VPD_VP_ARRAY_SIZE];
> +
>  struct sg_id {
>  	int host_no;
>  	int channel;
> @@ -255,6 +268,7 @@ struct path {
>  	char rev[PATH_REV_SIZE];
>  	char serial[SERIAL_SIZE];
>  	char tgt_node_name[NODE_NAME_SIZE];
> +	char vpd_data[VPD_DATA_SIZE];


Hm, 128 more bytes per path for a rarely-used property... do we need to
store this in "struct path"? Can't we simply fetch that information
when someone requests it with the %g format wildcard? It doesn't matter
much today as "struct path" is really thick already, but I have hopes
to make it a bit leaner some day.

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 12/17] libmultipath: change failed path prio timeout
  2020-02-05 18:58 ` [PATCH v2 12/17] libmultipath: change failed path prio timeout Benjamin Marzinski
@ 2020-02-10 14:51   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 14:51 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> multipath will try to get the priority from a PATH_DOWN path, if the
> path doesn't currently have a valid priority. However, if the
> priority
> code needs to contact the device to get the priority, this is likely
> to
> fail for PATH_DOWN paths.  This code dates back to when multipathd
> could
> not easily reload device tables with failed paths, so getting the
> correct priority was important to have a correctly configured device.
> Now multipathd can simply reload the device to move the path to the
> correct pathgroup when the path comes back up.  Since there are a
> number
> of prioritizers that don't require talking to the device, multipath
> shouldn't completely skip attempting to get the priority of these
> paths,
> but it should set a small timeout, so that it isn't hanging in the
> case where it needs to contact a device through a failed path.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/discovery.c | 14 ++++++--------
>  libmultipath/prio.c      |  6 +++---
>  2 files changed, 9 insertions(+), 11 deletions(-)

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 15/17] libmultipath: make directio checker share io contexts
  2020-02-05 18:58 ` [PATCH v2 15/17] libmultipath: make directio checker share io contexts Benjamin Marzinski
@ 2020-02-10 17:08   ` Martin Wilck
  2020-02-10 23:15     ` Benjamin Marzinski
  2020-02-11  8:42   ` Martin Wilck
  1 sibling, 1 reply; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 17:08 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> On systems with a large number of cores (>500), io_destroy() can take
> tens to hundreds of milliseconds to complete, due to RCU
> synchronization. If there are a large number of paths using the directio
> checker on such a system, this can lead to multipath taking almost a
> minute to complete, with the vast majority of time taken up by
> io_destroy().
> 
> To solve this, the directio checker now allocates one aio context for
> every 1024 checkers. This reduces the io_destroy() delay to a thousandth
> of its previous level. However, this means that muliple checkers are
> sharing the same aio context, and must be able to handle getting results
> for other checkers.  Because only one checker is ever running at a
> time, this doesn't require any locking.  However, locking could be added
> in the future if necessary, to allow multiple checkers to run at the
> same time.
> 
> When checkers are freed, they usually no longer destroy the io context.
> Instead, they attempt to cancel any outstanding request. If that fails,
> they put the request on an orphan list, so that it can be freed by other
> checkers, once it has completed. IO contexts are only destroyed at three
> times, during reconfigure (to deal with the possibility that multipathd
> is holding more aio events than it needs to be, since there is a single
> limit for the whole system), when the checker class is unloaded, and
> in a corner case when checkers are freed. If an aio_group (which
> contains the aio context) is entirely full of orphaned requests, then
> no checker can use it. Since no checker is using it, there is no checker
> to clear out the orphaned requests. In this (likely rare) case, the
> last checker from that group to be freed and leave behind an orphaned
> request will call io_destroy() and remove the group.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/checkers/directio.c | 336 +++++++++++++++++++++++++------
>  multipath/multipath.conf.5       |   7 +-
>  2 files changed, 281 insertions(+), 62 deletions(-)

Although I concur now that your design is sound, I still have some
issues, see below.

> 
> diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
> index 1b00b775..740c68e5 100644
> --- a/libmultipath/checkers/directio.c
> +++ b/libmultipath/checkers/directio.c
> 
> +/* If an aio_group is completely full of orphans, then no checkers can
> + * use it, which means that no checkers can clear out the orphans. To
> + * avoid keeping the useless group around, simply remove remove the
> + * group */
> +static void
> +check_orphaned_group(struct aio_group *aio_grp)
> +{
> +	int count = 0;
> +	struct list_head *item;
> +
> +	if (aio_grp->holders < AIO_GROUP_SIZE)
> +		return;
> +	list_for_each(item, &aio_grp->orphans)
> +		count++;
> +	if (count >= AIO_GROUP_SIZE) {
> +		remove_aio_group(aio_grp);
> +		if (list_empty(&aio_grp_list))
> +			add_aio_group();

OK, but not beautiful. Can be improved later, I guess. In general, you
could delay allocation of an aio_group until it's actually needed (i.e.
when the first path is using it, in set_aio_group(), as you're already
doing it for 2nd and later groups).

> +	}
> +}
> +
> +int libcheck_load (void)
> +{
> +	if (add_aio_group() == NULL) {
> +		LOG(1, "libcheck_load failed: %s", strerror(errno));
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +void libcheck_unload (void)
> +{
> +	struct aio_group *aio_grp, *tmp;
> +
> +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node)
> +		remove_aio_group(aio_grp);
> +}

I have one concern here - this might cause delays during multipathd
shutdown, which we have struggled to eliminate in previous patches.
OTOH, according to what you wrote, with the current code the shutdown
delays will probably be higher, so this is actually an improvement.
We should take a mental note about the shutdown issue. Like with TUR,
avoiding hanging on shutdown is tricky if we consider possibly hanging
device I/O.

> +
> +int libcheck_reset (void)
> +{
> +	struct aio_group *aio_grp, *tmp, *reset_grp = NULL;
> +
> +	/* If a clean existing aio_group exists, use that. Otherwise add a
> +	 * new one */
> +	list_for_each_entry(aio_grp, &aio_grp_list, node) {
> +		if (aio_grp->holders == 0 &&
> +		    list_empty(&aio_grp->orphans)) {
> +			reset_grp = aio_grp;
> +			break;
> +		}
> +	}
> +	if (!reset_grp)
> +		reset_grp = add_aio_group();
> +	if (!reset_grp) {
> +		LOG(1, "checker reset failed");
> +		return 1;
> +	}
> +
> +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node) {
> +		if (aio_grp != reset_grp)
> +			remove_aio_group(aio_grp);
> +	}
> +	return 0;
> +}
>  
>  int libcheck_init (struct checker * c)
>  {
>  	unsigned long pgsize = getpagesize();
>  	struct directio_context * ct;
> +	struct async_req *req = NULL;
>  	long flags;
>  
>  	ct = malloc(sizeof(struct directio_context));
> @@ -56,26 +201,31 @@ int libcheck_init (struct checker * c)
>  		return 1;
>  	memset(ct, 0, sizeof(struct directio_context));
>  
> -	if (io_setup(1, &ct->ioctx) != 0) {
> -		condlog(1, "io_setup failed");
> -		free(ct);
> -		return 1;
> +	if (set_aio_group(ct) < 0)
> +		goto out;
> +
> +	req = malloc(sizeof(struct async_req));
> +	if (!req) {
> +		goto out;
>  	}
> +	memset(req, 0, sizeof(struct async_req));
> +	INIT_LIST_HEAD(&req->node);
>  
> -	if (ioctl(c->fd, BLKBSZGET, &ct->blksize) < 0) {
> +	if (ioctl(c->fd, BLKBSZGET, &req->blksize) < 0) {
>  		c->msgid = MSG_DIRECTIO_BLOCKSIZE;
> -		ct->blksize = 512;
> +		req->blksize = 512;

You didn't change this, but I wonder if this is really a safe default.
IIUC it's safe (although perhaps a bit slower) to read a multiple of
the logical block size, but reading less than the logical block size
might fail. Perhaps we should default to 4k?

>  	}
> -	if (ct->blksize > 4096) {
> +	if (req->blksize > 4096) {
>  		/*
>  		 * Sanity check for DASD; BSZGET is broken
>  		 */
> -		ct->blksize = 4096;
> +		req->blksize = 4096;
>  	}
> -	if (!ct->blksize)
> +	if (!req->blksize)
>  		goto out;
> -	ct->buf = (unsigned char *)malloc(ct->blksize + pgsize);
> -	if (!ct->buf)
> +
> +	req->buf = (unsigned char *)malloc(req->blksize + pgsize);

Why don't you simply use posix_memalign()?

> +	if (!req->buf)
>  		goto out;
>  
>  	flags = fcntl(c->fd, F_GETFL);
> @@ -88,17 +238,22 @@ int libcheck_init (struct checker * c)
>  		ct->reset_flags = 1;
>  	}
>  
> -	ct->ptr = (unsigned char *) (((unsigned long)ct->buf + pgsize - 1) &
> +	req->ptr = (unsigned char *) (((unsigned long)req->buf + pgsize - 1) &
>  		  (~(pgsize - 1)));

See above.

>  
>  	/* Successfully initialized, return the context. */
> +	ct->req = req;
>  	c->context = (void *) ct;
>  	return 0;
>  
>  out:
> -	if (ct->buf)
> -		free(ct->buf);
> -	io_destroy(ct->ioctx);
> +	if (req) {
> +		if (req->buf)
> +			free(req->buf);
> +		free(req);
> +	}
> +	if (ct->aio_grp)
> +		ct->aio_grp->holders--;
>  	free(ct);
>  	return 1;
>  }
> @@ -106,6 +261,7 @@ out:
>  void libcheck_free (struct checker * c)
>  {
>  	struct directio_context * ct = (struct directio_context *)c->context;
> +	struct io_event event;
>  	long flags;
>  
>  	if (!ct)
> @@ -121,20 +277,72 @@ void libcheck_free (struct checker * c)
>  		}
>  	}
>  
> -	if (ct->buf)
> -		free(ct->buf);
> -	io_destroy(ct->ioctx);
> +	if (ct->running &&
> +	    (ct->req->state != PATH_PENDING ||
> +	     io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event) == 0))
> +		ct->running = 0;
> +	if (!ct->running) {
> +		free(ct->req->buf);
> +		free(ct->req);
> +		ct->aio_grp->holders--;
> +	} else {
> +		ct->req->state = PATH_REMOVED;
> +		list_add(&ct->req->node, &ct->aio_grp->orphans);
> +		check_orphaned_group(ct->aio_grp);
> +	}
> +
>  	free(ct);
> +	c->context = NULL;
> +}
> +
> +static int
> +get_events(struct aio_group *aio_grp, struct timespec *timeout)
> +{
> +	struct io_event events[128];
> +	int i, nr, got_events = 0;
> +	struct timespec zero_timeout = {0};
> +	struct timespec *timep = (timeout)? timeout : &zero_timeout;

This isn't wrong, but the semantics of the "timeout" parameter are a
bit confusing, as io_getevents() would interpret a NULL timeout as
"forever", and get_events is mostly a wrapper around io_getevents().

> +
> +	do {
> +		errno = 0;
> +		nr = io_getevents(aio_grp->ioctx, 1, 128, events, timep);
> +		got_events |= (nr > 0);
> +
> +		for (i = 0; i < nr; i++) {
> +			struct async_req *req = container_of(events[i].obj, struct async_req, io);
> +
> +			LOG(3, "io finished %lu/%lu", events[i].res,
> +			    events[i].res2);
> +
> +			/* got an orphaned request */
> +			if (req->state == PATH_REMOVED) {
> +				list_del(&req->node);
> +				free(req->buf);
> +				free(req);
> +				aio_grp->holders--;
> +			} else
> +				req->state = (events[i].res == req->blksize) ?
> +					      PATH_UP : PATH_DOWN;
> +		}
> +		timep = &zero_timeout;
> +	} while (nr == 128); /* assume there are more events and try again */
> +
> +	if (nr < 0)
> +		LOG(3, "async io getevents returned %i (errno=%s)",
> +		    nr, strerror(errno));
> +
> +	return got_events;
>  }
>  static int
>  check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
>  {
>  	struct timespec	timeout = { .tv_nsec = 5 };

What's the purpose of these 5ns? Unless the device in question is an
NVDIMM, I'd say 5ns is practically equivalent to 0.


> -	struct io_event event;
>  	struct stat	sb;
> -	int		rc = PATH_UNCHECKED;
> +	int		rc;
>  	long		r;
> +	struct timespec currtime, endtime;
> +	struct timespec *timep = &timeout;
>  
>  	if (fstat(fd, &sb) == 0) {
>  		LOG(4, "called for %x", (unsigned) sb.st_rdev);
> @@ -145,50 +353,60 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
>  		timeout.tv_nsec = 0;
>  	}
>  
> -	if (!ct->running) {
> -		struct iocb *ios[1] = { &ct->io };
> +	if (ct->running) {
> +		if (ct->req->state != PATH_PENDING) {
> +			ct->running = 0;
> +			return ct->req->state;
> +		}
> +	} else {
> +		struct iocb *ios[1] = { &ct->req->io };
>  
>  		LOG(3, "starting new request");
> -		memset(&ct->io, 0, sizeof(struct iocb));
> -		io_prep_pread(&ct->io, fd, ct->ptr, ct->blksize, 0);
> -		if (io_submit(ct->ioctx, 1, ios) != 1) {
> +		memset(&ct->req->io, 0, sizeof(struct iocb));
> +		io_prep_pread(&ct->req->io, fd, ct->req->ptr,
> +			      ct->req->blksize, 0);
> +		ct->req->state = PATH_PENDING;
> +		if (io_submit(ct->aio_grp->ioctx, 1, ios) != 1) {
>  			LOG(3, "io_submit error %i", errno);
>  			return PATH_UNCHECKED;
>  		}
>  	}
>  	ct->running++;

This looks to me as if in the case (ct->running && ct->req->state ==
PATH_PENDING), ct->running could become > 1, even though you don't
start a new IO. Is that intended? I don't think it matters because you
never decrement, but it looks weird.

>  
> -	errno = 0;
> -	r = io_getevents(ct->ioctx, 1L, 1L, &event, &timeout);
> +	get_monotonic_time(&endtime);
> +	endtime.tv_sec += timeout.tv_sec;
> +	endtime.tv_nsec += timeout.tv_nsec;
> +	normalize_timespec(&endtime);
> +	while(1) {
> +		r = get_events(ct->aio_grp, timep);
>  
> -	if (r < 0 ) {
> -		LOG(3, "async io getevents returned %li (errno=%s)", r,
> -		    strerror(errno));
> -		ct->running = 0;
> -		rc = PATH_UNCHECKED;
> -	} else if (r < 1L) {
> -		if (ct->running > timeout_secs || sync) {
> -			struct iocb *ios[1] = { &ct->io };
> -
> -			LOG(3, "abort check on timeout");
> -			r = io_cancel(ct->ioctx, ios[0], &event);
> -			/*
> -			 * Only reset ct->running if we really
> -			 * could abort the pending I/O
> -			 */
> -			if (r)
> -				LOG(3, "io_cancel error %i", errno);
> -			else
> -				ct->running = 0;
> -			rc = PATH_DOWN;
> -		} else {
> -			LOG(3, "async io pending");
> -			rc = PATH_PENDING;
> -		}
> +		if (ct->req->state != PATH_PENDING) {
> +			ct->running = 0;
> +			return ct->req->state;
> +		} else if (r == 0 || !timep)
> +			break;
> +
> +		get_monotonic_time(&currtime);
> +		timespecsub(&endtime, &currtime, &timeout);
> +		if (timeout.tv_sec < 0)
> +			timep = NULL;

See comment for get_events() above. Why don't you simply do this?

    timeout.tv_sec = timeout.tv_nsec = 0;


> +	}
> +	if (ct->running > timeout_secs || sync) {
> +		struct io_event event;
> +
> +		LOG(3, "abort check on timeout");
> +
> +		r = io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
> +		/*
> +		 * Only reset ct->running if we really
> +		 * could abort the pending I/O

... which will never happen ... but never mind.

> +		 */
> +		if (!r)
> +			ct->running = 0;
> +		rc = PATH_DOWN;
>  	} else {
> -		LOG(3, "io finished %lu/%lu", event.res, event.res2);
> -		ct->running = 0;
> -		rc = (event.res == ct->blksize) ? PATH_UP : PATH_DOWN;
> +		LOG(3, "async io pending");
> +		rc = PATH_PENDING;
>  	}
>  
>  	return rc;
> diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
> index dc103fd8..05a5e8ff 100644
> --- a/multipath/multipath.conf.5
> +++ b/multipath/multipath.conf.5
> @@ -494,9 +494,10 @@ Check the path state for LSI/Engenio/NetApp RDAC class as NetApp SANtricity E/EF
>  Series, and OEM arrays from IBM DELL SGI STK and SUN.
>  .TP
>  .I directio
> -(Deprecated) Read the first sector with direct I/O. This checker is being
> -deprecated, it could cause spurious path failures under high load.
> -Please use \fItur\fR instead.
> +(Deprecated) Read the first sector with direct I/O. If you have a large number
> +of paths, or many AIO users on a system, you may need to use sysctl to
> +increase fs.aio-max-nr. This checker is being deprecated, it could cause
> +spurious path failures under high load. Please use \fItur\fR instead.
>  .TP
>  .I cciss_tur
>  (Hardware-dependent)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 16/17] tests: add directio unit tests
  2020-02-05 18:58 ` [PATCH v2 16/17] tests: add directio unit tests Benjamin Marzinski
@ 2020-02-10 17:23   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 17:23 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  tests/Makefile   |   3 +-
>  tests/directio.c | 704
> +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 706 insertions(+), 1 deletion(-)
>  create mode 100644 tests/directio.c
> 

Acked-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 17/17] tests: make directio tests able to work on a real device
  2020-02-05 18:58 ` [PATCH v2 17/17] tests: make directio tests able to work on a real device Benjamin Marzinski
@ 2020-02-10 17:27   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-10 17:27 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> There is now a file in tests called directio_test_dev. If the
> commented
> out test device line is uncommented and set to a device, it can be
> used
> to test the directio checker on that device, instead of faking the
> device.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  tests/Makefile          |  16 +++++-
>  tests/directio.c        | 114
> ++++++++++++++++++++++++++++++++++++++--
>  tests/directio_test_dev |   4 ++
>  3 files changed, 130 insertions(+), 4 deletions(-)
>  create mode 100644 tests/directio_test_dev
> 
> @@ -359,6 +427,11 @@ static void test_check_state_timeout(void
> **state)
>  	will_return(__wrap_io_cancel, 0);
>  	do_check_state(&c, 1, 30, PATH_DOWN);
>  	check_aio_grp(aio_grp, 1, 0);
> +#ifdef DIO_TEST_DEV
> +	/* io_cancel will return negative value on timeout, so it
> happens again
> +	 * when freeing the checker */
> +	will_return(__wrap_io_cancel, 0);
> +#endif

I found this a bit confusing after the "will_return(__wrap_io_cancel,
0)" above. (The reader needs to realize that with DIO_TEST_DEV,
__wrap_io_cancel() discards the value passed by will_return()).

But that's just a nit.

Acked-by: Martin Wilck <mwilck@suse.com>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data
  2020-02-10 14:49   ` Martin Wilck
@ 2020-02-10 20:49     ` Benjamin Marzinski
  2020-02-11  8:09       ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-10 20:49 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Mon, Feb 10, 2020 at 02:49:38PM +0000, Martin Wilck wrote:
> On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> > This adds the wildcard 'g' for multipath and path formatted printing,
> > which returns extra data from a device's vendor specific vpd
> > page.  The
> > specific vendor vpd page to use, and the vendor/product id to decode
> > it
> > can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It
> > can
> > be configured in the devices section of multipath.conf with the
> > vpd_vendor parameter. Currently, the only devices that use this are
> > HPE
> > 3PAR arrays, to return the Volume Name.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/config.c      |  2 ++
> >  libmultipath/config.h      |  1 +
> >  libmultipath/dict.c        | 38 ++++++++++++++++++++++++++++++++++++
> >  libmultipath/discovery.c   | 40
> > +++++++++++++++++++++++++++++++++++++-
> >  libmultipath/hwtable.c     |  1 +
> >  libmultipath/print.c       | 25 ++++++++++++++++++++++++
> >  libmultipath/propsel.c     | 18 +++++++++++++++++
> >  libmultipath/propsel.h     |  1 +
> >  libmultipath/structs.h     | 15 ++++++++++++++
> >  multipath/multipath.conf.5 |  8 ++++++++
> >  10 files changed, 148 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> > index 1c32a799..6c03ee5d 100644
> > --- a/libmultipath/structs.h
> > +++ b/libmultipath/structs.h
> > @@ -21,6 +21,7 @@
> >  #define HOST_NAME_LEN		16
> >  #define SLOT_NAME_SIZE		40
> >  #define PRKEY_SIZE		19
> > +#define VPD_DATA_SIZE		128
> >  
> >  #define SCSI_VENDOR_SIZE	9
> >  #define SCSI_PRODUCT_SIZE	17
> > @@ -221,6 +222,18 @@ enum all_tg_pt_states {
> >  	ALL_TG_PT_ON = YNU_YES,
> >  };
> >  
> > +enum vpd_vendor_ids {
> > +	VPD_VP_UNDEF,
> > +	VPD_VP_HP3PAR,
> > +	VPD_VP_ARRAY_SIZE, /* This must remain the last entry */
> > +};
> > +
> > +struct vpd_vendor_page {
> > +	int pg;
> > +	const char *name;
> > +};
> > +extern struct vpd_vendor_page vpd_vendor_pages[VPD_VP_ARRAY_SIZE];
> > +
> >  struct sg_id {
> >  	int host_no;
> >  	int channel;
> > @@ -255,6 +268,7 @@ struct path {
> >  	char rev[PATH_REV_SIZE];
> >  	char serial[SERIAL_SIZE];
> >  	char tgt_node_name[NODE_NAME_SIZE];
> > +	char vpd_data[VPD_DATA_SIZE];
> 
> 
> Hm, 128 more bytes per path for a rarely-used property... do we need to
> store this in "struct path"? Can't we simply fetch that information
> when someone requests it with the %g format wildcard? It doesn't matter
> much today as "struct path" is really thick already, but I have hopes
> to make it a bit leaner some day.

Well, the thought was that this information (which should help map the
path to a physical array) would most often be wanted when things were
going badly, and you might not be able to access the device.

-Ben

> Regards
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 15/17] libmultipath: make directio checker share io contexts
  2020-02-10 17:08   ` Martin Wilck
@ 2020-02-10 23:15     ` Benjamin Marzinski
  2020-02-11  8:38       ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-10 23:15 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Mon, Feb 10, 2020 at 06:08:14PM +0100, Martin Wilck wrote:
> On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> > On systems with a large number of cores (>500), io_destroy() can take
> > tens to hundreds of milliseconds to complete, due to RCU
> > synchronization. If there are a large number of paths using the directio
> > checker on such a system, this can lead to multipath taking almost a
> > minute to complete, with the vast majority of time taken up by
> > io_destroy().
> > 
> > To solve this, the directio checker now allocates one aio context for
> > every 1024 checkers. This reduces the io_destroy() delay to a thousandth
> > of its previous level. However, this means that muliple checkers are
> > sharing the same aio context, and must be able to handle getting results
> > for other checkers.  Because only one checker is ever running at a
> > time, this doesn't require any locking.  However, locking could be added
> > in the future if necessary, to allow multiple checkers to run at the
> > same time.
> > 
> > When checkers are freed, they usually no longer destroy the io context.
> > Instead, they attempt to cancel any outstanding request. If that fails,
> > they put the request on an orphan list, so that it can be freed by other
> > checkers, once it has completed. IO contexts are only destroyed at three
> > times, during reconfigure (to deal with the possibility that multipathd
> > is holding more aio events than it needs to be, since there is a single
> > limit for the whole system), when the checker class is unloaded, and
> > in a corner case when checkers are freed. If an aio_group (which
> > contains the aio context) is entirely full of orphaned requests, then
> > no checker can use it. Since no checker is using it, there is no checker
> > to clear out the orphaned requests. In this (likely rare) case, the
> > last checker from that group to be freed and leave behind an orphaned
> > request will call io_destroy() and remove the group.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/checkers/directio.c | 336 +++++++++++++++++++++++++------
> >  multipath/multipath.conf.5       |   7 +-
> >  2 files changed, 281 insertions(+), 62 deletions(-)
> 
> Although I concur now that your design is sound, I still have some
> issues, see below.
> 
> > 
> > diff --git a/libmultipath/checkers/directio.c b/libmultipath/checkers/directio.c
> > index 1b00b775..740c68e5 100644
> > --- a/libmultipath/checkers/directio.c
> > +++ b/libmultipath/checkers/directio.c
> > 
> > +/* If an aio_group is completely full of orphans, then no checkers can
> > + * use it, which means that no checkers can clear out the orphans. To
> > + * avoid keeping the useless group around, simply remove remove the
> > + * group */
> > +static void
> > +check_orphaned_group(struct aio_group *aio_grp)
> > +{
> > +	int count = 0;
> > +	struct list_head *item;
> > +
> > +	if (aio_grp->holders < AIO_GROUP_SIZE)
> > +		return;
> > +	list_for_each(item, &aio_grp->orphans)
> > +		count++;
> > +	if (count >= AIO_GROUP_SIZE) {
> > +		remove_aio_group(aio_grp);
> > +		if (list_empty(&aio_grp_list))
> > +			add_aio_group();
> 
> OK, but not beautiful. Can be improved later, I guess. In general, you
> could delay allocation of an aio_group until it's actually needed (i.e.
> when the first path is using it, in set_aio_group(), as you're already
> doing it for 2nd and later groups).

Sure. I figured it would be more consistent to always have a group after
start-up, since even on reset we keep a group around, to avoid the pain
of removing and then re-adding it. But there really isn't any need to do
it that way, so if you would rather it worked the other way, I can send
another patch to change that.

> 
> > +	}
> > +}
> > +
> > +int libcheck_load (void)
> > +{
> > +	if (add_aio_group() == NULL) {
> > +		LOG(1, "libcheck_load failed: %s", strerror(errno));
> > +		return 1;
> > +	}
> > +	return 0;
> > +}
> > +
> > +void libcheck_unload (void)
> > +{
> > +	struct aio_group *aio_grp, *tmp;
> > +
> > +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node)
> > +		remove_aio_group(aio_grp);
> > +}
> 
> I have one concern here - this might cause delays during multipathd
> shutdown, which we have struggled to eliminate in previous patches.
> OTOH, according to what you wrote, with the current code the shutdown
> delays will probably be higher, so this is actually an improvement.
> We should take a mental note about the shutdown issue. Like with TUR,
> avoiding hanging on shutdown is tricky if we consider possibly hanging
> device I/O.

Yeah. This should be faster than calling io_destroy on each path, when
shutting down, especially when you have a large number of CPUs and
devices.

> > +
> > +int libcheck_reset (void)
> > +{
> > +	struct aio_group *aio_grp, *tmp, *reset_grp = NULL;
> > +
> > +	/* If a clean existing aio_group exists, use that. Otherwise add a
> > +	 * new one */
> > +	list_for_each_entry(aio_grp, &aio_grp_list, node) {
> > +		if (aio_grp->holders == 0 &&
> > +		    list_empty(&aio_grp->orphans)) {
> > +			reset_grp = aio_grp;
> > +			break;
> > +		}
> > +	}
> > +	if (!reset_grp)
> > +		reset_grp = add_aio_group();
> > +	if (!reset_grp) {
> > +		LOG(1, "checker reset failed");
> > +		return 1;
> > +	}
> > +
> > +	list_for_each_entry_safe(aio_grp, tmp, &aio_grp_list, node) {
> > +		if (aio_grp != reset_grp)
> > +			remove_aio_group(aio_grp);
> > +	}
> > +	return 0;
> > +}
> >  
> >  int libcheck_init (struct checker * c)
> >  {
> >  	unsigned long pgsize = getpagesize();
> >  	struct directio_context * ct;
> > +	struct async_req *req = NULL;
> >  	long flags;
> >  
> >  	ct = malloc(sizeof(struct directio_context));
> > @@ -56,26 +201,31 @@ int libcheck_init (struct checker * c)
> >  		return 1;
> >  	memset(ct, 0, sizeof(struct directio_context));
> >  
> > -	if (io_setup(1, &ct->ioctx) != 0) {
> > -		condlog(1, "io_setup failed");
> > -		free(ct);
> > -		return 1;
> > +	if (set_aio_group(ct) < 0)
> > +		goto out;
> > +
> > +	req = malloc(sizeof(struct async_req));
> > +	if (!req) {
> > +		goto out;
> >  	}
> > +	memset(req, 0, sizeof(struct async_req));
> > +	INIT_LIST_HEAD(&req->node);
> >  
> > -	if (ioctl(c->fd, BLKBSZGET, &ct->blksize) < 0) {
> > +	if (ioctl(c->fd, BLKBSZGET, &req->blksize) < 0) {
> >  		c->msgid = MSG_DIRECTIO_BLOCKSIZE;
> > -		ct->blksize = 512;
> > +		req->blksize = 512;
> 
> You didn't change this, but I wonder if this is really a safe default.
> IIUC it's safe (although perhaps a bit slower) to read a multiple of
> the logical block size, but reading less than the logical block size
> might fail. Perhaps we should default to 4k?

Sure. I can add that to the follow-up patch.

> >  	}
> > -	if (ct->blksize > 4096) {
> > +	if (req->blksize > 4096) {
> >  		/*
> >  		 * Sanity check for DASD; BSZGET is broken
> >  		 */
> > -		ct->blksize = 4096;
> > +		req->blksize = 4096;
> >  	}
> > -	if (!ct->blksize)
> > +	if (!req->blksize)
> >  		goto out;
> > -	ct->buf = (unsigned char *)malloc(ct->blksize + pgsize);
> > -	if (!ct->buf)
> > +
> > +	req->buf = (unsigned char *)malloc(req->blksize + pgsize);
> 
> Why don't you simply use posix_memalign()?

I just reused the code that was already there to allocate the buffer.
Again, follow up patch.

> > +	if (!req->buf)
> >  		goto out;
> >  
> >  	flags = fcntl(c->fd, F_GETFL);
> > @@ -88,17 +238,22 @@ int libcheck_init (struct checker * c)
> >  		ct->reset_flags = 1;
> >  	}
> >  
> > -	ct->ptr = (unsigned char *) (((unsigned long)ct->buf + pgsize - 1) &
> > +	req->ptr = (unsigned char *) (((unsigned long)req->buf + pgsize - 1) &
> >  		  (~(pgsize - 1)));
> 
> See above.
> 
> >  
> >  	/* Successfully initialized, return the context. */
> > +	ct->req = req;
> >  	c->context = (void *) ct;
> >  	return 0;
> >  
> >  out:
> > -	if (ct->buf)
> > -		free(ct->buf);
> > -	io_destroy(ct->ioctx);
> > +	if (req) {
> > +		if (req->buf)
> > +			free(req->buf);
> > +		free(req);
> > +	}
> > +	if (ct->aio_grp)
> > +		ct->aio_grp->holders--;
> >  	free(ct);
> >  	return 1;
> >  }
> > @@ -106,6 +261,7 @@ out:
> >  void libcheck_free (struct checker * c)
> >  {
> >  	struct directio_context * ct = (struct directio_context *)c->context;
> > +	struct io_event event;
> >  	long flags;
> >  
> >  	if (!ct)
> > @@ -121,20 +277,72 @@ void libcheck_free (struct checker * c)
> >  		}
> >  	}
> >  
> > -	if (ct->buf)
> > -		free(ct->buf);
> > -	io_destroy(ct->ioctx);
> > +	if (ct->running &&
> > +	    (ct->req->state != PATH_PENDING ||
> > +	     io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event) == 0))
> > +		ct->running = 0;
> > +	if (!ct->running) {
> > +		free(ct->req->buf);
> > +		free(ct->req);
> > +		ct->aio_grp->holders--;
> > +	} else {
> > +		ct->req->state = PATH_REMOVED;
> > +		list_add(&ct->req->node, &ct->aio_grp->orphans);
> > +		check_orphaned_group(ct->aio_grp);
> > +	}
> > +
> >  	free(ct);
> > +	c->context = NULL;
> > +}
> > +
> > +static int
> > +get_events(struct aio_group *aio_grp, struct timespec *timeout)
> > +{
> > +	struct io_event events[128];
> > +	int i, nr, got_events = 0;
> > +	struct timespec zero_timeout = {0};
> > +	struct timespec *timep = (timeout)? timeout : &zero_timeout;
> 
> This isn't wrong, but the semantics of the "timeout" parameter are a
> bit confusing, as io_getevents() would interpret a NULL timeout as
> "forever", and get_events is mostly a wrapper around io_getevents().

Sure. I can change that.

> > +
> > +	do {
> > +		errno = 0;
> > +		nr = io_getevents(aio_grp->ioctx, 1, 128, events, timep);
> > +		got_events |= (nr > 0);
> > +
> > +		for (i = 0; i < nr; i++) {
> > +			struct async_req *req = container_of(events[i].obj, struct async_req, io);
> > +
> > +			LOG(3, "io finished %lu/%lu", events[i].res,
> > +			    events[i].res2);
> > +
> > +			/* got an orphaned request */
> > +			if (req->state == PATH_REMOVED) {
> > +				list_del(&req->node);
> > +				free(req->buf);
> > +				free(req);
> > +				aio_grp->holders--;
> > +			} else
> > +				req->state = (events[i].res == req->blksize) ?
> > +					      PATH_UP : PATH_DOWN;
> > +		}
> > +		timep = &zero_timeout;
> > +	} while (nr == 128); /* assume there are more events and try again */
> > +
> > +	if (nr < 0)
> > +		LOG(3, "async io getevents returned %i (errno=%s)",
> > +		    nr, strerror(errno));
> > +
> > +	return got_events;
> >  }
> >  static int
> >  check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
> >  {
> >  	struct timespec	timeout = { .tv_nsec = 5 };
> 
> What's the purpose of these 5ns? Unless the device in question is an
> NVDIMM, I'd say 5ns is practically equivalent to 0.

again, I just kept this the same to keep the checker working the same as
it did before, but yes, this basically means that checker will basically
always return PATH_PENDING on the call where it issues the IO for NAS
storage.  I'm fine with changing this to 1 ms. 

> 
> > -	struct io_event event;
> >  	struct stat	sb;
> > -	int		rc = PATH_UNCHECKED;
> > +	int		rc;
> >  	long		r;
> > +	struct timespec currtime, endtime;
> > +	struct timespec *timep = &timeout;
> >  
> >  	if (fstat(fd, &sb) == 0) {
> >  		LOG(4, "called for %x", (unsigned) sb.st_rdev);
> > @@ -145,50 +353,60 @@ check_state(int fd, struct directio_context *ct, int sync, int timeout_secs)
> >  		timeout.tv_nsec = 0;
> >  	}
> >  
> > -	if (!ct->running) {
> > -		struct iocb *ios[1] = { &ct->io };
> > +	if (ct->running) {
> > +		if (ct->req->state != PATH_PENDING) {
> > +			ct->running = 0;
> > +			return ct->req->state;
> > +		}
> > +	} else {
> > +		struct iocb *ios[1] = { &ct->req->io };
> >  
> >  		LOG(3, "starting new request");
> > -		memset(&ct->io, 0, sizeof(struct iocb));
> > -		io_prep_pread(&ct->io, fd, ct->ptr, ct->blksize, 0);
> > -		if (io_submit(ct->ioctx, 1, ios) != 1) {
> > +		memset(&ct->req->io, 0, sizeof(struct iocb));
> > +		io_prep_pread(&ct->req->io, fd, ct->req->ptr,
> > +			      ct->req->blksize, 0);
> > +		ct->req->state = PATH_PENDING;
> > +		if (io_submit(ct->aio_grp->ioctx, 1, ios) != 1) {
> >  			LOG(3, "io_submit error %i", errno);
> >  			return PATH_UNCHECKED;
> >  		}
> >  	}
> >  	ct->running++;
> 
> This looks to me as if in the case (ct->running && ct->req->state ==
> PATH_PENDING), ct->running could become > 1, even though you don't
> start a new IO. Is that intended? I don't think it matters because you
> never decrement, but it looks weird.

That's necessary for how the checker currently works. In async mode
checker doesn't actually wait for a specific number of seconds (and
didn't before my patch). It assumes 1 sec call times for pending paths,
and times out after ct->running > timeout_secs. That's why the unit
tests can get away with simply calling the checker repeatedly, without
waiting, to make it timeout. But I suppose that wait_for_pending_paths()
will also be making the checker time out quicker, so this should
probably be changed. However, since check_path won't set a paths state
to PATH_PENDING if it wasn't already, it's not really a very likely
issue to occur.
 
> >  
> > -	errno = 0;
> > -	r = io_getevents(ct->ioctx, 1L, 1L, &event, &timeout);
> > +	get_monotonic_time(&endtime);
> > +	endtime.tv_sec += timeout.tv_sec;
> > +	endtime.tv_nsec += timeout.tv_nsec;
> > +	normalize_timespec(&endtime);
> > +	while(1) {
> > +		r = get_events(ct->aio_grp, timep);
> >  
> > -	if (r < 0 ) {
> > -		LOG(3, "async io getevents returned %li (errno=%s)", r,
> > -		    strerror(errno));
> > -		ct->running = 0;
> > -		rc = PATH_UNCHECKED;
> > -	} else if (r < 1L) {
> > -		if (ct->running > timeout_secs || sync) {
> > -			struct iocb *ios[1] = { &ct->io };
> > -
> > -			LOG(3, "abort check on timeout");
> > -			r = io_cancel(ct->ioctx, ios[0], &event);
> > -			/*
> > -			 * Only reset ct->running if we really
> > -			 * could abort the pending I/O
> > -			 */
> > -			if (r)
> > -				LOG(3, "io_cancel error %i", errno);
> > -			else
> > -				ct->running = 0;
> > -			rc = PATH_DOWN;
> > -		} else {
> > -			LOG(3, "async io pending");
> > -			rc = PATH_PENDING;
> > -		}
> > +		if (ct->req->state != PATH_PENDING) {
> > +			ct->running = 0;
> > +			return ct->req->state;
> > +		} else if (r == 0 || !timep)
> > +			break;
> > +
> > +		get_monotonic_time(&currtime);
> > +		timespecsub(&endtime, &currtime, &timeout);
> > +		if (timeout.tv_sec < 0)
> > +			timep = NULL;
> 
> See comment for get_events() above. Why don't you simply do this?
> 
>     timeout.tv_sec = timeout.tv_nsec = 0;
> 

Sure.

So, If I will post a seperate patch that resolves these issues, can this
one have an ack?

-Ben

> 
> > +	}
> > +	if (ct->running > timeout_secs || sync) {
> > +		struct io_event event;
> > +
> > +		LOG(3, "abort check on timeout");
> > +
> > +		r = io_cancel(ct->aio_grp->ioctx, &ct->req->io, &event);
> > +		/*
> > +		 * Only reset ct->running if we really
> > +		 * could abort the pending I/O
> 
> ... which will never happen ... but never mind.
> 
> > +		 */
> > +		if (!r)
> > +			ct->running = 0;
> > +		rc = PATH_DOWN;
> >  	} else {
> > -		LOG(3, "io finished %lu/%lu", event.res, event.res2);
> > -		ct->running = 0;
> > -		rc = (event.res == ct->blksize) ? PATH_UP : PATH_DOWN;
> > +		LOG(3, "async io pending");
> > +		rc = PATH_PENDING;
> >  	}
> >  
> >  	return rc;
> > diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
> > index dc103fd8..05a5e8ff 100644
> > --- a/multipath/multipath.conf.5
> > +++ b/multipath/multipath.conf.5
> > @@ -494,9 +494,10 @@ Check the path state for LSI/Engenio/NetApp RDAC class as NetApp SANtricity E/EF
> >  Series, and OEM arrays from IBM DELL SGI STK and SUN.
> >  .TP
> >  .I directio
> > -(Deprecated) Read the first sector with direct I/O. This checker is being
> > -deprecated, it could cause spurious path failures under high load.
> > -Please use \fItur\fR instead.
> > +(Deprecated) Read the first sector with direct I/O. If you have a large number
> > +of paths, or many AIO users on a system, you may need to use sysctl to
> > +increase fs.aio-max-nr. This checker is being deprecated, it could cause
> > +spurious path failures under high load. Please use \fItur\fR instead.
> >  .TP
> >  .I cciss_tur
> >  (Hardware-dependent)
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data
  2020-02-10 20:49     ` Benjamin Marzinski
@ 2020-02-11  8:09       ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-11  8:09 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Mon, 2020-02-10 at 14:49 -0600, Benjamin Marzinski wrote:
> On Mon, Feb 10, 2020 at 02:49:38PM +0000, Martin Wilck wrote:
> > On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> > > This adds the wildcard 'g' for multipath and path formatted
> > > printing,
> > > which returns extra data from a device's vendor specific vpd
> > > page.  The
> > > specific vendor vpd page to use, and the vendor/product id to
> > > decode
> > > it
> > > can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id.
> > > It
> > > can
> > > be configured in the devices section of multipath.conf with the
> > > vpd_vendor parameter. Currently, the only devices that use this
> > > are
> > > HPE
> > > 3PAR arrays, to return the Volume Name.
> > > 
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > >  libmultipath/config.c      |  2 ++
> > >  libmultipath/config.h      |  1 +
> > >  libmultipath/dict.c        | 38
> > > ++++++++++++++++++++++++++++++++++++
> > >  libmultipath/discovery.c   | 40
> > > +++++++++++++++++++++++++++++++++++++-
> > >  libmultipath/hwtable.c     |  1 +
> > >  libmultipath/print.c       | 25 ++++++++++++++++++++++++
> > >  libmultipath/propsel.c     | 18 +++++++++++++++++
> > >  libmultipath/propsel.h     |  1 +
> > >  libmultipath/structs.h     | 15 ++++++++++++++
> > >  multipath/multipath.conf.5 |  8 ++++++++
> > >  10 files changed, 148 insertions(+), 1 deletion(-)
> > > 
> > >  struct sg_id {
> > >  	int host_no;
> > >  	int channel;
> > > @@ -255,6 +268,7 @@ struct path {
> > >  	char rev[PATH_REV_SIZE];
> > >  	char serial[SERIAL_SIZE];
> > >  	char tgt_node_name[NODE_NAME_SIZE];
> > > +	char vpd_data[VPD_DATA_SIZE];
> > 
> > Hm, 128 more bytes per path for a rarely-used property... do we
> > need to
> > store this in "struct path"? Can't we simply fetch that information
> > when someone requests it with the %g format wildcard? It doesn't
> > matter
> > much today as "struct path" is really thick already, but I have
> > hopes
> > to make it a bit leaner some day.
> 
> Well, the thought was that this information (which should help map
> the
> path to a physical array) would most often be wanted when things were
> going badly, and you might not be able to access the device.

Point taken. How about using a pointer? That would be only 8 extra
bytes for those cases where the field was unused.
Call me old-fashioned, I've grown up in a world where 1000x128 bytes
mattered.

Regards,
Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 15/17] libmultipath: make directio checker share io contexts
  2020-02-10 23:15     ` Benjamin Marzinski
@ 2020-02-11  8:38       ` Martin Wilck
  2020-02-11 22:21         ` Benjamin Marzinski
  0 siblings, 1 reply; 37+ messages in thread
From: Martin Wilck @ 2020-02-11  8:38 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: development, device-mapper

On Mon, 2020-02-10 at 17:15 -0600, Benjamin Marzinski wrote:
> On Mon, Feb 10, 2020 at 06:08:14PM +0100, Martin Wilck wrote:
> > On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> > > On systems with a large number of cores (>500), io_destroy() can
> > > take
> > > tens to hundreds of milliseconds to complete, due to RCU
> > > synchronization. If there are a large number of paths using the
> > > directio
> > > checker on such a system, this can lead to multipath taking
> > > almost a
> > > minute to complete, with the vast majority of time taken up by
> > > io_destroy().
> > > 
> > > To solve this, the directio checker now allocates one aio context
> > > for
> > > every 1024 checkers. This reduces the io_destroy() delay to a
> > > thousandth
> > > of its previous level. However, this means that muliple checkers
> > > are
> > > sharing the same aio context, and must be able to handle getting
> > > results
> > > for other checkers.  Because only one checker is ever running at
> > > a
> > > time, this doesn't require any locking.  However, locking could
> > > be added
> > > in the future if necessary, to allow multiple checkers to run at
> > > the
> > > same time.
> > > 
> > > When checkers are freed, they usually no longer destroy the io
> > > context.
> > > Instead, they attempt to cancel any outstanding request. If that
> > > fails,
> > > they put the request on an orphan list, so that it can be freed
> > > by other
> > > checkers, once it has completed. IO contexts are only destroyed
> > > at three
> > > times, during reconfigure (to deal with the possibility that
> > > multipathd
> > > is holding more aio events than it needs to be, since there is a
> > > single
> > > limit for the whole system), when the checker class is unloaded,
> > > and
> > > in a corner case when checkers are freed. If an aio_group (which
> > > contains the aio context) is entirely full of orphaned requests,
> > > then
> > > no checker can use it. Since no checker is using it, there is no
> > > checker
> > > to clear out the orphaned requests. In this (likely rare) case,
> > > the
> > > last checker from that group to be freed and leave behind an
> > > orphaned
> > > request will call io_destroy() and remove the group.
> > > 
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > >  libmultipath/checkers/directio.c | 336
> > > +++++++++++++++++++++++++------
> > >  multipath/multipath.conf.5       |   7 +-
> > >  2 files changed, 281 insertions(+), 62 deletions(-)
> > 
> > Although I concur now that your design is sound, I still have some
> > issues, see below.
> > ...
> >  	}
> > >  	ct->running++;
> > 
> > This looks to me as if in the case (ct->running && ct->req->state
> > ==
> > PATH_PENDING), ct->running could become > 1, even though you don't
> > start a new IO. Is that intended? I don't think it matters because
> > you
> > never decrement, but it looks weird.
> 
> That's necessary for how the checker currently works. In async mode
> checker doesn't actually wait for a specific number of seconds (and
> didn't before my patch). It assumes 1 sec call times for pending
> paths,
> and times out after ct->running > timeout_secs. That's why the unit
> tests can get away with simply calling the checker repeatedly,
> without
> waiting, to make it timeout. But I suppose that
> wait_for_pending_paths()
> will also be making the checker time out quicker, so this should
> probably be changed. However, since check_path won't set a paths
> state
> to PATH_PENDING if it wasn't already, it's not really a very likely
> issue to occur.

Bah. I should have realized that, of course. Yet measuring the timeout
in *seconds*, and polling for it the way we currently do, is really 80s
design... I guess I was confused by the use of ns timeout calculation
for the get_events() call (suggesting high-res timing), and the use of
"ct->running" as both indicator of running IO and "abstract run time".

I hope you admit that the logic in check_path() is convoluted and hard
to understand :-/ . For example here:

> 	while(1) {
> 		r = get_events(ct->aio_grp, timep);
> 
> 		if (ct->req->state != PATH_PENDING) {
> 			ct->running = 0;
> 			return ct->req->state;
> 		} else if (r == 0 || !timep)
> 			break;
> 
> 		get_monotonic_time(&currtime);
> 		timespecsub(&endtime, &currtime, &timeout);
> 		if (timeout.tv_sec < 0)
> 			timep = NULL;

We're past the timeout already, having seen some events, just not for
the path we're currently looking at. Why do we go through another
iteration?

> 	}
> 	if (ct->running > timeout_secs || sync) {
>  
> > See comment for get_events() above. Why don't you simply do this?
> > 
> >     timeout.tv_sec = timeout.tv_nsec = 0;
> > 
> 
> Sure.
> 
> So, If I will post a seperate patch that resolves these issues, can
> this
> one have an ack?

Yes. I agree we should move forward.

Regards
Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data
  2020-02-05 18:58 ` [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
  2020-02-10 14:49   ` Martin Wilck
@ 2020-02-11  8:39   ` Martin Wilck
  1 sibling, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-11  8:39 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> This adds the wildcard 'g' for multipath and path formatted printing,
> which returns extra data from a device's vendor specific vpd
> page.  The
> specific vendor vpd page to use, and the vendor/product id to decode
> it
> can be set in the hwentry with vpd_vendor_pg and vpd_vendor_id. It
> can
> be configured in the devices section of multipath.conf with the
> vpd_vendor parameter. Currently, the only devices that use this are
> HPE
> 3PAR arrays, to return the Volume Name.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.c      |  2 ++
>  libmultipath/config.h      |  1 +
>  libmultipath/dict.c        | 38 ++++++++++++++++++++++++++++++++++++
>  libmultipath/discovery.c   | 40
> +++++++++++++++++++++++++++++++++++++-
>  libmultipath/hwtable.c     |  1 +
>  libmultipath/print.c       | 25 ++++++++++++++++++++++++
>  libmultipath/propsel.c     | 18 +++++++++++++++++
>  libmultipath/propsel.h     |  1 +
>  libmultipath/structs.h     | 15 ++++++++++++++
>  multipath/multipath.conf.5 |  8 ++++++++
>  10 files changed, 148 insertions(+), 1 deletion(-)

Acked-by: Martin Wilck <mwilck@suse.com>

(wishing that the memory usage can be decreased in a follow-up patch).

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 15/17] libmultipath: make directio checker share io contexts
  2020-02-05 18:58 ` [PATCH v2 15/17] libmultipath: make directio checker share io contexts Benjamin Marzinski
  2020-02-10 17:08   ` Martin Wilck
@ 2020-02-11  8:42   ` Martin Wilck
  1 sibling, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-11  8:42 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> On systems with a large number of cores (>500), io_destroy() can take
> tens to hundreds of milliseconds to complete, due to RCU
> synchronization. If there are a large number of paths using the
> directio
> checker on such a system, this can lead to multipath taking almost a
> minute to complete, with the vast majority of time taken up by
> io_destroy().
> 
> To solve this, the directio checker now allocates one aio context for
> every 1024 checkers. This reduces the io_destroy() delay to a
> thousandth
> of its previous level. However, this means that muliple checkers are
> sharing the same aio context, and must be able to handle getting
> results
> for other checkers.  Because only one checker is ever running at a
> time, this doesn't require any locking.  However, locking could be
> added
> in the future if necessary, to allow multiple checkers to run at the
> same time.
> 
> When checkers are freed, they usually no longer destroy the io
> context.
> Instead, they attempt to cancel any outstanding request. If that
> fails,
> they put the request on an orphan list, so that it can be freed by
> other
> checkers, once it has completed. IO contexts are only destroyed at
> three
> times, during reconfigure (to deal with the possibility that
> multipathd
> is holding more aio events than it needs to be, since there is a
> single
> limit for the whole system), when the checker class is unloaded, and
> in a corner case when checkers are freed. If an aio_group (which
> contains the aio context) is entirely full of orphaned requests, then
> no checker can use it. Since no checker is using it, there is no
> checker
> to clear out the orphaned requests. In this (likely rare) case, the
> last checker from that group to be freed and leave behind an orphaned
> request will call io_destroy() and remove the group.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/checkers/directio.c | 336 +++++++++++++++++++++++++--
> ----
>  multipath/multipath.conf.5       |   7 +-
>  2 files changed, 281 insertions(+), 62 deletions(-)

Reviewed-by: Martin Wilck <mwilck@suse.com>

(some minor nitpicks to be fixed in follow-up patch)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/17] Multipath patch dump
  2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
                   ` (16 preceding siblings ...)
  2020-02-05 18:58 ` [PATCH v2 17/17] tests: make directio tests able to work on a real device Benjamin Marzinski
@ 2020-02-11  8:45 ` Martin Wilck
  2020-02-11 10:18   ` Martin Wilck
  17 siblings, 1 reply; 37+ messages in thread
From: Martin Wilck @ 2020-02-11  8:45 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

Hi Christophe,

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> This patch set is has multiple parts.
> 
> patch 01 is just a resubmit of my previous patch, with Martin's
> corrections added.
> 
> Patches 02 - 07 are miscellaneous fixes and cleanups
> 
> Patches 08 - 10 are related to adding a new format wildcard, at the
> 	request of HPE, that allows multipath to get and display
> 	information from the vendor specific VPD pages
> 
> Patches 11 - 17 are the part that needs the most review, patch 14
> 	especially. It turns out that on machines with a large number
> 	of cores, io_destroy(), which is used by the directio checker,
> 	can take a long time to complete. Talking to some kernel people
> 	at Red Hat, I was told that this isn't a bug. It's working as
> 	expected, and multipath shouldn't be issuing so many
> 	io_destroy() calls (1 per path, when multipath or multipathd
> 	stops). Cutting out the io_destroy calls involved a major
> change
> 	to the directio checker. It's pretty hard to test a lot of the
> 	corner cases on actual hardware, so I've written a bunch of
> 	unit tests for this (patches 16 & 17).
> 
> Changes is v2:
> 0001-multipathd-warn-when-configuration-has-been-changed.patch
> 	Minor changes, including using a structure instead of an
> 	array to hold the watch descriptors, as suggested by Martin
> 	Wilck.
> 
> 0002-multipathd-staticify-uxlsnr-variables-functions.patch
> 	New patch
> 
> 0008-libmultipath-fix-sgio_get_vpd-looping.patch
> 	Test has been changed to keep create_vpd83 the same, and
> 	overwrite the length in the actual test, as suggested by
> 	Martin Wilck
> 
> 0010-libmultipath-add-code-to-get-vendor-specific-vpd-dat.patch
> 	The information to find the vpd page and how to decode it is
> 	now simply the index in a table of name -> page_nr mappings
> 
> 0012-libmultipath-change-failed-path-prio-timeout.patch
> 	The timeout argument has been renamed to avoid confusion,
> 	as suggested by Martin Wilck
> 
> 0015-libmultipath-make-directio-checker-share-io-contexts.patch
> 	Minor changes to print more useful messages, and avoid
> 	printing anything when we get a non-zero io_cancel()
> 	result, as suggested by Martin Wilck
> 
> 0016-tests-add-directio-unit-tests.patch
> 	Minor change to improve readability, as suggested by
> 	Martin Wilck
> 
> 0017-tests-make-directio-tests-able-to-work-on-a-real-dev.patch
> 	New patch. This adds the ability to set a testing device, so
> 	you can run the directio tests on an actual device
> 
> Benjamin Marzinski (17):
>   multipathd: warn when configuration has been changed.
>   multipathd: staticify uxlsnr variables/functions
>   libmultipath: fix leak in foreign code
>   Fix leak in mpathpersist
>   libmultipath: remove unused path->prio_args
>   libmultipath: constify get_unaligned_be*
>   libmultipath: add missing hwe mpe variable merges
>   libmultipath: fix sgio_get_vpd looping
>   libmultipath: add vend_id to get_vpd_sgio
>   libmultipath: add code to get vendor specific vpd data
>   libmultipath: change how the checker async is set
>   libmultipath: change failed path prio timeout
>   multipathd: add new paths under vecs lock
>   libmultipath: add new checker class functions
>   libmultipath: make directio checker share io contexts
>   tests: add directio unit tests
>   tests: make directio tests able to work on a real device
> 
>  libmpathpersist/mpath_persist.c  |   2 +-
>  libmultipath/checkers.c          |  29 +-
>  libmultipath/checkers.h          |   1 +
>  libmultipath/checkers/directio.c | 336 ++++++++++---
>  libmultipath/config.c            |  10 +
>  libmultipath/config.h            |   2 +
>  libmultipath/dict.c              |  38 ++
>  libmultipath/discovery.c         |  80 ++-
>  libmultipath/discovery.h         |   2 +-
>  libmultipath/foreign.c           |  11 +-
>  libmultipath/hwtable.c           |   1 +
>  libmultipath/print.c             |  25 +
>  libmultipath/prio.c              |   6 +-
>  libmultipath/propsel.c           |  20 +-
>  libmultipath/propsel.h           |   1 +
>  libmultipath/structs.h           |  16 +-
>  libmultipath/unaligned.h         |   4 +-
>  mpathpersist/main.c              |   1 +
>  multipath/main.c                 |   1 +
>  multipath/multipath.conf.5       |  15 +-
>  multipathd/main.c                |  18 +-
>  multipathd/uxlsnr.c              | 150 +++++-
>  tests/Makefile                   |  19 +-
>  tests/directio.c                 | 812
> +++++++++++++++++++++++++++++++
>  tests/directio_test_dev          |   4 +
>  tests/vpd.c                      |  87 ++--
>  26 files changed, 1534 insertions(+), 157 deletions(-)
>  create mode 100644 tests/directio.c
>  create mode 100644 tests/directio_test_dev
> 

ACK for the series: Martin Wilck <mwilck@suse.com>

Ben and I agreed that he'll follow up with some minor fixes, but
there's no reason to hold this back.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/17] libmultipath: fix leak in foreign code
  2020-02-05 18:58 ` [PATCH v2 03/17] libmultipath: fix leak in foreign code Benjamin Marzinski
@ 2020-02-11  9:36   ` Martin Wilck
  2020-02-11 22:47     ` Benjamin Marzinski
  0 siblings, 1 reply; 37+ messages in thread
From: Martin Wilck @ 2020-02-11  9:36 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

Hi Ben, hi Christophe,

On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> If scandir fails or finds no foreign libraries, enable_re needs to be
> freed before exitting.
> 
> Fixes: 8d03eda4 'multipath.conf: add "enable_foreign" parameter'
> Reviewed-by: Martin Wilck <mwilck@suse.com>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/foreign.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)

While trying to merge your series into my tree, I realized that this
patch conflicts with my previously submitted patch "libmultipath:
_init_foreign(): fix possible memory leak"
https://www.redhat.com/archives/dm-devel/2019-October/msg00104.html
which fixed the same issue.

The two patches are almost equal, so I really don't care which one
is eventually taken. I just wanted to alert Christophe about the
conflict.

Anyway, I thought that you'd ACK'd my October 72-part patch series in
the following messages:

https://www.redhat.com/archives/dm-devel/2019-October/msg00214.html
  (Ben: ACK on v2 of 72-part series "multipath-tools: cleanup and
warning enablement", except 16/72 "libmultipath: make path_discovery()
pthread_cancel()-safe")
https://www.redhat.com/archives/dm-devel/2019-November/msg00016.html
  (Ben: ACK on 16/72 "libmultipath: make path_discovery()
pthread_cancel()-safe", after discussion)
https://www.redhat.com/archives/dm-devel/2019-November/msg00085.html
  (Ben: Ack on v2->v3 change, updated patch v3 45/72 "libmultipath: fix
-Wsign-compare warnings with snprintf()")

Please clarify if I misunderstood that and the series still needs work
from your PoV.

Regards
Martin


> 
> diff --git a/libmultipath/foreign.c b/libmultipath/foreign.c
> index 4b34e141..68e9a9b8 100644
> --- a/libmultipath/foreign.c
> +++ b/libmultipath/foreign.c
> @@ -129,7 +129,7 @@ static int _init_foreign(const char
> *multipath_dir, const char *enable)
>  	char pathbuf[PATH_MAX];
>  	struct dirent **di;
>  	struct scandir_result sr;
> -	int r, i;
> +	int r, i, ret = 0;
>  	regex_t *enable_re = NULL;
>  
>  	foreigns = vector_alloc();
> @@ -157,13 +157,15 @@ static int _init_foreign(const char
> *multipath_dir, const char *enable)
>  	if (r == 0) {
>  		condlog(3, "%s: no foreign multipath libraries found",
>  			__func__);
> -		return 0;
> +		ret = 0;
> +		goto out;
>  	} else if (r < 0) {
>  		r = errno;
>  		condlog(1, "%s: error %d scanning foreign multipath
> libraries",
>  			__func__, r);
>  		_cleanup_foreign();
> -		return -r;
> +		ret = -r;
> +		goto out;
>  	}
>  
>  	sr.di = di;
> @@ -250,8 +252,9 @@ static int _init_foreign(const char
> *multipath_dir, const char *enable)
>  		free_foreign(fgn);
>  	}
>  	pthread_cleanup_pop(1); /* free_scandir_result */
> +out:
>  	pthread_cleanup_pop(1); /* free_pre */
> -	return 0;
> +	return ret;
>  }
>  
>  int init_foreign(const char *multipath_dir, const char *enable)

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/17] Multipath patch dump
  2020-02-11  8:45 ` [PATCH v2 00/17] Multipath patch dump Martin Wilck
@ 2020-02-11 10:18   ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-02-11 10:18 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui; +Cc: device-mapper development

Hi Christophe,

On Tue, 2020-02-11 at 09:45 +0100, Martin Wilck wrote:
> 
> ACK for the series: Martin Wilck <mwilck@suse.com>
> 
> Ben and I agreed that he'll follow up with some minor fixes, but
> there's no reason to hold this back.

There are a number of conflicts between Ben's series and my previous
72-part series from October/November. Ben and I will sort this out.

Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 15/17] libmultipath: make directio checker share io contexts
  2020-02-11  8:38       ` Martin Wilck
@ 2020-02-11 22:21         ` Benjamin Marzinski
  0 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-11 22:21 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Tue, Feb 11, 2020 at 09:38:39AM +0100, Martin Wilck wrote:
> On Mon, 2020-02-10 at 17:15 -0600, Benjamin Marzinski wrote:
> > On Mon, Feb 10, 2020 at 06:08:14PM +0100, Martin Wilck wrote:
> > > On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> > > > On systems with a large number of cores (>500), io_destroy() can
> > > > take
> > > > tens to hundreds of milliseconds to complete, due to RCU
> > > > synchronization. If there are a large number of paths using the
> > > > directio
> > > > checker on such a system, this can lead to multipath taking
> > > > almost a
> > > > minute to complete, with the vast majority of time taken up by
> > > > io_destroy().
> > > > 
> > > > To solve this, the directio checker now allocates one aio context
> > > > for
> > > > every 1024 checkers. This reduces the io_destroy() delay to a
> > > > thousandth
> > > > of its previous level. However, this means that muliple checkers
> > > > are
> > > > sharing the same aio context, and must be able to handle getting
> > > > results
> > > > for other checkers.  Because only one checker is ever running at
> > > > a
> > > > time, this doesn't require any locking.  However, locking could
> > > > be added
> > > > in the future if necessary, to allow multiple checkers to run at
> > > > the
> > > > same time.
> > > > 
> > > > When checkers are freed, they usually no longer destroy the io
> > > > context.
> > > > Instead, they attempt to cancel any outstanding request. If that
> > > > fails,
> > > > they put the request on an orphan list, so that it can be freed
> > > > by other
> > > > checkers, once it has completed. IO contexts are only destroyed
> > > > at three
> > > > times, during reconfigure (to deal with the possibility that
> > > > multipathd
> > > > is holding more aio events than it needs to be, since there is a
> > > > single
> > > > limit for the whole system), when the checker class is unloaded,
> > > > and
> > > > in a corner case when checkers are freed. If an aio_group (which
> > > > contains the aio context) is entirely full of orphaned requests,
> > > > then
> > > > no checker can use it. Since no checker is using it, there is no
> > > > checker
> > > > to clear out the orphaned requests. In this (likely rare) case,
> > > > the
> > > > last checker from that group to be freed and leave behind an
> > > > orphaned
> > > > request will call io_destroy() and remove the group.
> > > > 
> > > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > > ---
> > > >  libmultipath/checkers/directio.c | 336
> > > > +++++++++++++++++++++++++------
> > > >  multipath/multipath.conf.5       |   7 +-
> > > >  2 files changed, 281 insertions(+), 62 deletions(-)
> > > 
> > > Although I concur now that your design is sound, I still have some
> > > issues, see below.
> > > ...
> > >  	}
> > > >  	ct->running++;
> > > 
> > > This looks to me as if in the case (ct->running && ct->req->state
> > > ==
> > > PATH_PENDING), ct->running could become > 1, even though you don't
> > > start a new IO. Is that intended? I don't think it matters because
> > > you
> > > never decrement, but it looks weird.
> > 
> > That's necessary for how the checker currently works. In async mode
> > checker doesn't actually wait for a specific number of seconds (and
> > didn't before my patch). It assumes 1 sec call times for pending
> > paths,
> > and times out after ct->running > timeout_secs. That's why the unit
> > tests can get away with simply calling the checker repeatedly,
> > without
> > waiting, to make it timeout. But I suppose that
> > wait_for_pending_paths()
> > will also be making the checker time out quicker, so this should
> > probably be changed. However, since check_path won't set a paths
> > state
> > to PATH_PENDING if it wasn't already, it's not really a very likely
> > issue to occur.
> 
> Bah. I should have realized that, of course. Yet measuring the timeout
> in *seconds*, and polling for it the way we currently do, is really 80s
> design... I guess I was confused by the use of ns timeout calculation
> for the get_events() call (suggesting high-res timing), and the use of
> "ct->running" as both indicator of running IO and "abstract run time".
> 
> I hope you admit that the logic in check_path() is convoluted and hard
> to understand :-/ . For example here:
> 
> > 	while(1) {
> > 		r = get_events(ct->aio_grp, timep);
> > 
> > 		if (ct->req->state != PATH_PENDING) {
> > 			ct->running = 0;
> > 			return ct->req->state;
> > 		} else if (r == 0 || !timep)
> > 			break;
> > 
> > 		get_monotonic_time(&currtime);
> > 		timespecsub(&endtime, &currtime, &timeout);
> > 		if (timeout.tv_sec < 0)
> > 			timep = NULL;
> 
> We're past the timeout already, having seen some events, just not for
> the path we're currently looking at. Why do we go through another
> iteration?

Well, in this case we are past the timeout now, but weren't when
io_getevents() completed, so the code loops one more time to check
(without waiting) if the current path's request has completed. Since we
call io_getevents() after setting the timeout to be all the remaining
time, this should only ever happen if io_getevents() exitted early.
Granted, it likely didn't exit very early (unless the thread was
preempted afterwards), since it is now past the timeout.

-Ben

> 
> > 	}
> > 	if (ct->running > timeout_secs || sync) {
> >  
> > > See comment for get_events() above. Why don't you simply do this?
> > > 
> > >     timeout.tv_sec = timeout.tv_nsec = 0;
> > > 
> > 
> > Sure.
> > 
> > So, If I will post a seperate patch that resolves these issues, can
> > this
> > one have an ack?
> 
> Yes. I agree we should move forward.
> 
> Regards
> Martin
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/17] libmultipath: fix leak in foreign code
  2020-02-11  9:36   ` Martin Wilck
@ 2020-02-11 22:47     ` Benjamin Marzinski
  0 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-02-11 22:47 UTC (permalink / raw)
  To: Martin Wilck; +Cc: device-mapper development

On Tue, Feb 11, 2020 at 10:36:19AM +0100, Martin Wilck wrote:
> Hi Ben, hi Christophe,
> 
> On Wed, 2020-02-05 at 12:58 -0600, Benjamin Marzinski wrote:
> > If scandir fails or finds no foreign libraries, enable_re needs to be
> > freed before exitting.
> > 
> > Fixes: 8d03eda4 'multipath.conf: add "enable_foreign" parameter'
> > Reviewed-by: Martin Wilck <mwilck@suse.com>
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/foreign.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> While trying to merge your series into my tree, I realized that this
> patch conflicts with my previously submitted patch "libmultipath:
> _init_foreign(): fix possible memory leak"
> https://www.redhat.com/archives/dm-devel/2019-October/msg00104.html
> which fixed the same issue.
> 
> The two patches are almost equal, so I really don't care which one
> is eventually taken. I just wanted to alert Christophe about the
> conflict.
> 
> Anyway, I thought that you'd ACK'd my October 72-part patch series in
> the following messages:
> 
> https://www.redhat.com/archives/dm-devel/2019-October/msg00214.html
>   (Ben: ACK on v2 of 72-part series "multipath-tools: cleanup and
> warning enablement", except 16/72 "libmultipath: make path_discovery()
> pthread_cancel()-safe")
> https://www.redhat.com/archives/dm-devel/2019-November/msg00016.html
>   (Ben: ACK on 16/72 "libmultipath: make path_discovery()
> pthread_cancel()-safe", after discussion)
> https://www.redhat.com/archives/dm-devel/2019-November/msg00085.html
>   (Ben: Ack on v2->v3 change, updated patch v3 45/72 "libmultipath: fix
> -Wsign-compare warnings with snprintf()")
> 
> Please clarify if I misunderstood that and the series still needs work
> from your PoV.

No. Your fix is fine. It's my bad. I didn't rebase my branch on top of
your fixes.

-Ben
 
> Regards
> Martin
> 
> 
> > 
> > diff --git a/libmultipath/foreign.c b/libmultipath/foreign.c
> > index 4b34e141..68e9a9b8 100644
> > --- a/libmultipath/foreign.c
> > +++ b/libmultipath/foreign.c
> > @@ -129,7 +129,7 @@ static int _init_foreign(const char
> > *multipath_dir, const char *enable)
> >  	char pathbuf[PATH_MAX];
> >  	struct dirent **di;
> >  	struct scandir_result sr;
> > -	int r, i;
> > +	int r, i, ret = 0;
> >  	regex_t *enable_re = NULL;
> >  
> >  	foreigns = vector_alloc();
> > @@ -157,13 +157,15 @@ static int _init_foreign(const char
> > *multipath_dir, const char *enable)
> >  	if (r == 0) {
> >  		condlog(3, "%s: no foreign multipath libraries found",
> >  			__func__);
> > -		return 0;
> > +		ret = 0;
> > +		goto out;
> >  	} else if (r < 0) {
> >  		r = errno;
> >  		condlog(1, "%s: error %d scanning foreign multipath
> > libraries",
> >  			__func__, r);
> >  		_cleanup_foreign();
> > -		return -r;
> > +		ret = -r;
> > +		goto out;
> >  	}
> >  
> >  	sr.di = di;
> > @@ -250,8 +252,9 @@ static int _init_foreign(const char
> > *multipath_dir, const char *enable)
> >  		free_foreign(fgn);
> >  	}
> >  	pthread_cleanup_pop(1); /* free_scandir_result */
> > +out:
> >  	pthread_cleanup_pop(1); /* free_pre */
> > -	return 0;
> > +	return ret;
> >  }
> >  
> >  int init_foreign(const char *multipath_dir, const char *enable)
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-02-11 22:47 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-05 18:58 [PATCH v2 00/17] Multipath patch dump Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 01/17] multipathd: warn when configuration has been changed Benjamin Marzinski
2020-02-10 14:22   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 02/17] multipathd: staticify uxlsnr variables/functions Benjamin Marzinski
2020-02-10 14:24   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 03/17] libmultipath: fix leak in foreign code Benjamin Marzinski
2020-02-11  9:36   ` Martin Wilck
2020-02-11 22:47     ` Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 04/17] Fix leak in mpathpersist Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 05/17] libmultipath: remove unused path->prio_args Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 06/17] libmultipath: constify get_unaligned_be* Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 07/17] libmultipath: add missing hwe mpe variable merges Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 08/17] libmultipath: fix sgio_get_vpd looping Benjamin Marzinski
2020-02-10 14:35   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 09/17] libmultipath: add vend_id to get_vpd_sgio Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 10/17] libmultipath: add code to get vendor specific vpd data Benjamin Marzinski
2020-02-10 14:49   ` Martin Wilck
2020-02-10 20:49     ` Benjamin Marzinski
2020-02-11  8:09       ` Martin Wilck
2020-02-11  8:39   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 11/17] libmultipath: change how the checker async is set Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 12/17] libmultipath: change failed path prio timeout Benjamin Marzinski
2020-02-10 14:51   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 13/17] multipathd: add new paths under vecs lock Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 14/17] libmultipath: add new checker class functions Benjamin Marzinski
2020-02-05 18:58 ` [PATCH v2 15/17] libmultipath: make directio checker share io contexts Benjamin Marzinski
2020-02-10 17:08   ` Martin Wilck
2020-02-10 23:15     ` Benjamin Marzinski
2020-02-11  8:38       ` Martin Wilck
2020-02-11 22:21         ` Benjamin Marzinski
2020-02-11  8:42   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 16/17] tests: add directio unit tests Benjamin Marzinski
2020-02-10 17:23   ` Martin Wilck
2020-02-05 18:58 ` [PATCH v2 17/17] tests: make directio tests able to work on a real device Benjamin Marzinski
2020-02-10 17:27   ` Martin Wilck
2020-02-11  8:45 ` [PATCH v2 00/17] Multipath patch dump Martin Wilck
2020-02-11 10:18   ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.