All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/20] Various multipath-tools fixes
@ 2018-01-13 21:19 Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 01/20] libmultipath: don't try to set hwhandler if it is retained Martin Wilck
                   ` (21 more replies)
  0 siblings, 22 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Hi Christophe,

here is v2 of my last series, rebased onto your latest merges.

Changes wrt v1:

  - dropped already merged CLOCK_MONOTONIC patch
  - added a compiler warning fix
  - improved placement of NetApp entry in hwtable (Xose Vazquez Perez, 09/20)
  - added two SUSE patches that were missing in the previous series
    (15/20, 16/20; Xose Vazquez Perez)
  - added the re-submission of my path latency prio fixes from November (17-20/20)
  - fixed wrong calculation of standard deviation (Guan Junxiong, 19/20)

I added 16/20 at Xose's request, knowing that it may be a bit more
controversial than the rest. At SUSE we've currently added it only for
SLE-12 SP2, because SLES12-SP2 GA had DefaultTaskMax=512, which is much
too low for multipathd. Current systemd versions have increased this to
4915, which would be sufficient for most but not all multipath
deployments (https://github.com/systemd/systemd/issues/3211).

Regards,
Martin

Martin Wilck (20):
  libmultipath: don't try to set hwhandler if it is retained
  multipath: delegate dangerous commands to multipathd
  libmultipath: condlog: log to stderr
  libmultipath: fix return code of sysfs_get_timeout
  libmultipath: fix return code of sgio_get_vpd()
  libmultipath: sgio_get_vpd: add page argument
  libmultipath: get_vpd_sgio: support VPD 0xc9
  libmultipath: select ALUA prioritizer for RDAC arrays only
  libmultipath: hwtable: multibus for NetApp NVMe-FC
  multipath -C: decrease log level
  kpartx.rules: fix by-id/scsi-* for user_friendly_names
  multipathd.socket: add WantedBy=sockets.target
  multipathd.service: drop Before=lvm2-lvmetad.service
  multipathd: fix compiler warning for uev_pathfail_check
  test-kpartx: add test for mapping without UUID
  multipathd.service: set TasksMax=infinity
  libmultipath: path latency: fix default base num
  libmultipath: path latency: log threshold with p2
  libmultipath: path latency: simplify getprio()
  libmultipath: path latency: remove warnings

 Makefile.inc                             |   2 +-
 kpartx/dm-parts.rules                    |   4 +-
 kpartx/kpartx_id                         |   8 ++-
 kpartx/test-kpartx                       |  11 ++-
 libmultipath/configure.c                 |   8 ++-
 libmultipath/debug.c                     |   4 +-
 libmultipath/discovery.c                 |  16 +++--
 libmultipath/hwtable.c                   |  37 +++++++---
 libmultipath/prioritizers/path_latency.c | 112 ++++++++++---------------------
 libmultipath/propsel.c                   |  37 +++++++++-
 multipath/11-dm-mpath.rules              |   3 +-
 multipath/main.c                         |  66 +++++++++++++++++-
 multipathd/main.c                        |   4 +-
 multipathd/multipathd.service            |   3 +-
 multipathd/multipathd.socket             |   3 +
 15 files changed, 210 insertions(+), 108 deletions(-)

-- 
2.15.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 01/20] libmultipath: don't try to set hwhandler if it is retained
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 02/20] multipath: delegate dangerous commands to multipathd Martin Wilck
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Setting a device handler only works if retain_attached_hw_handler
is 'no', or if the kernel didn't auto-assign a handler. If this
is not the case, don't even attempt to set a different handler.

This requires reading the sysfs "dh_state" path attribute.
---
 libmultipath/configure.c |  8 +++++++-
 libmultipath/propsel.c   | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index 22c11a26c284..13e14cc25fff 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -277,15 +277,21 @@ int setup_map(struct multipath *mpp, char *params, int params_size)
 
 	/*
 	 * properties selectors
+	 *
+	 * Ordering matters for some properties:
+	 * - features after no_path_retry and retain_hwhandler
+	 * - hwhandler after retain_hwhandler
+	 * No guarantee that this list is complete, check code in
+	 * propsel.c if in doubt.
 	 */
 	conf = get_multipath_config();
 	select_pgfailback(conf, mpp);
 	select_pgpolicy(conf, mpp);
 	select_selector(conf, mpp);
-	select_hwhandler(conf, mpp);
 	select_no_path_retry(conf, mpp);
 	select_retain_hwhandler(conf, mpp);
 	select_features(conf, mpp);
+	select_hwhandler(conf, mpp);
 	select_rr_weight(conf, mpp);
 	select_minio(conf, mpp);
 	select_mode(conf, mpp);
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index be371fcc35bc..4db0913a95a8 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -19,9 +19,11 @@
 #include "discovery.h"
 #include "dict.h"
 #include "util.h"
+#include "sysfs.h"
 #include "prioritizers/alua_rtpg.h"
 #include "prkey.h"
 #include <inttypes.h>
+#include <libudev.h>
 
 pgpolicyfn *pgpolicies[] = {
 	NULL,
@@ -353,9 +355,42 @@ out:
 	return 0;
 }
 
+static int get_dh_state(struct path *pp, char *value, size_t value_len)
+{
+	struct udev_device *ud;
+
+	if (pp->udev == NULL)
+		return -1;
+
+	ud = udev_device_get_parent_with_subsystem_devtype(
+		pp->udev, "scsi", "scsi_device");
+	if (ud == NULL)
+		return -1;
+
+	return sysfs_attr_get_value(ud, "dh_state", value, value_len);
+}
+
 int select_hwhandler(struct config *conf, struct multipath *mp)
 {
 	char *origin;
+	struct path *pp;
+	/* dh_state is no longer than "detached" */
+	char handler[12];
+	char *dh_state;
+	int i;
+
+	dh_state = &handler[2];
+	if (mp->retain_hwhandler != RETAIN_HWHANDLER_OFF) {
+		vector_foreach_slot(mp->paths, pp, i) {
+			if (get_dh_state(pp, dh_state, sizeof(handler) - 2) > 0
+			    && strcmp(dh_state, "detached")) {
+				memcpy(handler, "1 ", 2);
+				mp->hwhandler = handler;
+				origin = "(setting: retained by kernel driver)";
+				goto out;
+			}
+		}
+	}
 
 	mp_set_hwe(hwhandler);
 	mp_set_conf(hwhandler);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 02/20] multipath: delegate dangerous commands to multipathd
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 01/20] libmultipath: don't try to set hwhandler if it is retained Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 03/20] libmultipath: condlog: log to stderr Martin Wilck
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Some multipath commands are dangerous to run while multipathd is running.
For example, "multipath -r" may apply a modified configuration to the kernel,
while multipathd is still using the old configuration, leading to
inconsistent state between multipathd and the kernel.

It is safer to use equivalent multipathd client commands instead.
For now, do this only for "multipath -r", but other invocations
may be added in the future. Perhaps some day, all "multipath"
commands will be mapped to multipathd actions.

Note that with delegation, "multipath -r" will not produce any
terminal output, so this may affect users who capture "multipath -r"
output for parsing it. It would be very hard to produce compatible output
to the normal multipath command for different verbosity levels. I
considered running "show topology" after "reconfigure", but the output
would have slightly different format and would only match -v2, anyway.

I plan to convert more multipath commands, but that needs some more
thought. Some additional multipathd client commands need to be
implemented first.

Changes wrt v2:
 - use libmpathcmd rather than exec'ing multipathd (Ben Marzinski)
 - pass more parameters from main program, preparing for other commands
---
 Makefile.inc     |  2 +-
 multipath/main.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/Makefile.inc b/Makefile.inc
index d953f5ef34e6..d82d3b5df3fe 100644
--- a/Makefile.inc
+++ b/Makefile.inc
@@ -90,7 +90,7 @@ OPTFLAGS	= -O2 -g -pipe -Wall -Wextra -Wformat=2 -Werror=implicit-int \
 		  -Wp,-D_FORTIFY_SOURCE=2 $(STACKPROT) \
 		  --param=ssp-buffer-size=4
 
-CFLAGS		:= $(OPTFLAGS) -DLIB_STRING=\"${LIB}\" -DRUN_DIR=\"${RUN}\" \
+CFLAGS		:= $(OPTFLAGS) -DBIN_DIR=\"$(bindir)\" -DLIB_STRING=\"${LIB}\" -DRUN_DIR=\"${RUN}\" \
 		   $(CFLAGS)
 BIN_CFLAGS	= -fPIE -DPIE
 LIB_CFLAGS	= -fPIC
diff --git a/multipath/main.c b/multipath/main.c
index bffe0653d905..25162a0e9a7c 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -578,6 +578,64 @@ get_dev_type(char *dev) {
 	return DEV_NONE;
 }
 
+/*
+ * Some multipath commands are dangerous to run while multipathd is running.
+ * For example, "multipath -r" may apply a modified configuration to the kernel,
+ * while multipathd is still using the old configuration, leading to
+ * inconsistent state.
+ *
+ * It is safer to use equivalent multipathd client commands instead.
+ */
+int delegate_to_multipathd(enum mpath_cmds cmd, const char *dev,
+			   enum devtypes dev_type, const struct config *conf)
+{
+	int fd;
+	char command[1024], *p, *reply;
+	int n, r = 0;
+
+	fd = mpath_connect();
+	if (fd == -1)
+		return 0;
+
+	p = command;
+	*p = '\0';
+	n = sizeof(command);
+
+	if (cmd == CMD_CREATE && conf->force_reload == FORCE_RELOAD_YES) {
+		p += snprintf(p, n, "reconfigure");
+	}
+	/* Add other translations here */
+
+	if (strlen(command) == 0)
+		/* No command found, no need to delegate */
+		return 0;
+	else if (p >= command + sizeof(command)) {
+		condlog(0, "internal error - command buffer overflow");
+		r = -1;
+		goto out;
+	}
+
+	condlog(3, "delegating command to multipathd");
+	r = mpath_process_cmd(fd, command, &reply, conf->uxsock_timeout);
+
+	if (r == -1) {
+		condlog(1, "error in multipath command %s: %s",
+			command, strerror(errno));
+		goto out;
+	}
+
+	if (reply != NULL && *reply != '\0' && strcmp(reply, "ok\n"))
+		printf("%s", reply);
+	r = 1;
+
+out:
+	FREE(reply);
+	close(fd);
+	if (r < 0)
+		exit(1);
+	return r;
+}
+
 int
 main (int argc, char *argv[])
 {
@@ -781,10 +839,15 @@ main (int argc, char *argv[])
 		} else
 			mpath_disconnect(fd);
 	}
+
 	if (cmd == CMD_REMOVE_WWID && !dev) {
 		condlog(0, "the -w option requires a device");
 		goto out;
 	}
+
+	if (delegate_to_multipathd(cmd, dev, dev_type, conf))
+		exit(0);
+
 	if (cmd == CMD_RESET_WWIDS) {
 		struct multipath * mpp;
 		int i;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 03/20] libmultipath: condlog: log to stderr
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 01/20] libmultipath: don't try to set hwhandler if it is retained Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 02/20] multipath: delegate dangerous commands to multipathd Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 04/20] libmultipath: fix return code of sysfs_get_timeout Martin Wilck
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Calling 'multipath' might result in various error messages, all
of which should be directed to stderr.
Having them intermixed with the actual output on stdout makes
parsing really hard.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libmultipath/debug.c b/libmultipath/debug.c
index f89b26443ea5..f95a3e5f97f5 100644
--- a/libmultipath/debug.c
+++ b/libmultipath/debug.c
@@ -37,9 +37,9 @@ void dlog (int sink, int prio, const char * fmt, ...)
 					 "%b %d %H:%M:%S", tb);
 				buff[sizeof(buff)-1] = '\0';
 
-				fprintf(stdout, "%s | ", buff);
+				fprintf(stderr, "%s | ", buff);
 			}
-			vfprintf(stdout, fmt, ap);
+			vfprintf(stderr, fmt, ap);
 		}
 		else
 			log_safe(prio + 3, fmt, ap);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 04/20] libmultipath: fix return code of sysfs_get_timeout
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (2 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 03/20] libmultipath: condlog: log to stderr Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 05/20] libmultipath: fix return code of sgio_get_vpd() Martin Wilck
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

This function should return 1 on success, as the other sysfs_get_XXX
functions. The callers actually expect this - therefore timeout values
from sysfs are never used.
---
 libmultipath/discovery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 4b31ddef23f0..ab7794d8d2e1 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -276,7 +276,7 @@ sysfs_get_timeout(struct path *pp, unsigned int *timeout)
 	}
 	*timeout = t;
 
-	return 0;
+	return 1;
 }
 
 int
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 05/20] libmultipath: fix return code of sgio_get_vpd()
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (3 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 04/20] libmultipath: fix return code of sysfs_get_timeout Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 06/20] libmultipath: sgio_get_vpd: add page argument Martin Wilck
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

This function must return the length of data received in success
case.
---
 libmultipath/discovery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index ab7794d8d2e1..c53d3aab897f 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -852,7 +852,7 @@ retry:
 			return len;
 		if (len > DEFAULT_SGIO_LEN)
 			goto retry;
-		return 0;
+		return len;
 	}
 	return -1;
 }
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 06/20] libmultipath: sgio_get_vpd: add page argument
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (4 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 05/20] libmultipath: fix return code of sgio_get_vpd() Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 07/20] libmultipath: get_vpd_sgio: support VPD 0xc9 Martin Wilck
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

get_vpd_sgio() assumes to be able to send different VPD inquires.
This requires passing the pg argument to sgio_get_vpd().
---
 libmultipath/discovery.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index c53d3aab897f..43631c46f21e 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -837,7 +837,7 @@ detect_alua(struct path * pp, struct config *conf)
 #define DEFAULT_SGIO_LEN 254
 
 static int
-sgio_get_vpd (unsigned char * buff, int maxlen, int fd)
+sgio_get_vpd (unsigned char * buff, int maxlen, int fd, int pg)
 {
 	int len = DEFAULT_SGIO_LEN;
 
@@ -846,7 +846,7 @@ sgio_get_vpd (unsigned char * buff, int maxlen, int fd)
 		return -1;
 	}
 retry:
-	if (0 == do_inq(fd, 0, 1, 0x83, buff, len)) {
+	if (0 == do_inq(fd, 0, 1, pg, buff, len)) {
 		len = buff[3] + (buff[2] << 8);
 		if (len >= maxlen)
 			return len;
@@ -1099,7 +1099,7 @@ get_vpd_sgio (int fd, int pg, char * str, int maxlen)
 	unsigned char buff[4096];
 
 	memset(buff, 0x0, 4096);
-	if (sgio_get_vpd(buff, 4096, fd) <= 0) {
+	if (sgio_get_vpd(buff, 4096, fd, pg) <= 0) {
 		condlog(3, "failed to issue vpd inquiry for pg%02x",
 			pg);
 		return -errno;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 07/20] libmultipath: get_vpd_sgio: support VPD 0xc9
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (5 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 06/20] libmultipath: sgio_get_vpd: add page argument Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 08/20] libmultipath: select ALUA prioritizer for RDAC arrays only Martin Wilck
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

This VPD is needed to check for NetApp E-Series RDAC support.
---
 libmultipath/discovery.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 43631c46f21e..8ae170ee3a27 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -1118,7 +1118,11 @@ get_vpd_sgio (int fd, int pg, char * str, int maxlen)
 		len = parse_vpd_pg80(buff, str, maxlen);
 	else if (pg == 0x83)
 		len = parse_vpd_pg83(buff, buff_len, str, maxlen);
-	else
+	else if (pg == 0xc9 && maxlen >= 8) {
+		len = buff_len < 8 ? -ENODATA :
+			(buff_len <= maxlen ? buff_len : maxlen);
+		memcpy (str, buff, len);
+	} else
 		len = -ENOSYS;
 
 	return len;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 08/20] libmultipath: select ALUA prioritizer for RDAC arrays only
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (6 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 07/20] libmultipath: get_vpd_sgio: support VPD 0xc9 Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 09/20] libmultipath: hwtable: multibus for NetApp NVMe-FC Martin Wilck
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Since commit 7e2f46d3, multipathd with "detect_prio" setting (=default)
chooses ALUA prioritizer rather than sysfs prioritizer for arrays with
implicit TPGS. But the intention of that patch was to choose ALUA for
NetApp E-Series (RDAC) storage arrays *only*, not for every array
with implicit TPGS.
---
 libmultipath/propsel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 4db0913a95a8..58a6a42fe333 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -497,7 +497,7 @@ detect_prio(struct config *conf, struct path * pp)
 
 	if (pp->tpgs <= 0)
 		return;
-	if (pp->tpgs == 2 && !check_rdac(pp)) {
+	if (pp->tpgs == 2 || !check_rdac(pp)) {
 		if (sysfs_get_asymmetric_access_state(pp, buff, 512) >= 0)
 			default_prio = PRIO_SYSFS;
 	}
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 09/20] libmultipath: hwtable: multibus for NetApp NVMe-FC
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (7 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 08/20] libmultipath: select ALUA prioritizer for RDAC arrays only Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-19 11:52   ` [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC" Martin Wilck
  2018-01-23 17:07   ` [FIX PATCH v2 09/20] libmultipath: hwtable: no_path_retry="queue" for NetApp NVMe Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 10/20] multipath -C: decrease log level Martin Wilck
                   ` (12 subsequent siblings)
  21 siblings, 2 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Use multibus policy for NetApp NVMe-FC namespace controllers.
The search logic in find_hwe() looks for vendor/product matches backwards, and
quits if a match is found. Therefore specific sub-entries of a generic entry
have to be listed below the generic ones. Therefore, pull the generic NVME
entry with ".*" product name match on top. The NetApp-specific one is put
into the NetApp section. This way, more vendor-specific exceptions for NVME
may be added later.
---
 libmultipath/hwtable.c | 37 +++++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 448effe3c859..1cde60a6a41d 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -77,6 +77,20 @@
 #endif
 
 static struct hwentry default_hw[] = {
+       /*
+	* Generic NVMe devices
+	*
+	* Due to the parsing logic in find_hwe(), generic entries
+	* have to be put on top of this list, and more specific ones
+	* below.
+	*/
+	{
+		.vendor        = "NVME",
+		.product       = ".*",
+		.uid_attribute = "ID_WWN",
+		.checker_name  = NONE,
+		.retain_hwhandler = RETAIN_HWHANDLER_OFF,
+	},
 	/*
 	 * Apple
 	 *
@@ -666,6 +680,19 @@ static struct hwentry default_hw[] = {
 		.pgpolicy      = MULTIBUS,
 		.no_path_retry = 24,
 	},
+	/*
+	 * NetApp NVMe-FC namespace devices: MULTIBUS preferred
+	 *
+	 * The table is searched backwards, so place this after generic NVMe
+	 */
+	{
+		.vendor	       = "NVME",
+		.product       = "(NetApp |)ONTAP Controller)",
+		.uid_attribute = "ID_WWN",
+		.checker_name  = NONE,
+		.pgpolicy      = MULTIBUS,
+		.retain_hwhandler = RETAIN_HWHANDLER_OFF,
+	},
 	/*
 	 * Nexenta
 	 *
@@ -1133,16 +1160,6 @@ static struct hwentry default_hw[] = {
 		.prio_name     = PRIO_ALUA,
 		.no_path_retry = 30,
 	},
-	/*
-	 * Generic NVMe devices
-	 */
-	{
-		.vendor        = "NVME",
-		.product       = ".*",
-		.uid_attribute = "ID_WWN",
-		.checker_name  = NONE,
-		.retain_hwhandler = RETAIN_HWHANDLER_OFF,
-	},
 	/*
 	 * Dot Hill Systems - Seagate Technology
 	 */
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 10/20] multipath -C: decrease log level
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (8 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 09/20] libmultipath: hwtable: multibus for NetApp NVMe-FC Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 11/20] kpartx.rules: fix by-id/scsi-* for user_friendly_names Martin Wilck
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Avoid that the positive case ("usable paths found") spams the
syslog on systems with many LUNs.
---
 multipath/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/multipath/main.c b/multipath/main.c
index 25162a0e9a7c..52bf1658bbca 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -325,7 +325,8 @@ static int check_usable_paths(struct config *conf,
 		}
 	}
 found:
-	condlog(2, "%s:%s usable paths found", devpath, r == 0 ? "" : " no");
+	condlog(r == 0 ? 3 : 2, "%s:%s usable paths found",
+		devpath, r == 0 ? "" : " no");
 free:
 	FREE(mapname);
 	free_multipath(mpp, FREE_PATHS);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 11/20] kpartx.rules: fix by-id/scsi-* for user_friendly_names
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (9 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 10/20] multipath -C: decrease log level Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 12/20] multipathd.socket: add WantedBy=sockets.target Martin Wilck
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

With user_friendly names (or generally, with aliases), the udev
rules create /dev/disk/by-id/scsi-${ALIAS} rather than
/dev/disk/by-id/scsi-${ID_SERIAL}. Fix that.
---
 kpartx/dm-parts.rules       | 4 ++--
 kpartx/kpartx_id            | 8 +++++++-
 multipath/11-dm-mpath.rules | 3 ++-
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/kpartx/dm-parts.rules b/kpartx/dm-parts.rules
index 235642fd9ae3..b48b67c809d5 100644
--- a/kpartx/dm-parts.rules
+++ b/kpartx/dm-parts.rules
@@ -31,8 +31,8 @@ ENV{DM_UDEV_LOW_PRIORITY_FLAG}!="1", OPTIONS+="link_priority=50"
 IMPORT{program}=="kpartx_id %M %m $env{DM_UUID}"
 
 # DM_TYPE only has a reasonable value for partitions on multipath.
-ENV{DM_UUID}=="*-mpath-*", ENV{DM_TYPE}=="?*" \
-	SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_NAME}"
+ENV{DM_UUID}=="*-mpath-*", ENV{DM_TYPE}=="?*", ENV{DM_SERIAL}=="?*" \
+	SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_SERIAL}-part$env{DM_PART}"
 ENV{DM_WWN}=="?*", ENV{DM_PART}=="?*", \
 	SYMLINK+="disk/by-id/wwn-$env{DM_WWN}-part$env{DM_PART}"
 
diff --git a/kpartx/kpartx_id b/kpartx/kpartx_id
index b7f802d9b455..c45db2f80489 100755
--- a/kpartx/kpartx_id
+++ b/kpartx/kpartx_id
@@ -42,6 +42,7 @@ fi
 dmuuid=${UUID#*-}
 dmtbl=${UUID%%-*}
 dmpart=${dmtbl#part}
+dmserial=
 # kpartx types are 'part<num>'
 if [ "$dmpart" = "$dmtbl" ] ; then
     dmpart=
@@ -59,10 +60,12 @@ if [ "$dmtbl" = "part" ] ; then
     case "$dmuuid" in
 	mpath-*)
 	    dmdeps=$($DMSETUP deps -u $dmuuid)
+	    dmserial=${dmuuid#mpath-}
 	    ;;
     esac
 elif [ "$dmtbl" = "mpath" ] ; then
     dmname="$dmuuid"
+    dmserial="$dmuuid"
     # We need the dependencies of the table to figure out the type
     dmdeps=$($DMSETUP deps -u $UUID)
 fi
@@ -84,11 +87,14 @@ if [ -n "$dmdeps" ] ; then
 	    ;;
 	*)
 	    echo "DM_TYPE=scsi"
-	    echo "DM_WWN=0x${dmname#?}"
+	    echo "DM_WWN=0x${dmserial#?}"
 	    ;;
     esac
 else
     echo "DM_TYPE=raid"
 fi
+if [[ $dmserial ]]; then
+    echo "DM_SERIAL=$dmserial"
+fi
 
 exit 0
diff --git a/multipath/11-dm-mpath.rules b/multipath/11-dm-mpath.rules
index 91d805877f0c..03ac5da22957 100644
--- a/multipath/11-dm-mpath.rules
+++ b/multipath/11-dm-mpath.rules
@@ -100,7 +100,8 @@ ENV{DM_UDEV_LOW_PRIORITY_FLAG}!="1", OPTIONS+="link_priority=50"
 TEST=="/usr/lib/udev/kpartx_id", \
 	IMPORT{program}=="kpartx_id %M %m $env{DM_UUID}"
 
-ENV{DM_TYPE}=="?*", SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_NAME}"
+ENV{DM_TYPE}=="?*", ENV{DM_SERIAL}=="?*", \
+	SYMLINK+="disk/by-id/$env{DM_TYPE}-$env{DM_SERIAL}"
 ENV{DM_WWN}=="?*", SYMLINK+="disk/by-id/wwn-$env{DM_WWN}"
 
 LABEL="mpath_end"
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 12/20] multipathd.socket: add WantedBy=sockets.target
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (10 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 11/20] kpartx.rules: fix by-id/scsi-* for user_friendly_names Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 13/20] multipathd.service: drop Before=lvm2-lvmetad.service Martin Wilck
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Without this, "systemctl enable multipathd.service" spits out
an error message.
---
 multipathd/multipathd.socket | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/multipathd/multipathd.socket b/multipathd/multipathd.socket
index 921706d5d333..0ed4a1f7958b 100644
--- a/multipathd/multipathd.socket
+++ b/multipathd/multipathd.socket
@@ -5,3 +5,6 @@ Before=sockets.target
 
 [Socket]
 ListenStream=@/org/kernel/linux/storage/multipathd
+
+[Install]
+WantedBy=sockets.target
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 13/20] multipathd.service: drop Before=lvm2-lvmetad.service
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (11 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 12/20] multipathd.socket: add WantedBy=sockets.target Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 14/20] multipathd: fix compiler warning for uev_pathfail_check Martin Wilck
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

lvm2-lvmetad does not depend on multipathd. On the contrary,
when multipathd sets up a map that contains LVM PVs, it's good
if lvm2-lvmetad is already running.
---
 multipathd/multipathd.service | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/multipathd/multipathd.service b/multipathd/multipathd.service
index fd66cf62af17..0cf99b2f2df5 100644
--- a/multipathd/multipathd.service
+++ b/multipathd/multipathd.service
@@ -1,7 +1,7 @@
 [Unit]
 Description=Device-Mapper Multipath Device Controller
 Wants=systemd-udev-trigger.service systemd-udev-settle.service
-Before=iscsi.service iscsid.service lvm2-lvmetad.service lvm2-activation-early.service
+Before=iscsi.service iscsid.service lvm2-activation-early.service
 Before=local-fs-pre.target blk-availability.service
 After=multipathd.socket systemd-udev-trigger.service systemd-udev-settle.service
 DefaultDependencies=no
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 14/20] multipathd: fix compiler warning for uev_pathfail_check
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (12 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 13/20] multipathd.service: drop Before=lvm2-lvmetad.service Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 15/20] test-kpartx: add test for mapping without UUID Martin Wilck
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

gcc7 spits out an indentation warning for this function.

Fixes: 8392431 "multipath-tools: check null path before handle path-failed event"
---
 multipathd/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 27cf234623d0..ccbb0ad13d32 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1020,8 +1020,8 @@ uev_pathfail_check(struct uevent *uev, struct vectors *vecs)
 	lock(&vecs->lock);
 	pthread_testcancel();
 	pp = find_path_by_devt(vecs->pathvec, devt);
-    if (!pp)
-        goto out_lock;
+	if (!pp)
+		goto out_lock;
 	r = io_err_stat_handle_pathfail(pp);
 	if (r)
 		condlog(3, "io_err_stat: %s: cannot handle pathfail uevent",
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 15/20] test-kpartx: add test for mapping without UUID
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (13 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 14/20] multipathd: fix compiler warning for uev_pathfail_check Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 16/20] multipathd.service: set TasksMax=infinity Martin Wilck
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

kpartx shouldn't remove these mappings.
---
 kpartx/test-kpartx | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kpartx/test-kpartx b/kpartx/test-kpartx
index 60b3eb231de2..09d15a9458ea 100755
--- a/kpartx/test-kpartx
+++ b/kpartx/test-kpartx
@@ -131,7 +131,7 @@ step "create DM devices (spans)"
 # They also serve as DM devices to test partition removal on those.
 
 TABLE="\
-0 $((SIZE/SECTSIZ-OFFS)) linear $DEV1 $OFFS
+0 $((SIZE/SECTSIZ-OFFS)) linear $DEV1 $OFFS 
 $((SIZE/SECTSIZ-OFFS)) $((SIZE/SECTSIZ-OFFS)) linear $DEV2 $OFFS"
 
 SPAN1=kpt
@@ -142,9 +142,17 @@ push_cleanup 'dmsetup remove -f $SPAN1'
 dmsetup create $SPAN2 <<<"$TABLE"
 push_cleanup 'dmsetup remove -f $SPAN2'
 
+# This is a non-kpartx pseudo "partition" mapping
+USER1=user1
+push_cleanup 'dmsetup remove -f $USER1'
+dmsetup create $USER1 <<EOF
+0 $((SIZE/SECTSIZ-OFFS)) linear $DEV1 $OFFS
+EOF
+
 usleep $WAIT_US
 [[ -b /dev/mapper/$SPAN1 ]]
 [[ -b /dev/mapper/$SPAN2 ]]
+[[ -b /dev/mapper/$USER1 ]]
 
 step "create vg on $LO3"
 # On the 3rd loop device, we create a VG and an LV
@@ -290,6 +298,7 @@ pop_cleanup
 # spans should not have been removed
 [[ -b /dev/mapper/$SPAN1 ]]
 [[ -b /dev/mapper/$SPAN2 ]]
+[[ -b /dev/mapper/$USER1 ]]
 # LVs neither
 [[ -b /dev/mapper/$VG-$LV ]]
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 16/20] multipathd.service: set TasksMax=infinity
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (14 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 15/20] test-kpartx: add test for mapping without UUID Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-01-13 21:19 ` [PATCH v2 17/20] libmultipath: path latency: fix default base num Martin Wilck
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

Systemd 228 introduced a global task limit of 512 for system services.
System 231 changed this to 15% of pid_max, or 4915.
In particular the one of systemd 228 is too low for multipathd with
many LUNs. Use TasksMax=infinity to overcome this limitation.
---
 multipathd/multipathd.service | 1 +
 1 file changed, 1 insertion(+)

diff --git a/multipathd/multipathd.service b/multipathd/multipathd.service
index 0cf99b2f2df5..ba24983ef291 100644
--- a/multipathd/multipathd.service
+++ b/multipathd/multipathd.service
@@ -16,6 +16,7 @@ LimitCORE=infinity
 ExecStartPre=-/sbin/modprobe -a scsi_dh_alua scsi_dh_emc scsi_dh_rdac dm-multipath
 ExecStart=/sbin/multipathd -d -s
 ExecReload=/sbin/multipathd reconfigure
+TasksMax=infinity
 
 [Install]
 WantedBy=sysinit.target
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 17/20] libmultipath: path latency: fix default base num
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (15 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 16/20] multipathd.service: set TasksMax=infinity Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-02-08 12:03   ` Guan Junxiong
  2018-01-13 21:19 ` [PATCH v2 18/20] libmultipath: path latency: log threshold with p2 Martin Wilck
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

I don't think anyone can measure latency to 1% accuracy. It's
better to not even pretend to be able to. 10% should be fine
even for the most latency-critical environments.
---
 libmultipath/prioritizers/path_latency.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
index 9d5397ec1b3a..b8c5bc7c50a4 100644
--- a/libmultipath/prioritizers/path_latency.c
+++ b/libmultipath/prioritizers/path_latency.c
@@ -42,8 +42,9 @@
 #define DEF_IO_NUM		100
 
 #define MAX_BASE_NUM		10
-#define MIN_BASE_NUM		1.01
-#define DEF_BASE_NUM		1.5
+#define MIN_BASE_NUM		1.1
+// This is 10**(1/4). 4 prio steps correspond to a factor of 10.
+#define DEF_BASE_NUM		1.77827941004
 
 #define MAX_AVG_LATENCY		100000000.	/* Unit: us */
 #define MIN_AVG_LATENCY		1.		/* Unit: us */
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 18/20] libmultipath: path latency: log threshold with p2
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (16 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 17/20] libmultipath: path latency: fix default base num Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-02-08 12:04   ` Guan Junxiong
  2018-01-13 21:19 ` [PATCH v2 19/20] libmultipath: path latency: simplify getprio() Martin Wilck
                   ` (3 subsequent siblings)
  21 siblings, 1 reply; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

This is not a critical error. It just means that the path in
question will have low priority (rightly so, if it has >100s latency).
---
 libmultipath/prioritizers/path_latency.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
index b8c5bc7c50a4..ce5c95dd7075 100644
--- a/libmultipath/prioritizers/path_latency.c
+++ b/libmultipath/prioritizers/path_latency.c
@@ -299,7 +299,7 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
 			(long long)pow(base_num, lg_avglatency));
 
 	if (lg_avglatency > lg_maxavglatency) {
-		pp_pl_log(0,
+		pp_pl_log(2,
 			  "%s: average latency (%lld us) is outside the thresold (%lld us)",
 			  pp->dev, (long long)pow(base_num, lg_avglatency),
 			  (long long)MAX_AVG_LATENCY);
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 19/20] libmultipath: path latency: simplify getprio()
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (17 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 18/20] libmultipath: path latency: log threshold with p2 Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-02-08 12:05   ` Guan Junxiong
  2018-01-13 21:19 ` [PATCH v2 20/20] libmultipath: path latency: remove warnings Martin Wilck
                   ` (2 subsequent siblings)
  21 siblings, 1 reply; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

The log standard deviation can be calculated much more simply
by realizing

   sum_n (x_i - avg(x))^2 == sum_n x_i^2 - n * avg(x)^2

Also, use timespecsub rather than the custom timeval_to_usec,
and avoid taking log(0).
---
 libmultipath/prioritizers/path_latency.c | 71 ++++++++++++++------------------
 1 file changed, 30 insertions(+), 41 deletions(-)

diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
index ce5c95dd7075..e764f1dd8a21 100644
--- a/libmultipath/prioritizers/path_latency.c
+++ b/libmultipath/prioritizers/path_latency.c
@@ -34,6 +34,7 @@
 #include "prio.h"
 #include "structs.h"
 #include "util.h"
+#include "time-util.h"
 
 #define pp_pl_log(prio, fmt, args...) condlog(prio, "path_latency prio: " fmt, ##args)
 
@@ -56,14 +57,6 @@
 
 #define DEF_BLK_SIZE		4096
 
-static double lg_path_latency[MAX_IO_NUM];
-
-static inline long long timeval_to_us(const struct timespec *tv)
-{
-	return ((long long)tv->tv_sec * USEC_PER_SEC) +
-	    (tv->tv_nsec / NSEC_PER_USEC);
-}
-
 static int prepare_directio_read(int fd, int *blksz, char **pbuf,
 		int *restore_flags)
 {
@@ -199,22 +192,6 @@ out:
 	return 0;
 }
 
-double calc_standard_deviation(double *lg_path_latency, int size,
-				  double lg_avglatency)
-{
-	int index;
-	double sum = 0;
-
-	for (index = 0; index < size; index++) {
-		sum += (lg_path_latency[index] - lg_avglatency) *
-			(lg_path_latency[index] - lg_avglatency);
-	}
-
-	sum /= (size - 1);
-
-	return sqrt(sum);
-}
-
 /*
  * Do not scale the prioriy in a certain range such as [0, 1024]
  * because scaling will eliminate the effect of base_num.
@@ -234,20 +211,16 @@ int calcPrio(double lg_avglatency, double lg_maxavglatency,
 int getprio(struct path *pp, char *args, unsigned int timeout)
 {
 	int rc, temp;
-	int index = 0;
 	int io_num = 0;
 	double base_num = 0;
 	double lg_avglatency, lg_maxavglatency, lg_minavglatency;
 	double standard_deviation;
 	double lg_toldelay = 0;
-	long long before, after;
-	struct timespec tv;
 	int blksize;
 	char *buf;
 	int restore_flags = 0;
 	double lg_base;
-	long long sum_latency = 0;
-	long long arith_mean_lat;
+	double sum_squares = 0;
 
 	if (pp->fd < 0)
 		return -1;
@@ -260,7 +233,6 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
 				pp->dev, io_num, base_num);
 	}
 
-	memset(lg_path_latency, 0, sizeof(lg_path_latency));
 	lg_base = log(base_num);
 	lg_maxavglatency = log(MAX_AVG_LATENCY) / lg_base;
 	lg_minavglatency = log(MIN_AVG_LATENCY) / lg_base;
@@ -269,8 +241,10 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
 
 	temp = io_num;
 	while (temp-- > 0) {
-		(void)clock_gettime(CLOCK_MONOTONIC, &tv);
-		before = timeval_to_us(&tv);
+		struct timespec tv_before, tv_after, tv_diff;
+		double diff, reldiff;
+
+		(void)clock_gettime(CLOCK_MONOTONIC, &tv_before);
 
 		if (do_directio_read(pp->fd, timeout, buf, blksize)) {
 			pp_pl_log(0, "%s: path down", pp->dev);
@@ -278,25 +252,34 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
 			return -1;
 		}
 
-		(void)clock_gettime(CLOCK_MONOTONIC, &tv);
-		after = timeval_to_us(&tv);
+		(void)clock_gettime(CLOCK_MONOTONIC, &tv_after);
+
+		timespecsub(&tv_after, &tv_before, &tv_diff);
+		diff = tv_diff.tv_sec * 1000 * 1000 + tv_diff.tv_nsec / 1000;
+
+		if (diff == 0)
+			/*
+			 * Avoid taking log(0).
+			 * This unlikely case is treated as minimum -
+			 * the sums don't increase
+			 */
+			continue;
+
+		/* we scale by lg_base here */
+		reldiff = log(diff) / lg_base;
+
 		/*
 		 * We assume that the latency complies with Log-normal
 		 * distribution. The logarithm of latency is in normal
 		 * distribution.
 		 */
-		lg_path_latency[index] = log(after - before) / lg_base;
-		lg_toldelay += lg_path_latency[index++];
-		sum_latency += after - before;
+		lg_toldelay += reldiff;
+		sum_squares += reldiff * reldiff;
 	}
 
 	cleanup_directio_read(pp->fd, buf, restore_flags);
 
 	lg_avglatency = lg_toldelay / (long long)io_num;
-	arith_mean_lat = sum_latency / (long long)io_num;
-	pp_pl_log(4, "%s: arithmetic mean latency is (%lld us), geometric mean latency is (%lld us)",
-			pp->dev, arith_mean_lat,
-			(long long)pow(base_num, lg_avglatency));
 
 	if (lg_avglatency > lg_maxavglatency) {
 		pp_pl_log(2,
@@ -340,8 +323,14 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
 			  ".It is recommend to be set larger",
 			  pp->dev, base_num);
 
+	standard_deviation = sqrt((sum_squares - lg_toldelay * lg_avglatency)
+				  / (io_num - 1));
 
 	rc = calcPrio(lg_avglatency, lg_maxavglatency, lg_minavglatency);
 
+	pp_pl_log(3, "%s: latency avg=%.2e uncertainty=%.1f prio=%d\n",
+		  pp->dev, exp(lg_avglatency * lg_base),
+		  exp(standard_deviation * lg_base), rc);
+
 	return rc;
 }
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 20/20] libmultipath: path latency: remove warnings
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (18 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 19/20] libmultipath: path latency: simplify getprio() Martin Wilck
@ 2018-01-13 21:19 ` Martin Wilck
  2018-02-08 12:06   ` Guan Junxiong
  2018-01-15  8:05 ` [PATCH v2 00/20] Various multipath-tools fixes Hannes Reinecke
  2018-01-18 17:02 ` Benjamin Marzinski
  21 siblings, 1 reply; 33+ messages in thread
From: Martin Wilck @ 2018-01-13 21:19 UTC (permalink / raw)
  To: Christophe Varoqui, dm-devel, Xose Vazquez Perez; +Cc: Martin Wilck

The warnings at here are pointless. We are looking at a single
path only. Firstly, the standdard deviation for this measurement
can't be "too low" - the lower, the more precise the measurement,
the better. Secondly, a high standard deviation indicates an
unstable path with highly variable latency. Not good, but nothing
to warn about here.

What matters for the selection of "base_num" is not how a single
path behaves, but how different paths of the same path group relate
to each other, which we don't know at this point at the code.

What we want to avoid is too fine a differentiation, in particular
in combination with group_by_prio, because we'd loose the ability for
load balancing. But this is rather a topic for the man page or a
"best practices" document.
---
 libmultipath/prioritizers/path_latency.c | 34 --------------------------------
 1 file changed, 34 deletions(-)

diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
index e764f1dd8a21..765265c02333 100644
--- a/libmultipath/prioritizers/path_latency.c
+++ b/libmultipath/prioritizers/path_latency.c
@@ -289,40 +289,6 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
 		return DEFAULT_PRIORITY;
 	}
 
-	standard_deviation = calc_standard_deviation(lg_path_latency,
-			index, lg_avglatency);
-	/*
-	 * In calPrio(), we let prio y = f(x) = log(max, base) - log (x, base);
-	 * So if we want to let the priority of the latency outside 2 standard
-	 * deviations can be distinguished from the latency inside 2 standard
-	 * deviation, in others words at most 95% are the same and at least 5%
-	 * are different according interval estimation of normal distribution,
-	 * we should warn the user to set the base_num to be smaller if the
-	 * log(x_threshold, base) is small than 2 standard deviation.
-	 * x_threshold is derived from:
-	 * y + 1 = f(x) + 1 = f(x) + log(base, base), so x_threadshold =
-	 * base_num; Note that we only can compare the logarithm of x_threshold
-	 * with the standard deviation because the standard deviation is derived
-	 * from logarithm of latency.
-	 *
-	 * therefore , we recommend the base_num to meet the condition :
-	 * 1 <= 2 * standard_deviation
-	 */
-	pp_pl_log(5, "%s: standard deviation for logarithm of latency = %.6f",
-			pp->dev, standard_deviation);
-	if (standard_deviation <= 0.5)
-		pp_pl_log(3, "%s: the base_num(%.3lf) is too big to distinguish different priority "
-			  "of two far-away latency. It is recommend to be set smaller",
-			  pp->dev, base_num);
-	/*
-	 * If the standard deviation is too large , we should also warn the user
-	 */
-
-	if (standard_deviation > 4)
-		pp_pl_log(3, "%s: the base_num(%.3lf) is too small to avoid noise disturbance "
-			  ".It is recommend to be set larger",
-			  pp->dev, base_num);
-
 	standard_deviation = sqrt((sum_squares - lg_toldelay * lg_avglatency)
 				  / (io_num - 1));
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/20] Various multipath-tools fixes
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (19 preceding siblings ...)
  2018-01-13 21:19 ` [PATCH v2 20/20] libmultipath: path latency: remove warnings Martin Wilck
@ 2018-01-15  8:05 ` Hannes Reinecke
  2018-01-18 17:02 ` Benjamin Marzinski
  21 siblings, 0 replies; 33+ messages in thread
From: Hannes Reinecke @ 2018-01-15  8:05 UTC (permalink / raw)
  To: Martin Wilck, Christophe Varoqui, dm-devel, Xose Vazquez Perez

On 01/13/2018 10:19 PM, Martin Wilck wrote:
> Hi Christophe,
> 
> here is v2 of my last series, rebased onto your latest merges.
> 
> Changes wrt v1:
> 
>   - dropped already merged CLOCK_MONOTONIC patch
>   - added a compiler warning fix
>   - improved placement of NetApp entry in hwtable (Xose Vazquez Perez, 09/20)
>   - added two SUSE patches that were missing in the previous series
>     (15/20, 16/20; Xose Vazquez Perez)
>   - added the re-submission of my path latency prio fixes from November (17-20/20)
>   - fixed wrong calculation of standard deviation (Guan Junxiong, 19/20)
> 
> I added 16/20 at Xose's request, knowing that it may be a bit more
> controversial than the rest. At SUSE we've currently added it only for
> SLE-12 SP2, because SLES12-SP2 GA had DefaultTaskMax=512, which is much
> too low for multipathd. Current systemd versions have increased this to
> 4915, which would be sufficient for most but not all multipath
> deployments (https://github.com/systemd/systemd/issues/3211).
> 
> Regards,
> Martin
> 
Patches look good.
For this series:

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 00/20] Various multipath-tools fixes
  2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
                   ` (20 preceding siblings ...)
  2018-01-15  8:05 ` [PATCH v2 00/20] Various multipath-tools fixes Hannes Reinecke
@ 2018-01-18 17:02 ` Benjamin Marzinski
  21 siblings, 0 replies; 33+ messages in thread
From: Benjamin Marzinski @ 2018-01-18 17:02 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, Xose Vazquez Perez

On Sat, Jan 13, 2018 at 10:19:18PM +0100, Martin Wilck wrote:
> Hi Christophe,
> 
> here is v2 of my last series, rebased onto your latest merges.
> 
> Changes wrt v1:
> 
>   - dropped already merged CLOCK_MONOTONIC patch
>   - added a compiler warning fix
>   - improved placement of NetApp entry in hwtable (Xose Vazquez Perez, 09/20)
>   - added two SUSE patches that were missing in the previous series
>     (15/20, 16/20; Xose Vazquez Perez)
>   - added the re-submission of my path latency prio fixes from November (17-20/20)
>   - fixed wrong calculation of standard deviation (Guan Junxiong, 19/20)
> 
> I added 16/20 at Xose's request, knowing that it may be a bit more
> controversial than the rest. At SUSE we've currently added it only for
> SLE-12 SP2, because SLES12-SP2 GA had DefaultTaskMax=512, which is much
> too low for multipathd. Current systemd versions have increased this to
> 4915, which would be sufficient for most but not all multipath
> deployments (https://github.com/systemd/systemd/issues/3211).

These all look fine.

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
 
> Regards,
> Martin
> 
> Martin Wilck (20):
>   libmultipath: don't try to set hwhandler if it is retained
>   multipath: delegate dangerous commands to multipathd
>   libmultipath: condlog: log to stderr
>   libmultipath: fix return code of sysfs_get_timeout
>   libmultipath: fix return code of sgio_get_vpd()
>   libmultipath: sgio_get_vpd: add page argument
>   libmultipath: get_vpd_sgio: support VPD 0xc9
>   libmultipath: select ALUA prioritizer for RDAC arrays only
>   libmultipath: hwtable: multibus for NetApp NVMe-FC
>   multipath -C: decrease log level
>   kpartx.rules: fix by-id/scsi-* for user_friendly_names
>   multipathd.socket: add WantedBy=sockets.target
>   multipathd.service: drop Before=lvm2-lvmetad.service
>   multipathd: fix compiler warning for uev_pathfail_check
>   test-kpartx: add test for mapping without UUID
>   multipathd.service: set TasksMax=infinity
>   libmultipath: path latency: fix default base num
>   libmultipath: path latency: log threshold with p2
>   libmultipath: path latency: simplify getprio()
>   libmultipath: path latency: remove warnings
> 
>  Makefile.inc                             |   2 +-
>  kpartx/dm-parts.rules                    |   4 +-
>  kpartx/kpartx_id                         |   8 ++-
>  kpartx/test-kpartx                       |  11 ++-
>  libmultipath/configure.c                 |   8 ++-
>  libmultipath/debug.c                     |   4 +-
>  libmultipath/discovery.c                 |  16 +++--
>  libmultipath/hwtable.c                   |  37 +++++++---
>  libmultipath/prioritizers/path_latency.c | 112 ++++++++++---------------------
>  libmultipath/propsel.c                   |  37 +++++++++-
>  multipath/11-dm-mpath.rules              |   3 +-
>  multipath/main.c                         |  66 +++++++++++++++++-
>  multipathd/main.c                        |   4 +-
>  multipathd/multipathd.service            |   3 +-
>  multipathd/multipathd.socket             |   3 +
>  15 files changed, 210 insertions(+), 108 deletions(-)
> 
> -- 
> 2.15.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC"
  2018-01-13 21:19 ` [PATCH v2 09/20] libmultipath: hwtable: multibus for NetApp NVMe-FC Martin Wilck
@ 2018-01-19 11:52   ` Martin Wilck
  2018-01-19 14:44     ` Xose Vazquez Perez
  2018-01-23 17:07   ` [FIX PATCH v2 09/20] libmultipath: hwtable: no_path_retry="queue" for NetApp NVMe Martin Wilck
  1 sibling, 1 reply; 33+ messages in thread
From: Martin Wilck @ 2018-01-19 11:52 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: dm-devel, Xose Vazquez Perez

There was an extra parenthesis in the patch I submitted
previously.

Fixes: "libmultipath: hwtable: multibus for NetApp NVMe-FC"
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/hwtable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 4f3290123f09..918db7b0d019 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -687,7 +687,7 @@ static struct hwentry default_hw[] = {
 	 */
 	{
 		.vendor	       = "NVME",
-		.product       = "(NetApp |)ONTAP Controller)",
+		.product       = "(NetApp |)ONTAP Controller",
 		.uid_attribute = "ID_WWN",
 		.checker_name  = NONE,
 		.pgpolicy      = MULTIBUS,
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC"
  2018-01-19 11:52   ` [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC" Martin Wilck
@ 2018-01-19 14:44     ` Xose Vazquez Perez
  2018-01-19 15:12       ` Martin Wilck
  0 siblings, 1 reply; 33+ messages in thread
From: Xose Vazquez Perez @ 2018-01-19 14:44 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On 01/19/2018 12:52 PM, Martin Wilck wrote:
> There was an extra parenthesis in the patch I submitted
> previously.

>  		.vendor	       = "NVME",
> -		.product       = "(NetApp |)ONTAP Controller)",
> +		.product       = "(NetApp |)ONTAP Controller",

Out of curiosity, what are the complete product and vendor id strings?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC"
  2018-01-19 14:44     ` Xose Vazquez Perez
@ 2018-01-19 15:12       ` Martin Wilck
  2018-01-22 20:53         ` Xose Vazquez Perez
  0 siblings, 1 reply; 33+ messages in thread
From: Martin Wilck @ 2018-01-19 15:12 UTC (permalink / raw)
  To: Xose Vazquez Perez; +Cc: dm-devel

On Fri, 2018-01-19 at 15:44 +0100, Xose Vazquez Perez wrote:
> On 01/19/2018 12:52 PM, Martin Wilck wrote:
> > There was an extra parenthesis in the patch I submitted
> > previously.
> >  		.vendor	       = "NVME",
> > -		.product       = "(NetApp |)ONTAP Controller)",
> > +		.product       = "(NetApp |)ONTAP Controller",
> 
> Out of curiosity, what are the complete product and vendor id
> strings?
> 

All NVMe devices use "NVME" as vendor string.

The device in question uses "NetApp ONTAP Controller" for the product
(-> "model" in NVMe terms). IIUC there's older FW around that uses
"ONTAP Controller" only.

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC"
  2018-01-19 15:12       ` Martin Wilck
@ 2018-01-22 20:53         ` Xose Vazquez Perez
  2018-01-22 22:15           ` Martin Wilck
  0 siblings, 1 reply; 33+ messages in thread
From: Xose Vazquez Perez @ 2018-01-22 20:53 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On 01/19/2018 04:12 PM, Martin Wilck wrote:

>>> +		.product       = "(NetApp |)ONTAP Controller",

> The device in question uses "NetApp ONTAP Controller" for the product
> (-> "model" in NVMe terms). IIUC there's older FW around that uses
> "ONTAP Controller" only.

.product       = "ONTAP Controller" should be enough,
it's shared by both strings.

But if an exact match is desired, ^ should be added at the
beginning.


Regex can be tested with: https://eli.thegreenplace.net/2012/11/14/some-notes-on-posix-regular-expressions

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC"
  2018-01-22 20:53         ` Xose Vazquez Perez
@ 2018-01-22 22:15           ` Martin Wilck
  0 siblings, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-22 22:15 UTC (permalink / raw)
  To: Xose Vazquez Perez; +Cc: dm-devel

On Mon, 2018-01-22 at 21:53 +0100, Xose Vazquez Perez wrote:
> On 01/19/2018 04:12 PM, Martin Wilck wrote:
> 
> > > > +		.product       = "(NetApp |)ONTAP Controller",
> > The device in question uses "NetApp ONTAP Controller" for the
> > product
> > (-> "model" in NVMe terms). IIUC there's older FW around that uses
> > "ONTAP Controller" only.
> 
> .product       = "ONTAP Controller" should be enough,
> it's shared by both strings.
>
> But if an exact match is desired, ^ should be added at the
> beginning.

Right. Thanks. I'll add '^'. I thought these were anchored :-/

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [FIX PATCH v2 09/20] libmultipath: hwtable: no_path_retry="queue" for NetApp NVMe
  2018-01-13 21:19 ` [PATCH v2 09/20] libmultipath: hwtable: multibus for NetApp NVMe-FC Martin Wilck
  2018-01-19 11:52   ` [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC" Martin Wilck
@ 2018-01-23 17:07   ` Martin Wilck
  1 sibling, 0 replies; 33+ messages in thread
From: Martin Wilck @ 2018-01-23 17:07 UTC (permalink / raw)
  To: Christophe Varoqui, Xose Vazquez Perez, dm-devel; +Cc: Martin Wilck

Netapp requested this default setting for NetApp E-Series NVMe in addition to "multibus".
Also, finalize the product ID regex after consulting with NetApp (the FW reporting just
"ONTAP Controller" was beta only).

This obsoletes my previous "FIX" patch
'FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC'

Fixes: "libmultipath: hwtable: multibus for NetApp NVMe-FC"
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/hwtable.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
index 1cde60a6a41d..0915ea4dfa56 100644
--- a/libmultipath/hwtable.c
+++ b/libmultipath/hwtable.c
@@ -681,16 +681,17 @@ static struct hwentry default_hw[] = {
 		.no_path_retry = 24,
 	},
 	/*
-	 * NetApp NVMe-FC namespace devices: MULTIBUS preferred
+	 * NetApp NVMe-FC namespace devices: MULTIBUS, queueing preferred
 	 *
 	 * The table is searched backwards, so place this after generic NVMe
 	 */
 	{
 		.vendor	       = "NVME",
-		.product       = "(NetApp |)ONTAP Controller)",
+		.product       = "^NetApp ONTAP Controller",
 		.uid_attribute = "ID_WWN",
 		.checker_name  = NONE,
 		.pgpolicy      = MULTIBUS,
+		.no_path_retry = NO_PATH_RETRY_QUEUE,
 		.retain_hwhandler = RETAIN_HWHANDLER_OFF,
 	},
 	/*
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 17/20] libmultipath: path latency: fix default base num
  2018-01-13 21:19 ` [PATCH v2 17/20] libmultipath: path latency: fix default base num Martin Wilck
@ 2018-02-08 12:03   ` Guan Junxiong
  0 siblings, 0 replies; 33+ messages in thread
From: Guan Junxiong @ 2018-02-08 12:03 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, Xose Vazquez Perez, niuhaoxin, Shenhong (C)

Looks Good.

Reviewed-by Guan Junxiong <guanjunxiong@huawei.com>

On 2018/1/14 5:19, Martin Wilck wrote:
> I don't think anyone can measure latency to 1% accuracy. It's
> better to not even pretend to be able to. 10% should be fine
> even for the most latency-critical environments.
> ---
>  libmultipath/prioritizers/path_latency.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
> index 9d5397ec1b3a..b8c5bc7c50a4 100644
> --- a/libmultipath/prioritizers/path_latency.c
> +++ b/libmultipath/prioritizers/path_latency.c
> @@ -42,8 +42,9 @@
>  #define DEF_IO_NUM		100
>  
>  #define MAX_BASE_NUM		10
> -#define MIN_BASE_NUM		1.01
> -#define DEF_BASE_NUM		1.5
> +#define MIN_BASE_NUM		1.1
> +// This is 10**(1/4). 4 prio steps correspond to a factor of 10.
> +#define DEF_BASE_NUM		1.77827941004
>  
>  #define MAX_AVG_LATENCY		100000000.	/* Unit: us */
>  #define MIN_AVG_LATENCY		1.		/* Unit: us */
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 18/20] libmultipath: path latency: log threshold with p2
  2018-01-13 21:19 ` [PATCH v2 18/20] libmultipath: path latency: log threshold with p2 Martin Wilck
@ 2018-02-08 12:04   ` Guan Junxiong
  0 siblings, 0 replies; 33+ messages in thread
From: Guan Junxiong @ 2018-02-08 12:04 UTC (permalink / raw)
  To: Martin Wilck, Christophe Varoqui
  Cc: dm-devel, Xose Vazquez Perez, niuhaoxin, Shenhong (C)

Looks Good.

Reviewed-by: Guan Junxiong <guanjunxiong@huawei.com>

On 2018/1/14 5:19, Martin Wilck wrote:
> This is not a critical error. It just means that the path in
> question will have low priority (rightly so, if it has >100s latency).
> ---
>  libmultipath/prioritizers/path_latency.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
> index b8c5bc7c50a4..ce5c95dd7075 100644
> --- a/libmultipath/prioritizers/path_latency.c
> +++ b/libmultipath/prioritizers/path_latency.c
> @@ -299,7 +299,7 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
>  			(long long)pow(base_num, lg_avglatency));
>  
>  	if (lg_avglatency > lg_maxavglatency) {
> -		pp_pl_log(0,
> +		pp_pl_log(2,
>  			  "%s: average latency (%lld us) is outside the thresold (%lld us)",
>  			  pp->dev, (long long)pow(base_num, lg_avglatency),
>  			  (long long)MAX_AVG_LATENCY);
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 19/20] libmultipath: path latency: simplify getprio()
  2018-01-13 21:19 ` [PATCH v2 19/20] libmultipath: path latency: simplify getprio() Martin Wilck
@ 2018-02-08 12:05   ` Guan Junxiong
  0 siblings, 0 replies; 33+ messages in thread
From: Guan Junxiong @ 2018-02-08 12:05 UTC (permalink / raw)
  To: Martin Wilck, Christophe Varoqui, dm-devel
  Cc: Xose Vazquez Perez, niuhaoxin, Shenhong (C)

Looks Good.

Reviewed-by Guan Junxiong <guanjunxiong@huawei.com>

On 2018/1/14 5:19, Martin Wilck wrote:
> The log standard deviation can be calculated much more simply
> by realizing
> 
>    sum_n (x_i - avg(x))^2 == sum_n x_i^2 - n * avg(x)^2
> 
> Also, use timespecsub rather than the custom timeval_to_usec,
> and avoid taking log(0).
> ---
>  libmultipath/prioritizers/path_latency.c | 71 ++++++++++++++------------------
>  1 file changed, 30 insertions(+), 41 deletions(-)
> 
> diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
> index ce5c95dd7075..e764f1dd8a21 100644
> --- a/libmultipath/prioritizers/path_latency.c
> +++ b/libmultipath/prioritizers/path_latency.c
> @@ -34,6 +34,7 @@
>  #include "prio.h"
>  #include "structs.h"
>  #include "util.h"
> +#include "time-util.h"
>  
>  #define pp_pl_log(prio, fmt, args...) condlog(prio, "path_latency prio: " fmt, ##args)
>  
> @@ -56,14 +57,6 @@
>  
>  #define DEF_BLK_SIZE		4096
>  
> -static double lg_path_latency[MAX_IO_NUM];
> -
> -static inline long long timeval_to_us(const struct timespec *tv)
> -{
> -	return ((long long)tv->tv_sec * USEC_PER_SEC) +
> -	    (tv->tv_nsec / NSEC_PER_USEC);
> -}
> -
>  static int prepare_directio_read(int fd, int *blksz, char **pbuf,
>  		int *restore_flags)
>  {
> @@ -199,22 +192,6 @@ out:
>  	return 0;
>  }
>  
> -double calc_standard_deviation(double *lg_path_latency, int size,
> -				  double lg_avglatency)
> -{
> -	int index;
> -	double sum = 0;
> -
> -	for (index = 0; index < size; index++) {
> -		sum += (lg_path_latency[index] - lg_avglatency) *
> -			(lg_path_latency[index] - lg_avglatency);
> -	}
> -
> -	sum /= (size - 1);
> -
> -	return sqrt(sum);
> -}
> -
>  /*
>   * Do not scale the prioriy in a certain range such as [0, 1024]
>   * because scaling will eliminate the effect of base_num.
> @@ -234,20 +211,16 @@ int calcPrio(double lg_avglatency, double lg_maxavglatency,
>  int getprio(struct path *pp, char *args, unsigned int timeout)
>  {
>  	int rc, temp;
> -	int index = 0;
>  	int io_num = 0;
>  	double base_num = 0;
>  	double lg_avglatency, lg_maxavglatency, lg_minavglatency;
>  	double standard_deviation;
>  	double lg_toldelay = 0;
> -	long long before, after;
> -	struct timespec tv;
>  	int blksize;
>  	char *buf;
>  	int restore_flags = 0;
>  	double lg_base;
> -	long long sum_latency = 0;
> -	long long arith_mean_lat;
> +	double sum_squares = 0;
>  
>  	if (pp->fd < 0)
>  		return -1;
> @@ -260,7 +233,6 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
>  				pp->dev, io_num, base_num);
>  	}
>  
> -	memset(lg_path_latency, 0, sizeof(lg_path_latency));
>  	lg_base = log(base_num);
>  	lg_maxavglatency = log(MAX_AVG_LATENCY) / lg_base;
>  	lg_minavglatency = log(MIN_AVG_LATENCY) / lg_base;
> @@ -269,8 +241,10 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
>  
>  	temp = io_num;
>  	while (temp-- > 0) {
> -		(void)clock_gettime(CLOCK_MONOTONIC, &tv);
> -		before = timeval_to_us(&tv);
> +		struct timespec tv_before, tv_after, tv_diff;
> +		double diff, reldiff;
> +
> +		(void)clock_gettime(CLOCK_MONOTONIC, &tv_before);
>  
>  		if (do_directio_read(pp->fd, timeout, buf, blksize)) {
>  			pp_pl_log(0, "%s: path down", pp->dev);
> @@ -278,25 +252,34 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
>  			return -1;
>  		}
>  
> -		(void)clock_gettime(CLOCK_MONOTONIC, &tv);
> -		after = timeval_to_us(&tv);
> +		(void)clock_gettime(CLOCK_MONOTONIC, &tv_after);
> +
> +		timespecsub(&tv_after, &tv_before, &tv_diff);
> +		diff = tv_diff.tv_sec * 1000 * 1000 + tv_diff.tv_nsec / 1000;
> +
> +		if (diff == 0)
> +			/*
> +			 * Avoid taking log(0).
> +			 * This unlikely case is treated as minimum -
> +			 * the sums don't increase
> +			 */
> +			continue;
> +
> +		/* we scale by lg_base here */
> +		reldiff = log(diff) / lg_base;
> +
>  		/*
>  		 * We assume that the latency complies with Log-normal
>  		 * distribution. The logarithm of latency is in normal
>  		 * distribution.
>  		 */
> -		lg_path_latency[index] = log(after - before) / lg_base;
> -		lg_toldelay += lg_path_latency[index++];
> -		sum_latency += after - before;
> +		lg_toldelay += reldiff;
> +		sum_squares += reldiff * reldiff;
>  	}
>  
>  	cleanup_directio_read(pp->fd, buf, restore_flags);
>  
>  	lg_avglatency = lg_toldelay / (long long)io_num;
> -	arith_mean_lat = sum_latency / (long long)io_num;
> -	pp_pl_log(4, "%s: arithmetic mean latency is (%lld us), geometric mean latency is (%lld us)",
> -			pp->dev, arith_mean_lat,
> -			(long long)pow(base_num, lg_avglatency));
>  
>  	if (lg_avglatency > lg_maxavglatency) {
>  		pp_pl_log(2,
> @@ -340,8 +323,14 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
>  			  ".It is recommend to be set larger",
>  			  pp->dev, base_num);
>  
> +	standard_deviation = sqrt((sum_squares - lg_toldelay * lg_avglatency)
> +				  / (io_num - 1));
>  
>  	rc = calcPrio(lg_avglatency, lg_maxavglatency, lg_minavglatency);
>  
> +	pp_pl_log(3, "%s: latency avg=%.2e uncertainty=%.1f prio=%d\n",
> +		  pp->dev, exp(lg_avglatency * lg_base),
> +		  exp(standard_deviation * lg_base), rc);
> +
>  	return rc;
>  }
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 20/20] libmultipath: path latency: remove warnings
  2018-01-13 21:19 ` [PATCH v2 20/20] libmultipath: path latency: remove warnings Martin Wilck
@ 2018-02-08 12:06   ` Guan Junxiong
  0 siblings, 0 replies; 33+ messages in thread
From: Guan Junxiong @ 2018-02-08 12:06 UTC (permalink / raw)
  To: Martin Wilck, Christophe Varoqui, dm-devel
  Cc: Xose Vazquez Perez, niuhaoxin, Shenhong (C)

Looks Good. Thanks for your improvement.

Reviewed-by Guan Junxiong <guanjunxiong@huawei.com>


On 2018/1/14 5:19, Martin Wilck wrote:
> The warnings at here are pointless. We are looking at a single
> path only. Firstly, the standdard deviation for this measurement
> can't be "too low" - the lower, the more precise the measurement,
> the better. Secondly, a high standard deviation indicates an
> unstable path with highly variable latency. Not good, but nothing
> to warn about here.
> 
> What matters for the selection of "base_num" is not how a single
> path behaves, but how different paths of the same path group relate
> to each other, which we don't know at this point at the code.
> 
> What we want to avoid is too fine a differentiation, in particular
> in combination with group_by_prio, because we'd loose the ability for
> load balancing. But this is rather a topic for the man page or a
> "best practices" document.
> ---
>  libmultipath/prioritizers/path_latency.c | 34 --------------------------------
>  1 file changed, 34 deletions(-)
> 
> diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c
> index e764f1dd8a21..765265c02333 100644
> --- a/libmultipath/prioritizers/path_latency.c
> +++ b/libmultipath/prioritizers/path_latency.c
> @@ -289,40 +289,6 @@ int getprio(struct path *pp, char *args, unsigned int timeout)
>  		return DEFAULT_PRIORITY;
>  	}
>  
> -	standard_deviation = calc_standard_deviation(lg_path_latency,
> -			index, lg_avglatency);
> -	/*
> -	 * In calPrio(), we let prio y = f(x) = log(max, base) - log (x, base);
> -	 * So if we want to let the priority of the latency outside 2 standard
> -	 * deviations can be distinguished from the latency inside 2 standard
> -	 * deviation, in others words at most 95% are the same and at least 5%
> -	 * are different according interval estimation of normal distribution,
> -	 * we should warn the user to set the base_num to be smaller if the
> -	 * log(x_threshold, base) is small than 2 standard deviation.
> -	 * x_threshold is derived from:
> -	 * y + 1 = f(x) + 1 = f(x) + log(base, base), so x_threadshold =
> -	 * base_num; Note that we only can compare the logarithm of x_threshold
> -	 * with the standard deviation because the standard deviation is derived
> -	 * from logarithm of latency.
> -	 *
> -	 * therefore , we recommend the base_num to meet the condition :
> -	 * 1 <= 2 * standard_deviation
> -	 */
> -	pp_pl_log(5, "%s: standard deviation for logarithm of latency = %.6f",
> -			pp->dev, standard_deviation);
> -	if (standard_deviation <= 0.5)
> -		pp_pl_log(3, "%s: the base_num(%.3lf) is too big to distinguish different priority "
> -			  "of two far-away latency. It is recommend to be set smaller",
> -			  pp->dev, base_num);
> -	/*
> -	 * If the standard deviation is too large , we should also warn the user
> -	 */
> -
> -	if (standard_deviation > 4)
> -		pp_pl_log(3, "%s: the base_num(%.3lf) is too small to avoid noise disturbance "
> -			  ".It is recommend to be set larger",
> -			  pp->dev, base_num);
> -
>  	standard_deviation = sqrt((sum_squares - lg_toldelay * lg_avglatency)
>  				  / (io_num - 1));
>  
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2018-02-08 12:06 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-13 21:19 [PATCH v2 00/20] Various multipath-tools fixes Martin Wilck
2018-01-13 21:19 ` [PATCH v2 01/20] libmultipath: don't try to set hwhandler if it is retained Martin Wilck
2018-01-13 21:19 ` [PATCH v2 02/20] multipath: delegate dangerous commands to multipathd Martin Wilck
2018-01-13 21:19 ` [PATCH v2 03/20] libmultipath: condlog: log to stderr Martin Wilck
2018-01-13 21:19 ` [PATCH v2 04/20] libmultipath: fix return code of sysfs_get_timeout Martin Wilck
2018-01-13 21:19 ` [PATCH v2 05/20] libmultipath: fix return code of sgio_get_vpd() Martin Wilck
2018-01-13 21:19 ` [PATCH v2 06/20] libmultipath: sgio_get_vpd: add page argument Martin Wilck
2018-01-13 21:19 ` [PATCH v2 07/20] libmultipath: get_vpd_sgio: support VPD 0xc9 Martin Wilck
2018-01-13 21:19 ` [PATCH v2 08/20] libmultipath: select ALUA prioritizer for RDAC arrays only Martin Wilck
2018-01-13 21:19 ` [PATCH v2 09/20] libmultipath: hwtable: multibus for NetApp NVMe-FC Martin Wilck
2018-01-19 11:52   ` [PATCH] FIX "libmultipath: hwtable: multibus for NetApp NVMe-FC" Martin Wilck
2018-01-19 14:44     ` Xose Vazquez Perez
2018-01-19 15:12       ` Martin Wilck
2018-01-22 20:53         ` Xose Vazquez Perez
2018-01-22 22:15           ` Martin Wilck
2018-01-23 17:07   ` [FIX PATCH v2 09/20] libmultipath: hwtable: no_path_retry="queue" for NetApp NVMe Martin Wilck
2018-01-13 21:19 ` [PATCH v2 10/20] multipath -C: decrease log level Martin Wilck
2018-01-13 21:19 ` [PATCH v2 11/20] kpartx.rules: fix by-id/scsi-* for user_friendly_names Martin Wilck
2018-01-13 21:19 ` [PATCH v2 12/20] multipathd.socket: add WantedBy=sockets.target Martin Wilck
2018-01-13 21:19 ` [PATCH v2 13/20] multipathd.service: drop Before=lvm2-lvmetad.service Martin Wilck
2018-01-13 21:19 ` [PATCH v2 14/20] multipathd: fix compiler warning for uev_pathfail_check Martin Wilck
2018-01-13 21:19 ` [PATCH v2 15/20] test-kpartx: add test for mapping without UUID Martin Wilck
2018-01-13 21:19 ` [PATCH v2 16/20] multipathd.service: set TasksMax=infinity Martin Wilck
2018-01-13 21:19 ` [PATCH v2 17/20] libmultipath: path latency: fix default base num Martin Wilck
2018-02-08 12:03   ` Guan Junxiong
2018-01-13 21:19 ` [PATCH v2 18/20] libmultipath: path latency: log threshold with p2 Martin Wilck
2018-02-08 12:04   ` Guan Junxiong
2018-01-13 21:19 ` [PATCH v2 19/20] libmultipath: path latency: simplify getprio() Martin Wilck
2018-02-08 12:05   ` Guan Junxiong
2018-01-13 21:19 ` [PATCH v2 20/20] libmultipath: path latency: remove warnings Martin Wilck
2018-02-08 12:06   ` Guan Junxiong
2018-01-15  8:05 ` [PATCH v2 00/20] Various multipath-tools fixes Hannes Reinecke
2018-01-18 17:02 ` Benjamin Marzinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.