dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
* [dm-devel] [PATCH 0/5] Misc Multipath patches
@ 2020-10-23 21:15 Benjamin Marzinski
  2020-10-23 21:15 ` [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h Benjamin Marzinski
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Benjamin Marzinski @ 2020-10-23 21:15 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This is a set of unrelated patches, based on top of my previous "add
library to check if device is a valid path" patchset. The first two
patches add a new config option, eh_deadline, that sets the scsi sysfs
value of the same name for scsi path devices. This has been requested by
multiple customers. Patch 0004 is a change to detecting rdac support
requested by Netapp.

Patch 0005 fixes a rare segfault during shutdown, that happens when the
tur DSO is unloaded while the tur thread is still running. The issue is
that unless we make the tur_checker thread joinable, there is no way to
be sure that it isn't still running when the DSO is unloaded. I tried to
fix this in a way that allowed the DSO to get cleaned up. But without
redoing the tur_thread, that was impossible. Even if the cleanup
happened when the tur_thread was running a cleanup handler function from
libmultipath (not the DSO), while in pthread_exit(), which doesn't
return back to the calling fuction, it would segfault.

I realize that there has just been a lot of work done to make sure that
multipathd is cleaning up before exitting that this code is going
against that, but I'm not sure that the cost of redoing the tur_thread
is worth the benefit of being able to unload the DSO. If people feel
strongly that we should always unload the DSO, I can redo this and make
the tur_thread joinable with pthread_tryjoin_np(), and add code to the
checkerloop or uxlsnrloop to join with orphaned tur_threads.  That
should work.

Benjamin Marzinski (5):
  libmultipath: move fast_io_fail defines to structs.h
  libmultipath: add eh_deadline multipath.conf parameter
  multipathd: remove redundant vector_free() int configure
  libmultipath: only read 0xc9 vpd page for devices with rdac checker
  libmultipath: don't dlclose tur checker DSO

 libmultipath/checkers.c     | 10 +++++-
 libmultipath/checkers.h     |  1 +
 libmultipath/checkers/tur.c |  1 +
 libmultipath/config.c       |  2 ++
 libmultipath/config.h       | 10 ++----
 libmultipath/configure.c    |  1 +
 libmultipath/dict.c         | 40 +++++++++++++--------
 libmultipath/dict.h         |  2 +-
 libmultipath/discovery.c    | 70 +++++++++++++++++++++++++++++--------
 libmultipath/propsel.c      | 21 +++++++++--
 libmultipath/propsel.h      |  1 +
 libmultipath/structs.h      | 24 +++++++++++++
 multipath/multipath.conf.5  | 16 +++++++++
 multipathd/main.c           |  8 ++---
 14 files changed, 159 insertions(+), 48 deletions(-)

-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h
  2020-10-23 21:15 [dm-devel] [PATCH 0/5] Misc Multipath patches Benjamin Marzinski
@ 2020-10-23 21:15 ` Benjamin Marzinski
  2020-10-30 20:43   ` Martin Wilck
  2020-10-23 21:15 ` [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter Benjamin Marzinski
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-10-23 21:15 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

Since fast_io_fail is part of the multipath struct, its symbolic values
belong in structs.h. Also, make it an instance of a general enum, which
will be used again in future patches, and change the set/print functions
which use it to use the general enum instead.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.h  |  8 --------
 libmultipath/dict.c    | 30 +++++++++++++++---------------
 libmultipath/dict.h    |  2 +-
 libmultipath/propsel.c |  2 +-
 libmultipath/structs.h | 17 +++++++++++++++++
 5 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index 5d460359..661dd586 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -10,14 +10,6 @@
 #define ORIGIN_DEFAULT 0
 #define ORIGIN_CONFIG  1
 
-/*
- * In kernel, fast_io_fail == 0 means immediate failure on rport delete.
- * OTOH '0' means not-configured in various places in multipath-tools.
- */
-#define MP_FAST_IO_FAIL_UNSET (0)
-#define MP_FAST_IO_FAIL_OFF (-1)
-#define MP_FAST_IO_FAIL_ZERO (-2)
-
 enum devtypes {
 	DEV_NONE,
 	DEV_DEVT,
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index f12c2e5c..f4357da1 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -822,7 +822,7 @@ declare_mp_attr_handler(gid, set_gid)
 declare_mp_attr_snprint(gid, print_gid)
 
 static int
-set_fast_io_fail(vector strvec, void *ptr)
+set_undef_off_zero(vector strvec, void *ptr)
 {
 	char * buff;
 	int *int_ptr = (int *)ptr;
@@ -832,36 +832,36 @@ set_fast_io_fail(vector strvec, void *ptr)
 		return 1;
 
 	if (strcmp(buff, "off") == 0)
-		*int_ptr = MP_FAST_IO_FAIL_OFF;
+		*int_ptr = UOZ_OFF;
 	else if (sscanf(buff, "%d", int_ptr) != 1 ||
-		 *int_ptr < MP_FAST_IO_FAIL_ZERO)
-		*int_ptr = MP_FAST_IO_FAIL_UNSET;
+		 *int_ptr < UOZ_ZERO)
+		*int_ptr = UOZ_UNDEF;
 	else if (*int_ptr == 0)
-		*int_ptr = MP_FAST_IO_FAIL_ZERO;
+		*int_ptr = UOZ_ZERO;
 
 	FREE(buff);
 	return 0;
 }
 
 int
-print_fast_io_fail(char * buff, int len, long v)
+print_undef_off_zero(char * buff, int len, long v)
 {
-	if (v == MP_FAST_IO_FAIL_UNSET)
+	if (v == UOZ_UNDEF)
 		return 0;
-	if (v == MP_FAST_IO_FAIL_OFF)
+	if (v == UOZ_OFF)
 		return snprintf(buff, len, "\"off\"");
-	if (v == MP_FAST_IO_FAIL_ZERO)
+	if (v == UOZ_ZERO)
 		return snprintf(buff, len, "0");
 	return snprintf(buff, len, "%ld", v);
 }
 
-declare_def_handler(fast_io_fail, set_fast_io_fail)
-declare_def_snprint_defint(fast_io_fail, print_fast_io_fail,
+declare_def_handler(fast_io_fail, set_undef_off_zero)
+declare_def_snprint_defint(fast_io_fail, print_undef_off_zero,
 			   DEFAULT_FAST_IO_FAIL)
-declare_ovr_handler(fast_io_fail, set_fast_io_fail)
-declare_ovr_snprint(fast_io_fail, print_fast_io_fail)
-declare_hw_handler(fast_io_fail, set_fast_io_fail)
-declare_hw_snprint(fast_io_fail, print_fast_io_fail)
+declare_ovr_handler(fast_io_fail, set_undef_off_zero)
+declare_ovr_snprint(fast_io_fail, print_undef_off_zero)
+declare_hw_handler(fast_io_fail, set_undef_off_zero)
+declare_hw_snprint(fast_io_fail, print_undef_off_zero)
 
 static int
 set_dev_loss(vector strvec, void *ptr)
diff --git a/libmultipath/dict.h b/libmultipath/dict.h
index a40ac66f..a917e1ca 100644
--- a/libmultipath/dict.h
+++ b/libmultipath/dict.h
@@ -13,7 +13,7 @@ int print_rr_weight(char *buff, int len, long v);
 int print_pgfailback(char *buff, int len, long v);
 int print_pgpolicy(char *buff, int len, long v);
 int print_no_path_retry(char *buff, int len, long v);
-int print_fast_io_fail(char *buff, int len, long v);
+int print_undef_off_zero(char *buff, int len, long v);
 int print_dev_loss(char *buff, int len, unsigned long v);
 int print_reservation_key(char * buff, int len, struct be64 key, uint8_t
 			  flags, int source);
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 3f2c2cfa..67d025cf 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -754,7 +754,7 @@ int select_fast_io_fail(struct config *conf, struct multipath *mp)
 	mp_set_conf(fast_io_fail);
 	mp_set_default(fast_io_fail, DEFAULT_FAST_IO_FAIL);
 out:
-	print_fast_io_fail(buff, 12, mp->fast_io_fail);
+	print_undef_off_zero(buff, 12, mp->fast_io_fail);
 	condlog(3, "%s: fast_io_fail_tmo = %s %s", mp->alias, buff, origin);
 	return 0;
 }
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index 4ce30551..cfa7b649 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -219,6 +219,23 @@ enum vpd_vendor_ids {
 	VPD_VP_ARRAY_SIZE, /* This must remain the last entry */
 };
 
+/*
+ * Multipath treats 0 as undefined for optional config parameters.
+ * Use this for cases where 0 is a valid option for systems multipath
+ * is communicating with
+ */
+enum undefined_off_zero {
+	UOZ_UNDEF = 0,
+	UOZ_OFF = -1,
+	UOZ_ZERO = -2,
+};
+
+enum fast_io_fail_states {
+	MP_FAST_IO_FAIL_UNSET = UOZ_UNDEF,
+	MP_FAST_IO_FAIL_OFF = UOZ_OFF,
+	MP_FAST_IO_FAIL_ZERO = UOZ_ZERO,
+};
+
 struct vpd_vendor_page {
 	int pg;
 	const char *name;
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter
  2020-10-23 21:15 [dm-devel] [PATCH 0/5] Misc Multipath patches Benjamin Marzinski
  2020-10-23 21:15 ` [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h Benjamin Marzinski
@ 2020-10-23 21:15 ` Benjamin Marzinski
  2020-10-30 21:00   ` Martin Wilck
  2020-10-23 21:15 ` [dm-devel] [PATCH 3/5] multipathd: remove redundant vector_free() int configure Benjamin Marzinski
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-10-23 21:15 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

There are times a fc rport is never lost, meaning that fast_io_fail_tmo
and dev_loss_tmo never trigger, but scsi commands still hang. This can
cause problems in cases where users have string timing requirements, and
the easiest way to solve these issues is to set eh_deadline. Since it's
already possible to set fast_io_fail_tmo and dev_loss_tmo from
multipath.conf, and have multipath take care of setting it correctly for
the scsi devices in sysfs, it makes sense to allow users to set
eh_deadline here as well.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/config.c      |  2 ++
 libmultipath/config.h      |  2 ++
 libmultipath/configure.c   |  1 +
 libmultipath/dict.c        | 10 ++++++
 libmultipath/discovery.c   | 70 ++++++++++++++++++++++++++++++--------
 libmultipath/propsel.c     | 17 +++++++++
 libmultipath/propsel.h     |  1 +
 libmultipath/structs.h     |  7 ++++
 multipath/multipath.conf.5 | 16 +++++++++
 9 files changed, 111 insertions(+), 15 deletions(-)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index 49e7fb81..9f3cb38d 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -424,6 +424,7 @@ merge_hwe (struct hwentry * dst, struct hwentry * src)
 	merge_num(flush_on_last_del);
 	merge_num(fast_io_fail);
 	merge_num(dev_loss);
+	merge_num(eh_deadline);
 	merge_num(user_friendly_names);
 	merge_num(retain_hwhandler);
 	merge_num(detect_prio);
@@ -579,6 +580,7 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
 	hwe->flush_on_last_del = dhwe->flush_on_last_del;
 	hwe->fast_io_fail = dhwe->fast_io_fail;
 	hwe->dev_loss = dhwe->dev_loss;
+	hwe->eh_deadline = dhwe->eh_deadline;
 	hwe->user_friendly_names = dhwe->user_friendly_names;
 	hwe->retain_hwhandler = dhwe->retain_hwhandler;
 	hwe->detect_prio = dhwe->detect_prio;
diff --git a/libmultipath/config.h b/libmultipath/config.h
index 661dd586..9ce37f16 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -63,6 +63,7 @@ struct hwentry {
 	int flush_on_last_del;
 	int fast_io_fail;
 	unsigned int dev_loss;
+	int eh_deadline;
 	int user_friendly_names;
 	int retain_hwhandler;
 	int detect_prio;
@@ -148,6 +149,7 @@ struct config {
 	int attribute_flags;
 	int fast_io_fail;
 	unsigned int dev_loss;
+	int eh_deadline;
 	int log_checker_err;
 	int allow_queueing;
 	int allow_usb_devices;
diff --git a/libmultipath/configure.c b/libmultipath/configure.c
index 1c8aac08..a878d870 100644
--- a/libmultipath/configure.c
+++ b/libmultipath/configure.c
@@ -368,6 +368,7 @@ int setup_map(struct multipath *mpp, char *params, int params_size,
 	select_gid(conf, mpp);
 	select_fast_io_fail(conf, mpp);
 	select_dev_loss(conf, mpp);
+	select_eh_deadline(conf, mpp);
 	select_reservation_key(conf, mpp);
 	select_deferred_remove(conf, mpp);
 	select_marginal_path_err_sample_time(conf, mpp);
diff --git a/libmultipath/dict.c b/libmultipath/dict.c
index f4357da1..bab96146 100644
--- a/libmultipath/dict.c
+++ b/libmultipath/dict.c
@@ -899,6 +899,13 @@ declare_ovr_snprint(dev_loss, print_dev_loss)
 declare_hw_handler(dev_loss, set_dev_loss)
 declare_hw_snprint(dev_loss, print_dev_loss)
 
+declare_def_handler(eh_deadline, set_undef_off_zero)
+declare_def_snprint(eh_deadline, print_undef_off_zero)
+declare_ovr_handler(eh_deadline, set_undef_off_zero)
+declare_ovr_snprint(eh_deadline, print_undef_off_zero)
+declare_hw_handler(eh_deadline, set_undef_off_zero)
+declare_hw_snprint(eh_deadline, print_undef_off_zero)
+
 static int
 set_pgpolicy(vector strvec, void *ptr)
 {
@@ -1771,6 +1778,7 @@ init_keywords(vector keywords)
 	install_keyword("gid", &def_gid_handler, &snprint_def_gid);
 	install_keyword("fast_io_fail_tmo", &def_fast_io_fail_handler, &snprint_def_fast_io_fail);
 	install_keyword("dev_loss_tmo", &def_dev_loss_handler, &snprint_def_dev_loss);
+	install_keyword("eh_deadline", &def_eh_deadline_handler, &snprint_def_eh_deadline);
 	install_keyword("bindings_file", &def_bindings_file_handler, &snprint_def_bindings_file);
 	install_keyword("wwids_file", &def_wwids_file_handler, &snprint_def_wwids_file);
 	install_keyword("prkeys_file", &def_prkeys_file_handler, &snprint_def_prkeys_file);
@@ -1880,6 +1888,7 @@ init_keywords(vector keywords)
 	install_keyword("flush_on_last_del", &hw_flush_on_last_del_handler, &snprint_hw_flush_on_last_del);
 	install_keyword("fast_io_fail_tmo", &hw_fast_io_fail_handler, &snprint_hw_fast_io_fail);
 	install_keyword("dev_loss_tmo", &hw_dev_loss_handler, &snprint_hw_dev_loss);
+	install_keyword("eh_deadline", &hw_eh_deadline_handler, &snprint_hw_eh_deadline);
 	install_keyword("user_friendly_names", &hw_user_friendly_names_handler, &snprint_hw_user_friendly_names);
 	install_keyword("retain_attached_hw_handler", &hw_retain_hwhandler_handler, &snprint_hw_retain_hwhandler);
 	install_keyword("detect_prio", &hw_detect_prio_handler, &snprint_hw_detect_prio);
@@ -1920,6 +1929,7 @@ init_keywords(vector keywords)
 	install_keyword("flush_on_last_del", &ovr_flush_on_last_del_handler, &snprint_ovr_flush_on_last_del);
 	install_keyword("fast_io_fail_tmo", &ovr_fast_io_fail_handler, &snprint_ovr_fast_io_fail);
 	install_keyword("dev_loss_tmo", &ovr_dev_loss_handler, &snprint_ovr_dev_loss);
+	install_keyword("eh_deadline", &ovr_eh_deadline_handler, &snprint_ovr_eh_deadline);
 	install_keyword("user_friendly_names", &ovr_user_friendly_names_handler, &snprint_ovr_user_friendly_names);
 	install_keyword("retain_attached_hw_handler", &ovr_retain_hwhandler_handler, &snprint_ovr_retain_hwhandler);
 	install_keyword("detect_prio", &ovr_detect_prio_handler, &snprint_ovr_detect_prio);
diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
index 950b1586..ef9a9a23 100644
--- a/libmultipath/discovery.c
+++ b/libmultipath/discovery.c
@@ -585,6 +585,42 @@ sysfs_get_asymmetric_access_state(struct path *pp, char *buff, int buflen)
 	return !!preferred;
 }
 
+static int
+sysfs_set_eh_deadline(struct multipath *mpp, struct path *pp)
+{
+	struct udev_device *hostdev;
+	char host_name[HOST_NAME_LEN], value[16];
+	int ret;
+
+	if (mpp->eh_deadline == EH_DEADLINE_UNSET)
+		return 0;
+
+	sprintf(host_name, "host%d", pp->sg_id.host_no);
+	hostdev = udev_device_new_from_subsystem_sysname(udev,
+			"scsi_host", host_name);
+	if (!hostdev)
+		return 1;
+
+	if (mpp->eh_deadline == EH_DEADLINE_OFF)
+		sprintf(value, "off");
+	else if (mpp->eh_deadline == EH_DEADLINE_ZERO)
+		sprintf(value, "0");
+	else
+		snprintf(value, 16, "%u", mpp->eh_deadline);
+
+	ret = sysfs_attr_set_value(hostdev, "eh_deadline",
+				   value, strlen(value));
+	/*
+	 * not all scsi drivers support setting eh_deadline, so failing
+	 * is totally reasonable
+	 */
+	if (ret <= 0)
+		condlog(4, "%s: failed to set eh_deadline to %s, error %d", udev_device_get_sysname(hostdev), value, -ret);
+
+	udev_device_unref(hostdev);
+	return (ret <= 0);
+}
+
 static void
 sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
 {
@@ -799,7 +835,8 @@ sysfs_set_scsi_tmo (struct multipath *mpp, unsigned int checkint)
 		mpp->fast_io_fail = MP_FAST_IO_FAIL_OFF;
 	}
 	if (mpp->dev_loss == DEV_LOSS_TMO_UNSET &&
-	    mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET)
+	    mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET &&
+	    mpp->eh_deadline == EH_DEADLINE_UNSET)
 		return 0;
 
 	vector_foreach_slot(mpp->paths, pp, i) {
@@ -808,21 +845,24 @@ sysfs_set_scsi_tmo (struct multipath *mpp, unsigned int checkint)
 				err_path = pp;
 			continue;
 		}
-
-		switch (pp->sg_id.proto_id) {
-		case SCSI_PROTOCOL_FCP:
-			sysfs_set_rport_tmo(mpp, pp);
-			continue;
-		case SCSI_PROTOCOL_ISCSI:
-			sysfs_set_session_tmo(mpp, pp);
-			continue;
-		case SCSI_PROTOCOL_SAS:
-			sysfs_set_nexus_loss_tmo(mpp, pp);
-			continue;
-		default:
-			if (!err_path)
-				err_path = pp;
+		if (mpp->dev_loss != DEV_LOSS_TMO_UNSET ||
+		    mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET) {
+			switch (pp->sg_id.proto_id) {
+			case SCSI_PROTOCOL_FCP:
+				sysfs_set_rport_tmo(mpp, pp);
+				break;
+			case SCSI_PROTOCOL_ISCSI:
+				sysfs_set_session_tmo(mpp, pp);
+				break;
+			case SCSI_PROTOCOL_SAS:
+				sysfs_set_nexus_loss_tmo(mpp, pp);
+				break;
+			default:
+				if (!err_path)
+					err_path = pp;
+			}
 		}
+		sysfs_set_eh_deadline(mpp, pp);
 	}
 
 	if (err_path) {
diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index 67d025cf..fa4ac5d9 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -775,6 +775,23 @@ out:
 	return 0;
 }
 
+int select_eh_deadline(struct config *conf, struct multipath *mp)
+{
+	const char *origin;
+	char buff[12];
+
+	mp_set_ovr(eh_deadline);
+	mp_set_hwe(eh_deadline);
+	mp_set_conf(eh_deadline);
+	mp->eh_deadline = EH_DEADLINE_UNSET;
+	/* not changing sysfs in default cause, so don't print anything */
+	return 0;
+out:
+	print_undef_off_zero(buff, 12, mp->eh_deadline);
+	condlog(3, "%s: eh_deadline = %s %s", mp->alias, buff, origin);
+	return 0;
+}
+
 int select_flush_on_last_del(struct config *conf, struct multipath *mp)
 {
 	const char *origin;
diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
index 3d6edd8a..a68bacf0 100644
--- a/libmultipath/propsel.h
+++ b/libmultipath/propsel.h
@@ -17,6 +17,7 @@ int select_uid(struct config *conf, struct multipath *mp);
 int select_gid(struct config *conf, struct multipath *mp);
 int select_fast_io_fail(struct config *conf, struct multipath *mp);
 int select_dev_loss(struct config *conf, struct multipath *mp);
+int select_eh_deadline(struct config *conf, struct multipath *mp);
 int select_reservation_key(struct config *conf, struct multipath *mp);
 int select_retain_hwhandler (struct config *conf, struct multipath * mp);
 int select_detect_prio(struct config *conf, struct path * pp);
diff --git a/libmultipath/structs.h b/libmultipath/structs.h
index cfa7b649..d6ff6762 100644
--- a/libmultipath/structs.h
+++ b/libmultipath/structs.h
@@ -236,6 +236,12 @@ enum fast_io_fail_states {
 	MP_FAST_IO_FAIL_ZERO = UOZ_ZERO,
 };
 
+enum eh_deadline_states {
+	EH_DEADLINE_UNSET = UOZ_UNDEF,
+	EH_DEADLINE_OFF = UOZ_OFF,
+	EH_DEADLINE_ZERO = UOZ_ZERO,
+};
+
 struct vpd_vendor_page {
 	int pg;
 	const char *name;
@@ -356,6 +362,7 @@ struct multipath {
 	int ghost_delay;
 	int ghost_delay_tick;
 	unsigned int dev_loss;
+	int eh_deadline;
 	uid_t uid;
 	gid_t gid;
 	mode_t mode;
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index d2101ed6..cf05c1ab 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -717,6 +717,22 @@ The default is: \fB600\fR
 .
 .
 .TP
+.B eh_deadline
+Specify the maximum number of seconds the SCSI layer will spend doing error
+handling when scsi devices fail. After this timeout the scsi layer will perform
+a full HBA reset. Setting this may be necessary in cases where the rport is
+never lost, so \fIfast_io_fail_tmo\fR and \fIdev_loss_tmo\fR will never
+trigger, but (frequently do to load) scsi commands still hang. \fBNote:\fR when
+the scsi error handler performs the HBA reset, all target paths on that HBA
+will be affected. eh_deadline should only be set in cases where all targets on
+the affected HBAs are multipathed.
+.RS
+.TP
+The default is: \fB<unset>\fR
+.RE
+.
+.
+.TP
 .B bindings_file
 The full pathname of the binding file to be used when the user_friendly_names
 option is set.
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [dm-devel] [PATCH 3/5] multipathd: remove redundant vector_free() int configure
  2020-10-23 21:15 [dm-devel] [PATCH 0/5] Misc Multipath patches Benjamin Marzinski
  2020-10-23 21:15 ` [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h Benjamin Marzinski
  2020-10-23 21:15 ` [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter Benjamin Marzinski
@ 2020-10-23 21:15 ` Benjamin Marzinski
  2020-10-30 21:01   ` Martin Wilck
  2020-10-23 21:15 ` [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker Benjamin Marzinski
  2020-10-23 21:15 ` [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO Benjamin Marzinski
  4 siblings, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-10-23 21:15 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

remove_maps(vecs) already calls vector_free(vecs->mpvec)

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 0cf8a264..e114bab7 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2634,14 +2634,10 @@ configure (struct vectors * vecs)
 	}
 
 	/*
-	 * purge dm of old maps
+	 * purge dm of old maps and save new set of maps formed by
+	 * considering current path state
 	 */
 	remove_maps(vecs);
-
-	/*
-	 * save new set of maps formed by considering current path state
-	 */
-	vector_free(vecs->mpvec);
 	vecs->mpvec = mpvec;
 
 	/*
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-10-23 21:15 [dm-devel] [PATCH 0/5] Misc Multipath patches Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2020-10-23 21:15 ` [dm-devel] [PATCH 3/5] multipathd: remove redundant vector_free() int configure Benjamin Marzinski
@ 2020-10-23 21:15 ` Benjamin Marzinski
  2020-10-26 15:52   ` Schremmer, Steven
  2020-10-30 21:12   ` Martin Wilck
  2020-10-23 21:15 ` [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO Benjamin Marzinski
  4 siblings, 2 replies; 25+ messages in thread
From: Benjamin Marzinski @ 2020-10-23 21:15 UTC (permalink / raw)
  To: Christophe Varoqui
  Cc: NetApp RDAC team, device-mapper development, Steve Schremmer,
	Martin Wilck

Only rdac arrays support 0xC9 vpd page inquiries. All other arrays will
return a failure. Since all rdac arrays in the the built-in device
configuration explicitly set the RDAC path checker, and almost all other
scsi arrays do not set a path checker, it makes sense to only do the
rdac inquiry when detecting array capabilities if the array's path
checker is explicitly set to rdac.

Multipath was doing the check if either the path checker was set to
rdac, or no path checker was set.  This means that for almost all
non-rdac arrays, multipath was issuing a bad inquiry. This was annoying
users.

Cc: Steve Schremmer <sschremm@netapp.com>
Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/propsel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
index fa4ac5d9..90a77d77 100644
--- a/libmultipath/propsel.c
+++ b/libmultipath/propsel.c
@@ -501,7 +501,7 @@ check_rdac(struct path * pp)
 	if (pp->bus != SYSFS_BUS_SCSI)
 		return 0;
 	/* Avoid ioctl if this is likely not an RDAC array */
-	if (__do_set_from_hwe(checker_name, pp, checker_name) &&
+	if (!__do_set_from_hwe(checker_name, pp, checker_name) ||
 	    strcmp(checker_name, RDAC))
 		return 0;
 	len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-10-23 21:15 [dm-devel] [PATCH 0/5] Misc Multipath patches Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2020-10-23 21:15 ` [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker Benjamin Marzinski
@ 2020-10-23 21:15 ` Benjamin Marzinski
  2020-10-30 21:15   ` Martin Wilck
  4 siblings, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-10-23 21:15 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

The multipathd tur checker thread is designed to be able to finish at
any time, even after the tur checker itself has been freed. The
multipathd shutdown code makes sure all the checkers have been freed
before freeing the checker_class and calling dlclose() to unload the
DSO, but this doesn't guarantee that the checker threads have finished.
If one hasn't, the DSO will get unloaded while the thread still running
code from it, causing a segfault. Unfortunately, it's not possible to be
sure that all tur checker threads have ended during shutdown, without
making them joinable.

However, since libmultipath will never be reinitialized after it has
been uninitialzed, not dlclosing the tur checker DSO once a thread is
started has minimal cost (keeping the DSO code around until the program
exits, which usually happens right after freeing the checkers).

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/checkers.c     | 10 +++++++++-
 libmultipath/checkers.h     |  1 +
 libmultipath/checkers/tur.c |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index 18b1f5eb..35a17f8c 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -22,6 +22,7 @@ struct checker_class {
 	void (*reset)(void);		     /* to reset the global variables */
 	const char **msgtable;
 	short msgtable_size;
+	int keep_dso;
 };
 
 static const char *checker_state_names[PATH_MAX_STATE] = {
@@ -74,7 +75,7 @@ void free_checker_class(struct checker_class *c)
 	list_del(&c->node);
 	if (c->reset)
 		c->reset();
-	if (c->handle) {
+	if (c->handle && !c->keep_dso) {
 		if (dlclose(c->handle) != 0) {
 			condlog(0, "Cannot unload checker %s: %s",
 				c->name, dlerror());
@@ -197,6 +198,13 @@ out:
 	return NULL;
 }
 
+void checker_keep_dso(struct checker * c)
+{
+	if (!c || !c->cls)
+		return;
+	c->cls->keep_dso = 1;
+}
+
 void checker_set_fd (struct checker * c, int fd)
 {
 	if (!c)
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index 9d5f90b9..af5a4006 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -146,6 +146,7 @@ void checker_reset (struct checker *);
 void checker_set_sync (struct checker *);
 void checker_set_async (struct checker *);
 void checker_set_fd (struct checker *, int);
+void checker_keep_dso(struct checker *c);
 void checker_enable (struct checker *);
 void checker_disable (struct checker *);
 int checker_check (struct checker *, int);
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index e886fcf8..fd58d62a 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -394,6 +394,7 @@ int libcheck_check(struct checker * c)
 		uatomic_set(&ct->running, 1);
 		tur_set_async_timeout(c);
 		setup_thread_attr(&attr, 32 * 1024, 1);
+		checker_keep_dso(c);
 		r = pthread_create(&ct->thread, &attr, tur_thread, ct);
 		pthread_attr_destroy(&attr);
 		if (r) {
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-10-23 21:15 ` [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker Benjamin Marzinski
@ 2020-10-26 15:52   ` Schremmer, Steven
  2020-10-30 21:12   ` Martin Wilck
  1 sibling, 0 replies; 25+ messages in thread
From: Schremmer, Steven @ 2020-10-26 15:52 UTC (permalink / raw)
  To: Benjamin Marzinski, Christophe Varoqui
  Cc: ng-eseries-upstream-maintainers, device-mapper development, Martin Wilck

> From: Benjamin Marzinski <bmarzins@redhat.com>
> Sent: Friday, October 23, 2020 4:15 PM
> 
> Cc: Steve Schremmer <sschremm@netapp.com>
> Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/propsel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> index fa4ac5d9..90a77d77 100644
> --- a/libmultipath/propsel.c
> +++ b/libmultipath/propsel.c
> @@ -501,7 +501,7 @@ check_rdac(struct path * pp)
>         if (pp->bus != SYSFS_BUS_SCSI)
>                 return 0;
>         /* Avoid ioctl if this is likely not an RDAC array */
> -       if (__do_set_from_hwe(checker_name, pp, checker_name) &&
> +       if (!__do_set_from_hwe(checker_name, pp, checker_name) ||
>             strcmp(checker_name, RDAC))
>                 return 0;
>         len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);
> --
> 2.17.2

Reviewed-by: Steve Schremmer <sschremm@netapp.com>

Thanks,
Steve


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h
  2020-10-23 21:15 ` [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h Benjamin Marzinski
@ 2020-10-30 20:43   ` Martin Wilck
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Wilck @ 2020-10-30 20:43 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> Since fast_io_fail is part of the multipath struct, its symbolic
> values
> belong in structs.h. Also, make it an instance of a general enum,
> which
> will be used again in future patches, and change the set/print
> functions
> which use it to use the general enum instead.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

> +enum undefined_off_zero {
> +	UOZ_UNDEF = 0,
> +	UOZ_OFF = -1,
> +	UOZ_ZERO = -2,
> +};

"UOZ" sounds kind of ugly to me. Anyway:

Reviewed-by: Martin Wilck <mwilck@suse.com>

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter
  2020-10-23 21:15 ` [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter Benjamin Marzinski
@ 2020-10-30 21:00   ` Martin Wilck
  2020-11-03  3:05     ` Benjamin Marzinski
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2020-10-30 21:00 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> There are times a fc rport is never lost, meaning that
> fast_io_fail_tmo
> and dev_loss_tmo never trigger, but scsi commands still hang. This
> can
> cause problems in cases where users have string timing requirements, 

Did you mean "strict" here?

> and
> the easiest way to solve these issues is to set eh_deadline. Since
> it's
> already possible to set fast_io_fail_tmo and dev_loss_tmo from
> multipath.conf, and have multipath take care of setting it correctly
> for
> the scsi devices in sysfs, it makes sense to allow users to set
> eh_deadline here as well.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/config.c      |  2 ++
>  libmultipath/config.h      |  2 ++
>  libmultipath/configure.c   |  1 +
>  libmultipath/dict.c        | 10 ++++++
>  libmultipath/discovery.c   | 70 ++++++++++++++++++++++++++++++----
> ----
>  libmultipath/propsel.c     | 17 +++++++++
>  libmultipath/propsel.h     |  1 +
>  libmultipath/structs.h     |  7 ++++
>  multipath/multipath.conf.5 | 16 +++++++++
>  9 files changed, 111 insertions(+), 15 deletions(-)
> 
> diff --git a/libmultipath/config.c b/libmultipath/config.c
> index 49e7fb81..9f3cb38d 100644
> --- a/libmultipath/config.c
> +++ b/libmultipath/config.c
> @@ -424,6 +424,7 @@ merge_hwe (struct hwentry * dst, struct hwentry *
> src)
>  	merge_num(flush_on_last_del);
>  	merge_num(fast_io_fail);
>  	merge_num(dev_loss);
> +	merge_num(eh_deadline);
>  	merge_num(user_friendly_names);
>  	merge_num(retain_hwhandler);
>  	merge_num(detect_prio);
> @@ -579,6 +580,7 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
>  	hwe->flush_on_last_del = dhwe->flush_on_last_del;
>  	hwe->fast_io_fail = dhwe->fast_io_fail;
>  	hwe->dev_loss = dhwe->dev_loss;
> +	hwe->eh_deadline = dhwe->eh_deadline;
>  	hwe->user_friendly_names = dhwe->user_friendly_names;
>  	hwe->retain_hwhandler = dhwe->retain_hwhandler;
>  	hwe->detect_prio = dhwe->detect_prio;
> diff --git a/libmultipath/config.h b/libmultipath/config.h
> index 661dd586..9ce37f16 100644
> --- a/libmultipath/config.h
> +++ b/libmultipath/config.h
> @@ -63,6 +63,7 @@ struct hwentry {
>  	int flush_on_last_del;
>  	int fast_io_fail;
>  	unsigned int dev_loss;
> +	int eh_deadline;
>  	int user_friendly_names;
>  	int retain_hwhandler;
>  	int detect_prio;
> @@ -148,6 +149,7 @@ struct config {
>  	int attribute_flags;
>  	int fast_io_fail;
>  	unsigned int dev_loss;
> +	int eh_deadline;
>  	int log_checker_err;
>  	int allow_queueing;
>  	int allow_usb_devices;
> diff --git a/libmultipath/configure.c b/libmultipath/configure.c
> index 1c8aac08..a878d870 100644
> --- a/libmultipath/configure.c
> +++ b/libmultipath/configure.c
> @@ -368,6 +368,7 @@ int setup_map(struct multipath *mpp, char
> *params, int params_size,
>  	select_gid(conf, mpp);
>  	select_fast_io_fail(conf, mpp);
>  	select_dev_loss(conf, mpp);
> +	select_eh_deadline(conf, mpp);
>  	select_reservation_key(conf, mpp);
>  	select_deferred_remove(conf, mpp);
>  	select_marginal_path_err_sample_time(conf, mpp);
> diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> index f4357da1..bab96146 100644
> --- a/libmultipath/dict.c
> +++ b/libmultipath/dict.c
> @@ -899,6 +899,13 @@ declare_ovr_snprint(dev_loss, print_dev_loss)
>  declare_hw_handler(dev_loss, set_dev_loss)
>  declare_hw_snprint(dev_loss, print_dev_loss)
>  
> +declare_def_handler(eh_deadline, set_undef_off_zero)
> +declare_def_snprint(eh_deadline, print_undef_off_zero)
> +declare_ovr_handler(eh_deadline, set_undef_off_zero)
> +declare_ovr_snprint(eh_deadline, print_undef_off_zero)
> +declare_hw_handler(eh_deadline, set_undef_off_zero)
> +declare_hw_snprint(eh_deadline, print_undef_off_zero)
> +
>  static int
>  set_pgpolicy(vector strvec, void *ptr)
>  {
> @@ -1771,6 +1778,7 @@ init_keywords(vector keywords)
>  	install_keyword("gid", &def_gid_handler, &snprint_def_gid);
>  	install_keyword("fast_io_fail_tmo", &def_fast_io_fail_handler,
> &snprint_def_fast_io_fail);
>  	install_keyword("dev_loss_tmo", &def_dev_loss_handler,
> &snprint_def_dev_loss);
> +	install_keyword("eh_deadline", &def_eh_deadline_handler,
> &snprint_def_eh_deadline);
>  	install_keyword("bindings_file", &def_bindings_file_handler,
> &snprint_def_bindings_file);
>  	install_keyword("wwids_file", &def_wwids_file_handler,
> &snprint_def_wwids_file);
>  	install_keyword("prkeys_file", &def_prkeys_file_handler,
> &snprint_def_prkeys_file);
> @@ -1880,6 +1888,7 @@ init_keywords(vector keywords)
>  	install_keyword("flush_on_last_del",
> &hw_flush_on_last_del_handler, &snprint_hw_flush_on_last_del);
>  	install_keyword("fast_io_fail_tmo", &hw_fast_io_fail_handler,
> &snprint_hw_fast_io_fail);
>  	install_keyword("dev_loss_tmo", &hw_dev_loss_handler,
> &snprint_hw_dev_loss);
> +	install_keyword("eh_deadline", &hw_eh_deadline_handler,
> &snprint_hw_eh_deadline);
>  	install_keyword("user_friendly_names",
> &hw_user_friendly_names_handler, &snprint_hw_user_friendly_names);
>  	install_keyword("retain_attached_hw_handler",
> &hw_retain_hwhandler_handler, &snprint_hw_retain_hwhandler);
>  	install_keyword("detect_prio", &hw_detect_prio_handler,
> &snprint_hw_detect_prio);
> @@ -1920,6 +1929,7 @@ init_keywords(vector keywords)
>  	install_keyword("flush_on_last_del",
> &ovr_flush_on_last_del_handler, &snprint_ovr_flush_on_last_del);
>  	install_keyword("fast_io_fail_tmo", &ovr_fast_io_fail_handler,
> &snprint_ovr_fast_io_fail);
>  	install_keyword("dev_loss_tmo", &ovr_dev_loss_handler,
> &snprint_ovr_dev_loss);
> +	install_keyword("eh_deadline", &ovr_eh_deadline_handler,
> &snprint_ovr_eh_deadline);
>  	install_keyword("user_friendly_names",
> &ovr_user_friendly_names_handler, &snprint_ovr_user_friendly_names);
>  	install_keyword("retain_attached_hw_handler",
> &ovr_retain_hwhandler_handler, &snprint_ovr_retain_hwhandler);
>  	install_keyword("detect_prio", &ovr_detect_prio_handler,
> &snprint_ovr_detect_prio);
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index 950b1586..ef9a9a23 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -585,6 +585,42 @@ sysfs_get_asymmetric_access_state(struct path
> *pp, char *buff, int buflen)
>  	return !!preferred;
>  }
>  
> +static int
> +sysfs_set_eh_deadline(struct multipath *mpp, struct path *pp)
> +{
> +	struct udev_device *hostdev;
> +	char host_name[HOST_NAME_LEN], value[16];
> +	int ret;
> +
> +	if (mpp->eh_deadline == EH_DEADLINE_UNSET)
> +		return 0;
> +
> +	sprintf(host_name, "host%d", pp->sg_id.host_no);
> +	hostdev = udev_device_new_from_subsystem_sysname(udev,
> +			"scsi_host", host_name);
> +	if (!hostdev)
> +		return 1;
> +
> +	if (mpp->eh_deadline == EH_DEADLINE_OFF)
> +		sprintf(value, "off");
> +	else if (mpp->eh_deadline == EH_DEADLINE_ZERO)
> +		sprintf(value, "0");
> +	else
> +		snprintf(value, 16, "%u", mpp->eh_deadline);

Hm, mpp->eh_deadline is an "int". This should cause a compiler warning,
does it not?

Nitpick: You're certain to not need more than 11 bytes of string
buffer, so you could simply use sprintf(). If you want to use
snprintf(), please pass sizeof(value) rather than an explicit number.

> +
> +	ret = sysfs_attr_set_value(hostdev, "eh_deadline",
> +				   value, strlen(value));

Nitpick: you could use the return value of sprintf() here rather than
calling strlen() again.

> +	/*
> +	 * not all scsi drivers support setting eh_deadline, so failing
> +	 * is totally reasonable
> +	 */
> +	if (ret <= 0)
> +		condlog(4, "%s: failed to set eh_deadline to %s, error
> %d", udev_device_get_sysname(hostdev), value, -ret);

Loglevel 3 might still be justified. After all, the user did set
eh_deadline, and might wonder why her settings aren't applied.

> +
> +	udev_device_unref(hostdev);
> +	return (ret <= 0);
> +}
> +
>  static void
>  sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>  {
> @@ -799,7 +835,8 @@ sysfs_set_scsi_tmo (struct multipath *mpp,
> unsigned int checkint)
>  		mpp->fast_io_fail = MP_FAST_IO_FAIL_OFF;
>  	}
>  	if (mpp->dev_loss == DEV_LOSS_TMO_UNSET &&
> -	    mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET)
> +	    mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET &&
> +	    mpp->eh_deadline == EH_DEADLINE_UNSET)
>  		return 0;
>  
>  	vector_foreach_slot(mpp->paths, pp, i) {
> @@ -808,21 +845,24 @@ sysfs_set_scsi_tmo (struct multipath *mpp,
> unsigned int checkint)
>  				err_path = pp;
>  			continue;
>  		}
> -
> -		switch (pp->sg_id.proto_id) {
> -		case SCSI_PROTOCOL_FCP:
> -			sysfs_set_rport_tmo(mpp, pp);
> -			continue;
> -		case SCSI_PROTOCOL_ISCSI:
> -			sysfs_set_session_tmo(mpp, pp);
> -			continue;
> -		case SCSI_PROTOCOL_SAS:
> -			sysfs_set_nexus_loss_tmo(mpp, pp);
> -			continue;
> -		default:
> -			if (!err_path)
> -				err_path = pp;
> +		if (mpp->dev_loss != DEV_LOSS_TMO_UNSET ||
> +		    mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET) {

Perhaps we should rather check these in the called functions then
introduce another conditional here. Some callees do check these
already. We've got the likely shortcut (none set) above already.

> +			switch (pp->sg_id.proto_id) {
> +			case SCSI_PROTOCOL_FCP:
> +				sysfs_set_rport_tmo(mpp, pp);
> +				break;
> +			case SCSI_PROTOCOL_ISCSI:
> +				sysfs_set_session_tmo(mpp, pp);
> +				break;
> +			case SCSI_PROTOCOL_SAS:
> +				sysfs_set_nexus_loss_tmo(mpp, pp);
> +				break;
> +			default:
> +				if (!err_path)
> +					err_path = pp;
> +			}
>  		}
> +		sysfs_set_eh_deadline(mpp, pp);
>  	}
>  
>  	if (err_path) {
> diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> index 67d025cf..fa4ac5d9 100644
> --- a/libmultipath/propsel.c
> +++ b/libmultipath/propsel.c
> @@ -775,6 +775,23 @@ out:
>  	return 0;
>  }
>  
> +int select_eh_deadline(struct config *conf, struct multipath *mp)
> +{
> +	const char *origin;
> +	char buff[12];
> +
> +	mp_set_ovr(eh_deadline);
> +	mp_set_hwe(eh_deadline);
> +	mp_set_conf(eh_deadline);
> +	mp->eh_deadline = EH_DEADLINE_UNSET;
> +	/* not changing sysfs in default cause, so don't print anything
> */
> +	return 0;
> +out:
> +	print_undef_off_zero(buff, 12, mp->eh_deadline);
> +	condlog(3, "%s: eh_deadline = %s %s", mp->alias, buff, origin);
> +	return 0;
> +}
> +
>  int select_flush_on_last_del(struct config *conf, struct multipath
> *mp)
>  {
>  	const char *origin;
> diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
> index 3d6edd8a..a68bacf0 100644
> --- a/libmultipath/propsel.h
> +++ b/libmultipath/propsel.h
> @@ -17,6 +17,7 @@ int select_uid(struct config *conf, struct
> multipath *mp);
>  int select_gid(struct config *conf, struct multipath *mp);
>  int select_fast_io_fail(struct config *conf, struct multipath *mp);
>  int select_dev_loss(struct config *conf, struct multipath *mp);
> +int select_eh_deadline(struct config *conf, struct multipath *mp);
>  int select_reservation_key(struct config *conf, struct multipath
> *mp);
>  int select_retain_hwhandler (struct config *conf, struct multipath *
> mp);
>  int select_detect_prio(struct config *conf, struct path * pp);
> diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> index cfa7b649..d6ff6762 100644
> --- a/libmultipath/structs.h
> +++ b/libmultipath/structs.h
> @@ -236,6 +236,12 @@ enum fast_io_fail_states {
>  	MP_FAST_IO_FAIL_ZERO = UOZ_ZERO,
>  };
>  
> +enum eh_deadline_states {
> +	EH_DEADLINE_UNSET = UOZ_UNDEF,
> +	EH_DEADLINE_OFF = UOZ_OFF,
> +	EH_DEADLINE_ZERO = UOZ_ZERO,
> +};
> +
>  struct vpd_vendor_page {
>  	int pg;
>  	const char *name;
> @@ -356,6 +362,7 @@ struct multipath {
>  	int ghost_delay;
>  	int ghost_delay_tick;
>  	unsigned int dev_loss;
> +	int eh_deadline;
>  	uid_t uid;
>  	gid_t gid;
>  	mode_t mode;
> diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
> index d2101ed6..cf05c1ab 100644
> --- a/multipath/multipath.conf.5
> +++ b/multipath/multipath.conf.5
> @@ -717,6 +717,22 @@ The default is: \fB600\fR
>  .
>  .
>  .TP
> +.B eh_deadline
> +Specify the maximum number of seconds the SCSI layer will spend
> doing error
> +handling when scsi devices fail. After this timeout the scsi layer
> will perform
> +a full HBA reset. Setting this may be necessary in cases where the
> rport is
> +never lost, so \fIfast_io_fail_tmo\fR and \fIdev_loss_tmo\fR will
> never
> +trigger, but (frequently do to load) scsi commands still hang.
> \fBNote:\fR when
> +the scsi error handler performs the HBA reset, all target paths on
> that HBA
> +will be affected. eh_deadline should only be set in cases where all
> targets on
> +the affected HBAs are multipathed.
> +.RS
> +.TP
> +The default is: \fB<unset>\fR
> +.RE
> +.
> +.
> +.TP
>  .B bindings_file
>  The full pathname of the binding file to be used when the
> user_friendly_names
>  option is set.

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 3/5] multipathd: remove redundant vector_free() int configure
  2020-10-23 21:15 ` [dm-devel] [PATCH 3/5] multipathd: remove redundant vector_free() int configure Benjamin Marzinski
@ 2020-10-30 21:01   ` Martin Wilck
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Wilck @ 2020-10-30 21:01 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> remove_maps(vecs) already calls vector_free(vecs->mpvec)
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>


-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-10-23 21:15 ` [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker Benjamin Marzinski
  2020-10-26 15:52   ` Schremmer, Steven
@ 2020-10-30 21:12   ` Martin Wilck
  2020-11-03  1:11     ` Benjamin Marzinski
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2020-10-30 21:12 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui, sschremm
  Cc: ng-eseries-upstream-maintainers, dm-devel

On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> Only rdac arrays support 0xC9 vpd page inquiries. All other arrays
> will
> return a failure. Since all rdac arrays in the the built-in device
> configuration explicitly set the RDAC path checker, and almost all
> other
> scsi arrays do not set a path checker, it makes sense to only do the
> rdac inquiry when detecting array capabilities if the array's path
> checker is explicitly set to rdac.
> 
> Multipath was doing the check if either the path checker was set to
> rdac, or no path checker was set.  This means that for almost all
> non-rdac arrays, multipath was issuing a bad inquiry. This was
> annoying
> users.

This is sort of funny. We created the entire check_rdac() feature in
order to autodetect RDAC arrays. If we limit the autodetection to those
arrays that have "rdac" set already, why would we do this at all? We
could simply go with the hwtable / config file settings, as we used to
before check_rdac() was created. All we'd need to do is use "alua" prio
for arrays with "rdac" checker. What am I missing here?

I thought that this autodetection was intended because there are many
Netapp OEMs which we may not all have included in the hwtable (thus
having no path_checker set). For those, we'd choose the wrong settings
by default with this patch, only to avoid some error messages about
unsupported VPDs. I'm not convinced that that's a good idea.

Suggestion: we could try to retrieve VPD 0 (supported VPDs) before
checking VPD 0xc9. That would avoid the errors on non-Netapp hardware,
while still allowing us to autodetect RDAC systems that are missing in
the hwtable.

Regards
Martin

> 
> Cc: Steve Schremmer <sschremm@netapp.com>
> Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/propsel.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> index fa4ac5d9..90a77d77 100644
> --- a/libmultipath/propsel.c
> +++ b/libmultipath/propsel.c
> @@ -501,7 +501,7 @@ check_rdac(struct path * pp)
>  	if (pp->bus != SYSFS_BUS_SCSI)
>  		return 0;
>  	/* Avoid ioctl if this is likely not an RDAC array */
> -	if (__do_set_from_hwe(checker_name, pp, checker_name) &&
> +	if (!__do_set_from_hwe(checker_name, pp, checker_name) ||
>  	    strcmp(checker_name, RDAC))
>  		return 0;
>  	len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-10-23 21:15 ` [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO Benjamin Marzinski
@ 2020-10-30 21:15   ` Martin Wilck
  2020-11-02 18:27     ` Benjamin Marzinski
  2020-11-03 18:48     ` Benjamin Marzinski
  0 siblings, 2 replies; 25+ messages in thread
From: Martin Wilck @ 2020-10-30 21:15 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> The multipathd tur checker thread is designed to be able to finish at
> any time, even after the tur checker itself has been freed. The
> multipathd shutdown code makes sure all the checkers have been freed
> before freeing the checker_class and calling dlclose() to unload the
> DSO, but this doesn't guarantee that the checker threads have
> finished.
> If one hasn't, the DSO will get unloaded while the thread still
> running
> code from it, causing a segfault. Unfortunately, it's not possible to
> be
> sure that all tur checker threads have ended during shutdown, without
> making them joinable.
> 
> However, since libmultipath will never be reinitialized after it has
> been uninitialzed, not dlclosing the tur checker DSO once a thread is
> started has minimal cost (keeping the DSO code around until the
> program
> exits, which usually happens right after freeing the checkers).

I'm not against this, but have you considered using an atomic  refcount
for the DSO? With every tur thread starting, we could increase it, and
decrease it in the cleanup function of the thread when it exits. That
should be safe. If the refcount was positive when we exit, we could
refrain from unloading the DSO.

Regards,
Martin


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-10-30 21:15   ` Martin Wilck
@ 2020-11-02 18:27     ` Benjamin Marzinski
  2020-11-04 22:39       ` Martin Wilck
  2020-11-03 18:48     ` Benjamin Marzinski
  1 sibling, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-02 18:27 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > The multipathd tur checker thread is designed to be able to finish at
> > any time, even after the tur checker itself has been freed. The
> > multipathd shutdown code makes sure all the checkers have been freed
> > before freeing the checker_class and calling dlclose() to unload the
> > DSO, but this doesn't guarantee that the checker threads have
> > finished.
> > If one hasn't, the DSO will get unloaded while the thread still
> > running
> > code from it, causing a segfault. Unfortunately, it's not possible to
> > be
> > sure that all tur checker threads have ended during shutdown, without
> > making them joinable.
> > 
> > However, since libmultipath will never be reinitialized after it has
> > been uninitialzed, not dlclosing the tur checker DSO once a thread is
> > started has minimal cost (keeping the DSO code around until the
> > program
> > exits, which usually happens right after freeing the checkers).
> 
> I'm not against this, but have you considered using an atomic  refcount
> for the DSO? With every tur thread starting, we could increase it, and
> decrease it in the cleanup function of the thread when it exits. That
> should be safe. If the refcount was positive when we exit, we could
> refrain from unloading the DSO.

I tried exactly that, and I would still get crashes, even if it put the
code that decrements the atomic variable in a function that's not part
of the DSO, and put a pthread_exit() before the final
pthread_cleanup_pop() that would decrement it in tur_thread, so that
after the cleanup code is called the thread should never return to code
that is in the DSO. I had to add sleeps in code to force various
orderings, but I couldn't find any way that worked for all possible
orderings.  I would love it if this worked, and you're free to try, but
I could not get this method to work.

-Ben
 
> Regards,
> Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-10-30 21:12   ` Martin Wilck
@ 2020-11-03  1:11     ` Benjamin Marzinski
  2020-11-03  5:09       ` Benjamin Marzinski
  2020-11-03  8:59       ` Martin Wilck
  0 siblings, 2 replies; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-03  1:11 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel, sschremm, ng-eseries-upstream-maintainers

On Fri, Oct 30, 2020 at 09:12:46PM +0000, Martin Wilck wrote:
> On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > Only rdac arrays support 0xC9 vpd page inquiries. All other arrays
> > will
> > return a failure. Since all rdac arrays in the the built-in device
> > configuration explicitly set the RDAC path checker, and almost all
> > other
> > scsi arrays do not set a path checker, it makes sense to only do the
> > rdac inquiry when detecting array capabilities if the array's path
> > checker is explicitly set to rdac.
> > 
> > Multipath was doing the check if either the path checker was set to
> > rdac, or no path checker was set.  This means that for almost all
> > non-rdac arrays, multipath was issuing a bad inquiry. This was
> > annoying
> > users.
> 
> This is sort of funny. We created the entire check_rdac() feature in
> order to autodetect RDAC arrays. If we limit the autodetection to those
> arrays that have "rdac" set already, why would we do this at all? We
> could simply go with the hwtable / config file settings, as we used to
> before check_rdac() was created. All we'd need to do is use "alua" prio
> for arrays with "rdac" checker. What am I missing here?
> 
> I thought that this autodetection was intended because there are many
> Netapp OEMs which we may not all have included in the hwtable (thus
> having no path_checker set). For those, we'd choose the wrong settings
> by default with this patch, only to avoid some error messages about
> unsupported VPDs. I'm not convinced that that's a good idea.
> 
> Suggestion: we could try to retrieve VPD 0 (supported VPDs) before
> checking VPD 0xc9. That would avoid the errors on non-Netapp hardware,
> while still allowing us to autodetect RDAC systems that are missing in
> the hwtable.

When this idea was brought up in discussions about how to solve this,
there was a worry that older rdac devices might not not support vpd page
0 correctly. I'm not sure how valid that worry is.

-Ben

> 
> Regards
> Martin
> 
> > 
> > Cc: Steve Schremmer <sschremm@netapp.com>
> > Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/propsel.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> > index fa4ac5d9..90a77d77 100644
> > --- a/libmultipath/propsel.c
> > +++ b/libmultipath/propsel.c
> > @@ -501,7 +501,7 @@ check_rdac(struct path * pp)
> >  	if (pp->bus != SYSFS_BUS_SCSI)
> >  		return 0;
> >  	/* Avoid ioctl if this is likely not an RDAC array */
> > -	if (__do_set_from_hwe(checker_name, pp, checker_name) &&
> > +	if (!__do_set_from_hwe(checker_name, pp, checker_name) ||
> >  	    strcmp(checker_name, RDAC))
> >  		return 0;
> >  	len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter
  2020-10-30 21:00   ` Martin Wilck
@ 2020-11-03  3:05     ` Benjamin Marzinski
  0 siblings, 0 replies; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-03  3:05 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Fri, Oct 30, 2020 at 09:00:48PM +0000, Martin Wilck wrote:
> On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > There are times a fc rport is never lost, meaning that
> > fast_io_fail_tmo
> > and dev_loss_tmo never trigger, but scsi commands still hang. This
> > can
> > cause problems in cases where users have string timing requirements, 
> 
> Did you mean "strict" here?
> 

Whoops. yes.

> > and
> > the easiest way to solve these issues is to set eh_deadline. Since
> > it's
> > already possible to set fast_io_fail_tmo and dev_loss_tmo from
> > multipath.conf, and have multipath take care of setting it correctly
> > for
> > the scsi devices in sysfs, it makes sense to allow users to set
> > eh_deadline here as well.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/config.c      |  2 ++
> >  libmultipath/config.h      |  2 ++
> >  libmultipath/configure.c   |  1 +
> >  libmultipath/dict.c        | 10 ++++++
> >  libmultipath/discovery.c   | 70 ++++++++++++++++++++++++++++++----
> > ----
> >  libmultipath/propsel.c     | 17 +++++++++
> >  libmultipath/propsel.h     |  1 +
> >  libmultipath/structs.h     |  7 ++++
> >  multipath/multipath.conf.5 | 16 +++++++++
> >  9 files changed, 111 insertions(+), 15 deletions(-)
> > 
> > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > index 49e7fb81..9f3cb38d 100644
> > --- a/libmultipath/config.c
> > +++ b/libmultipath/config.c
> > @@ -424,6 +424,7 @@ merge_hwe (struct hwentry * dst, struct hwentry *
> > src)
> >  	merge_num(flush_on_last_del);
> >  	merge_num(fast_io_fail);
> >  	merge_num(dev_loss);
> > +	merge_num(eh_deadline);
> >  	merge_num(user_friendly_names);
> >  	merge_num(retain_hwhandler);
> >  	merge_num(detect_prio);
> > @@ -579,6 +580,7 @@ store_hwe (vector hwtable, struct hwentry * dhwe)
> >  	hwe->flush_on_last_del = dhwe->flush_on_last_del;
> >  	hwe->fast_io_fail = dhwe->fast_io_fail;
> >  	hwe->dev_loss = dhwe->dev_loss;
> > +	hwe->eh_deadline = dhwe->eh_deadline;
> >  	hwe->user_friendly_names = dhwe->user_friendly_names;
> >  	hwe->retain_hwhandler = dhwe->retain_hwhandler;
> >  	hwe->detect_prio = dhwe->detect_prio;
> > diff --git a/libmultipath/config.h b/libmultipath/config.h
> > index 661dd586..9ce37f16 100644
> > --- a/libmultipath/config.h
> > +++ b/libmultipath/config.h
> > @@ -63,6 +63,7 @@ struct hwentry {
> >  	int flush_on_last_del;
> >  	int fast_io_fail;
> >  	unsigned int dev_loss;
> > +	int eh_deadline;
> >  	int user_friendly_names;
> >  	int retain_hwhandler;
> >  	int detect_prio;
> > @@ -148,6 +149,7 @@ struct config {
> >  	int attribute_flags;
> >  	int fast_io_fail;
> >  	unsigned int dev_loss;
> > +	int eh_deadline;
> >  	int log_checker_err;
> >  	int allow_queueing;
> >  	int allow_usb_devices;
> > diff --git a/libmultipath/configure.c b/libmultipath/configure.c
> > index 1c8aac08..a878d870 100644
> > --- a/libmultipath/configure.c
> > +++ b/libmultipath/configure.c
> > @@ -368,6 +368,7 @@ int setup_map(struct multipath *mpp, char
> > *params, int params_size,
> >  	select_gid(conf, mpp);
> >  	select_fast_io_fail(conf, mpp);
> >  	select_dev_loss(conf, mpp);
> > +	select_eh_deadline(conf, mpp);
> >  	select_reservation_key(conf, mpp);
> >  	select_deferred_remove(conf, mpp);
> >  	select_marginal_path_err_sample_time(conf, mpp);
> > diff --git a/libmultipath/dict.c b/libmultipath/dict.c
> > index f4357da1..bab96146 100644
> > --- a/libmultipath/dict.c
> > +++ b/libmultipath/dict.c
> > @@ -899,6 +899,13 @@ declare_ovr_snprint(dev_loss, print_dev_loss)
> >  declare_hw_handler(dev_loss, set_dev_loss)
> >  declare_hw_snprint(dev_loss, print_dev_loss)
> >  
> > +declare_def_handler(eh_deadline, set_undef_off_zero)
> > +declare_def_snprint(eh_deadline, print_undef_off_zero)
> > +declare_ovr_handler(eh_deadline, set_undef_off_zero)
> > +declare_ovr_snprint(eh_deadline, print_undef_off_zero)
> > +declare_hw_handler(eh_deadline, set_undef_off_zero)
> > +declare_hw_snprint(eh_deadline, print_undef_off_zero)
> > +
> >  static int
> >  set_pgpolicy(vector strvec, void *ptr)
> >  {
> > @@ -1771,6 +1778,7 @@ init_keywords(vector keywords)
> >  	install_keyword("gid", &def_gid_handler, &snprint_def_gid);
> >  	install_keyword("fast_io_fail_tmo", &def_fast_io_fail_handler,
> > &snprint_def_fast_io_fail);
> >  	install_keyword("dev_loss_tmo", &def_dev_loss_handler,
> > &snprint_def_dev_loss);
> > +	install_keyword("eh_deadline", &def_eh_deadline_handler,
> > &snprint_def_eh_deadline);
> >  	install_keyword("bindings_file", &def_bindings_file_handler,
> > &snprint_def_bindings_file);
> >  	install_keyword("wwids_file", &def_wwids_file_handler,
> > &snprint_def_wwids_file);
> >  	install_keyword("prkeys_file", &def_prkeys_file_handler,
> > &snprint_def_prkeys_file);
> > @@ -1880,6 +1888,7 @@ init_keywords(vector keywords)
> >  	install_keyword("flush_on_last_del",
> > &hw_flush_on_last_del_handler, &snprint_hw_flush_on_last_del);
> >  	install_keyword("fast_io_fail_tmo", &hw_fast_io_fail_handler,
> > &snprint_hw_fast_io_fail);
> >  	install_keyword("dev_loss_tmo", &hw_dev_loss_handler,
> > &snprint_hw_dev_loss);
> > +	install_keyword("eh_deadline", &hw_eh_deadline_handler,
> > &snprint_hw_eh_deadline);
> >  	install_keyword("user_friendly_names",
> > &hw_user_friendly_names_handler, &snprint_hw_user_friendly_names);
> >  	install_keyword("retain_attached_hw_handler",
> > &hw_retain_hwhandler_handler, &snprint_hw_retain_hwhandler);
> >  	install_keyword("detect_prio", &hw_detect_prio_handler,
> > &snprint_hw_detect_prio);
> > @@ -1920,6 +1929,7 @@ init_keywords(vector keywords)
> >  	install_keyword("flush_on_last_del",
> > &ovr_flush_on_last_del_handler, &snprint_ovr_flush_on_last_del);
> >  	install_keyword("fast_io_fail_tmo", &ovr_fast_io_fail_handler,
> > &snprint_ovr_fast_io_fail);
> >  	install_keyword("dev_loss_tmo", &ovr_dev_loss_handler,
> > &snprint_ovr_dev_loss);
> > +	install_keyword("eh_deadline", &ovr_eh_deadline_handler,
> > &snprint_ovr_eh_deadline);
> >  	install_keyword("user_friendly_names",
> > &ovr_user_friendly_names_handler, &snprint_ovr_user_friendly_names);
> >  	install_keyword("retain_attached_hw_handler",
> > &ovr_retain_hwhandler_handler, &snprint_ovr_retain_hwhandler);
> >  	install_keyword("detect_prio", &ovr_detect_prio_handler,
> > &snprint_ovr_detect_prio);
> > diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> > index 950b1586..ef9a9a23 100644
> > --- a/libmultipath/discovery.c
> > +++ b/libmultipath/discovery.c
> > @@ -585,6 +585,42 @@ sysfs_get_asymmetric_access_state(struct path
> > *pp, char *buff, int buflen)
> >  	return !!preferred;
> >  }
> >  
> > +static int
> > +sysfs_set_eh_deadline(struct multipath *mpp, struct path *pp)
> > +{
> > +	struct udev_device *hostdev;
> > +	char host_name[HOST_NAME_LEN], value[16];
> > +	int ret;
> > +
> > +	if (mpp->eh_deadline == EH_DEADLINE_UNSET)
> > +		return 0;
> > +
> > +	sprintf(host_name, "host%d", pp->sg_id.host_no);
> > +	hostdev = udev_device_new_from_subsystem_sysname(udev,
> > +			"scsi_host", host_name);
> > +	if (!hostdev)
> > +		return 1;
> > +
> > +	if (mpp->eh_deadline == EH_DEADLINE_OFF)
> > +		sprintf(value, "off");
> > +	else if (mpp->eh_deadline == EH_DEADLINE_ZERO)
> > +		sprintf(value, "0");
> > +	else
> > +		snprintf(value, 16, "%u", mpp->eh_deadline);
> 
> Hm, mpp->eh_deadline is an "int". This should cause a compiler warning,
> does it not?

um.. no. Not on either gcc-8 or gcc-10.  But I'll fix it.

> Nitpick: You're certain to not need more than 11 bytes of string
> buffer, so you could simply use sprintf(). If you want to use
> snprintf(), please pass sizeof(value) rather than an explicit number.

Sure.

> > +
> > +	ret = sysfs_attr_set_value(hostdev, "eh_deadline",
> > +				   value, strlen(value));
> 
> Nitpick: you could use the return value of sprintf() here rather than
> calling strlen() again.

Sure

> > +	/*
> > +	 * not all scsi drivers support setting eh_deadline, so failing
> > +	 * is totally reasonable
> > +	 */
> > +	if (ret <= 0)
> > +		condlog(4, "%s: failed to set eh_deadline to %s, error
> > %d", udev_device_get_sysname(hostdev), value, -ret);
> 
> Loglevel 3 might still be justified. After all, the user did set
> eh_deadline, and might wonder why her settings aren't applied.

makes sense.

> > +
> > +	udev_device_unref(hostdev);
> > +	return (ret <= 0);
> > +}
> > +
> >  static void
> >  sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
> >  {
> > @@ -799,7 +835,8 @@ sysfs_set_scsi_tmo (struct multipath *mpp,
> > unsigned int checkint)
> >  		mpp->fast_io_fail = MP_FAST_IO_FAIL_OFF;
> >  	}
> >  	if (mpp->dev_loss == DEV_LOSS_TMO_UNSET &&
> > -	    mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET)
> > +	    mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET &&
> > +	    mpp->eh_deadline == EH_DEADLINE_UNSET)
> >  		return 0;
> >  
> >  	vector_foreach_slot(mpp->paths, pp, i) {
> > @@ -808,21 +845,24 @@ sysfs_set_scsi_tmo (struct multipath *mpp,
> > unsigned int checkint)
> >  				err_path = pp;
> >  			continue;
> >  		}
> > -
> > -		switch (pp->sg_id.proto_id) {
> > -		case SCSI_PROTOCOL_FCP:
> > -			sysfs_set_rport_tmo(mpp, pp);
> > -			continue;
> > -		case SCSI_PROTOCOL_ISCSI:
> > -			sysfs_set_session_tmo(mpp, pp);
> > -			continue;
> > -		case SCSI_PROTOCOL_SAS:
> > -			sysfs_set_nexus_loss_tmo(mpp, pp);
> > -			continue;
> > -		default:
> > -			if (!err_path)
> > -				err_path = pp;
> > +		if (mpp->dev_loss != DEV_LOSS_TMO_UNSET ||
> > +		    mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET) {
> 
> Perhaps we should rather check these in the called functions then
> introduce another conditional here. Some callees do check these
> already. We've got the likely shortcut (none set) above already.

Sure.

> > +			switch (pp->sg_id.proto_id) {
> > +			case SCSI_PROTOCOL_FCP:
> > +				sysfs_set_rport_tmo(mpp, pp);
> > +				break;
> > +			case SCSI_PROTOCOL_ISCSI:
> > +				sysfs_set_session_tmo(mpp, pp);
> > +				break;
> > +			case SCSI_PROTOCOL_SAS:
> > +				sysfs_set_nexus_loss_tmo(mpp, pp);
> > +				break;
> > +			default:
> > +				if (!err_path)
> > +					err_path = pp;
> > +			}
> >  		}
> > +		sysfs_set_eh_deadline(mpp, pp);
> >  	}
> >  
> >  	if (err_path) {
> > diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> > index 67d025cf..fa4ac5d9 100644
> > --- a/libmultipath/propsel.c
> > +++ b/libmultipath/propsel.c
> > @@ -775,6 +775,23 @@ out:
> >  	return 0;
> >  }
> >  
> > +int select_eh_deadline(struct config *conf, struct multipath *mp)
> > +{
> > +	const char *origin;
> > +	char buff[12];
> > +
> > +	mp_set_ovr(eh_deadline);
> > +	mp_set_hwe(eh_deadline);
> > +	mp_set_conf(eh_deadline);
> > +	mp->eh_deadline = EH_DEADLINE_UNSET;
> > +	/* not changing sysfs in default cause, so don't print anything
> > */
> > +	return 0;
> > +out:
> > +	print_undef_off_zero(buff, 12, mp->eh_deadline);
> > +	condlog(3, "%s: eh_deadline = %s %s", mp->alias, buff, origin);
> > +	return 0;
> > +}
> > +
> >  int select_flush_on_last_del(struct config *conf, struct multipath
> > *mp)
> >  {
> >  	const char *origin;
> > diff --git a/libmultipath/propsel.h b/libmultipath/propsel.h
> > index 3d6edd8a..a68bacf0 100644
> > --- a/libmultipath/propsel.h
> > +++ b/libmultipath/propsel.h
> > @@ -17,6 +17,7 @@ int select_uid(struct config *conf, struct
> > multipath *mp);
> >  int select_gid(struct config *conf, struct multipath *mp);
> >  int select_fast_io_fail(struct config *conf, struct multipath *mp);
> >  int select_dev_loss(struct config *conf, struct multipath *mp);
> > +int select_eh_deadline(struct config *conf, struct multipath *mp);
> >  int select_reservation_key(struct config *conf, struct multipath
> > *mp);
> >  int select_retain_hwhandler (struct config *conf, struct multipath *
> > mp);
> >  int select_detect_prio(struct config *conf, struct path * pp);
> > diff --git a/libmultipath/structs.h b/libmultipath/structs.h
> > index cfa7b649..d6ff6762 100644
> > --- a/libmultipath/structs.h
> > +++ b/libmultipath/structs.h
> > @@ -236,6 +236,12 @@ enum fast_io_fail_states {
> >  	MP_FAST_IO_FAIL_ZERO = UOZ_ZERO,
> >  };
> >  
> > +enum eh_deadline_states {
> > +	EH_DEADLINE_UNSET = UOZ_UNDEF,
> > +	EH_DEADLINE_OFF = UOZ_OFF,
> > +	EH_DEADLINE_ZERO = UOZ_ZERO,
> > +};
> > +
> >  struct vpd_vendor_page {
> >  	int pg;
> >  	const char *name;
> > @@ -356,6 +362,7 @@ struct multipath {
> >  	int ghost_delay;
> >  	int ghost_delay_tick;
> >  	unsigned int dev_loss;
> > +	int eh_deadline;
> >  	uid_t uid;
> >  	gid_t gid;
> >  	mode_t mode;
> > diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
> > index d2101ed6..cf05c1ab 100644
> > --- a/multipath/multipath.conf.5
> > +++ b/multipath/multipath.conf.5
> > @@ -717,6 +717,22 @@ The default is: \fB600\fR
> >  .
> >  .
> >  .TP
> > +.B eh_deadline
> > +Specify the maximum number of seconds the SCSI layer will spend
> > doing error
> > +handling when scsi devices fail. After this timeout the scsi layer
> > will perform
> > +a full HBA reset. Setting this may be necessary in cases where the
> > rport is
> > +never lost, so \fIfast_io_fail_tmo\fR and \fIdev_loss_tmo\fR will
> > never
> > +trigger, but (frequently do to load) scsi commands still hang.
> > \fBNote:\fR when
> > +the scsi error handler performs the HBA reset, all target paths on
> > that HBA
> > +will be affected. eh_deadline should only be set in cases where all
> > targets on
> > +the affected HBAs are multipathed.
> > +.RS
> > +.TP
> > +The default is: \fB<unset>\fR
> > +.RE
> > +.
> > +.
> > +.TP
> >  .B bindings_file
> >  The full pathname of the binding file to be used when the
> > user_friendly_names
> >  option is set.
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-11-03  1:11     ` Benjamin Marzinski
@ 2020-11-03  5:09       ` Benjamin Marzinski
  2020-11-03  8:59       ` Martin Wilck
  1 sibling, 0 replies; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-03  5:09 UTC (permalink / raw)
  To: Martin Wilck; +Cc: ng-eseries-upstream-maintainers, dm-devel, sschremm

On Mon, Nov 02, 2020 at 07:11:06PM -0600, Benjamin Marzinski wrote:
> On Fri, Oct 30, 2020 at 09:12:46PM +0000, Martin Wilck wrote:
> > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > Only rdac arrays support 0xC9 vpd page inquiries. All other arrays
> > > will
> > > return a failure. Since all rdac arrays in the the built-in device
> > > configuration explicitly set the RDAC path checker, and almost all
> > > other
> > > scsi arrays do not set a path checker, it makes sense to only do the
> > > rdac inquiry when detecting array capabilities if the array's path
> > > checker is explicitly set to rdac.
> > > 
> > > Multipath was doing the check if either the path checker was set to
> > > rdac, or no path checker was set.  This means that for almost all
> > > non-rdac arrays, multipath was issuing a bad inquiry. This was
> > > annoying
> > > users.
> > 
> > This is sort of funny. We created the entire check_rdac() feature in
> > order to autodetect RDAC arrays. If we limit the autodetection to those
> > arrays that have "rdac" set already, why would we do this at all? We
> > could simply go with the hwtable / config file settings, as we used to
> > before check_rdac() was created. All we'd need to do is use "alua" prio
> > for arrays with "rdac" checker. What am I missing here?
> > 
> > I thought that this autodetection was intended because there are many
> > Netapp OEMs which we may not all have included in the hwtable (thus
> > having no path_checker set). For those, we'd choose the wrong settings
> > by default with this patch, only to avoid some error messages about
> > unsupported VPDs. I'm not convinced that that's a good idea.
> > 
> > Suggestion: we could try to retrieve VPD 0 (supported VPDs) before
> > checking VPD 0xc9. That would avoid the errors on non-Netapp hardware,
> > while still allowing us to autodetect RDAC systems that are missing in
> > the hwtable.
> 
> When this idea was brought up in discussions about how to solve this,
> there was a worry that older rdac devices might not not support vpd page
> 0 correctly. I'm not sure how valid that worry is.

Of course, that doesn't mean it wouldn't be useful to always do the
check on vpd page c9 for devices that do indicate support in vpd 0. So,
yes I agree that makes sense so unless Netapp has any objections, I can
change the patch to also check c9 if vpd page 0 says it's supported.

-Ben

> 
> -Ben
> 
> > 
> > Regards
> > Martin
> > 
> > > 
> > > Cc: Steve Schremmer <sschremm@netapp.com>
> > > Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > >  libmultipath/propsel.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/libmultipath/propsel.c b/libmultipath/propsel.c
> > > index fa4ac5d9..90a77d77 100644
> > > --- a/libmultipath/propsel.c
> > > +++ b/libmultipath/propsel.c
> > > @@ -501,7 +501,7 @@ check_rdac(struct path * pp)
> > >  	if (pp->bus != SYSFS_BUS_SCSI)
> > >  		return 0;
> > >  	/* Avoid ioctl if this is likely not an RDAC array */
> > > -	if (__do_set_from_hwe(checker_name, pp, checker_name) &&
> > > +	if (!__do_set_from_hwe(checker_name, pp, checker_name) ||
> > >  	    strcmp(checker_name, RDAC))
> > >  		return 0;
> > >  	len = get_vpd_sgio(pp->fd, 0xC9, 0, buff, 44);
> > 
> > -- 
> > Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> > SUSE  Software Solutions Germany GmbH
> > HRB 36809, AG Nürnberg GF: Felix
> > Imendörffer
> > 
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-11-03  1:11     ` Benjamin Marzinski
  2020-11-03  5:09       ` Benjamin Marzinski
@ 2020-11-03  8:59       ` Martin Wilck
  2020-11-03 23:15         ` Schremmer, Steven
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2020-11-03  8:59 UTC (permalink / raw)
  To: bmarzins; +Cc: ng-eseries-upstream-maintainers, dm-devel, sschremm

On Mon, 2020-11-02 at 19:11 -0600, Benjamin Marzinski wrote:
> On Fri, Oct 30, 2020 at 09:12:46PM +0000, Martin Wilck wrote:
> > 
> > Suggestion: we could try to retrieve VPD 0 (supported VPDs) before
> > checking VPD 0xc9. That would avoid the errors on non-Netapp
> > hardware,
> > while still allowing us to autodetect RDAC systems that are missing
> > in
> > the hwtable.
> 
> When this idea was brought up in discussions about how to solve this,
> there was a worry that older rdac devices might not not support vpd
> page
> 0 correctly. I'm not sure how valid that worry is.

These old devices should be in the hwtable.

The algorithm would be something like this:

 A) if hwtable says "rdac", goto D)
 B) if hwtable says nothing, goto E)
 C) otherwise use hwtable settings, END
 D) try VPD 0xc9
    1) if it works as expected, use RDAC, END
    2) otherwise, goto F)
 E) try VPD 0
    1) if it fails, goto F)
    2) if it works and page c9 is listed, goto D)
    3) if it works and c9 is not listed, goto F)
 F) use defaults, END

I suppose "hwtable" means all built-in and user configuration here,
including "overrides". But I haven't thought that through yet.

Cheers,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-10-30 21:15   ` Martin Wilck
  2020-11-02 18:27     ` Benjamin Marzinski
@ 2020-11-03 18:48     ` Benjamin Marzinski
  1 sibling, 0 replies; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-03 18:48 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > The multipathd tur checker thread is designed to be able to finish at
> > any time, even after the tur checker itself has been freed. The
> > multipathd shutdown code makes sure all the checkers have been freed
> > before freeing the checker_class and calling dlclose() to unload the
> > DSO, but this doesn't guarantee that the checker threads have
> > finished.
> > If one hasn't, the DSO will get unloaded while the thread still
> > running
> > code from it, causing a segfault. Unfortunately, it's not possible to
> > be
> > sure that all tur checker threads have ended during shutdown, without
> > making them joinable.
> > 
> > However, since libmultipath will never be reinitialized after it has
> > been uninitialzed, not dlclosing the tur checker DSO once a thread is
> > started has minimal cost (keeping the DSO code around until the
> > program
> > exits, which usually happens right after freeing the checkers).
> 
> I'm not against this, but have you considered using an atomic  refcount
> for the DSO? With every tur thread starting, we could increase it, and
> decrease it in the cleanup function of the thread when it exits. That
> should be safe. If the refcount was positive when we exit, we could
> refrain from unloading the DSO.
> 
> Regards,
> Martin

NAK. I apparently forgot to commit the version file changes.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
  2020-11-03  8:59       ` Martin Wilck
@ 2020-11-03 23:15         ` Schremmer, Steven
  0 siblings, 0 replies; 25+ messages in thread
From: Schremmer, Steven @ 2020-11-03 23:15 UTC (permalink / raw)
  To: Martin Wilck, bmarzins; +Cc: ng-eseries-upstream-maintainers, dm-devel



> -----Original Message-----
> From: Martin Wilck <martin.wilck@suse.com>
> Sent: Tuesday, November 3, 2020 3:00 AM
> To: bmarzins@redhat.com
> Cc: dm-devel@redhat.com; ng-eseries-upstream-maintainers <ng-eseries-upstream-maintainers@netapp.com>;
> christophe.varoqui@opensvc.com; Schremmer, Steven <Steve.Schremmer@netapp.com>
> Subject: Re: [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker
> 
> On Mon, 2020-11-02 at 19:11 -0600, Benjamin Marzinski wrote:
> > On Fri, Oct 30, 2020 at 09:12:46PM +0000, Martin Wilck wrote:
> > >
> > > Suggestion: we could try to retrieve VPD 0 (supported VPDs) before
> > > checking VPD 0xc9. That would avoid the errors on non-Netapp
> > > hardware,
> > > while still allowing us to autodetect RDAC systems that are missing
> > > in
> > > the hwtable.
> >
> > When this idea was brought up in discussions about how to solve this,
> > there was a worry that older rdac devices might not not support vpd
> > page
> > 0 correctly. I'm not sure how valid that worry is.
> 
> These old devices should be in the hwtable.
> 
> The algorithm would be something like this:
> 
>  A) if hwtable says "rdac", goto D)
>  B) if hwtable says nothing, goto E)
>  C) otherwise use hwtable settings, END
>  D) try VPD 0xc9
>     1) if it works as expected, use RDAC, END
>     2) otherwise, goto F)
>  E) try VPD 0
>     1) if it fails, goto F)
>     2) if it works and page c9 is listed, goto D)
>     3) if it works and c9 is not listed, goto F)
>  F) use defaults, END
> 
> I suppose "hwtable" means all built-in and user configuration here,
> including "overrides". But I haven't thought that through yet.
> 
> Cheers,
> Martin

I think Martin's algorithm would work, but it would also be safe to use
VPD 0 in all cases as rdac systems have supported page 0 for well over a 
decade. I think having "hwtable" include user configuration would be a
good choice, so that new rdac devices could be specified in multipath.conf.

Thanks,
Steve


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-11-02 18:27     ` Benjamin Marzinski
@ 2020-11-04 22:39       ` Martin Wilck
  2020-11-04 23:27         ` Benjamin Marzinski
  2020-11-04 23:34         ` Martin Wilck
  0 siblings, 2 replies; 25+ messages in thread
From: Martin Wilck @ 2020-11-04 22:39 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel

On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > The multipathd tur checker thread is designed to be able to
> > > finish at
> > > any time, even after the tur checker itself has been freed. The
> > > multipathd shutdown code makes sure all the checkers have been
> > > freed
> > > before freeing the checker_class and calling dlclose() to unload
> > > the
> > > DSO, but this doesn't guarantee that the checker threads have
> > > finished.
> > > If one hasn't, the DSO will get unloaded while the thread still
> > > running
> > > code from it, causing a segfault. Unfortunately, it's not
> > > possible to
> > > be
> > > sure that all tur checker threads have ended during shutdown,
> > > without
> > > making them joinable.
> > > 
> > > However, since libmultipath will never be reinitialized after it
> > > has
> > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > thread is
> > > started has minimal cost (keeping the DSO code around until the
> > > program
> > > exits, which usually happens right after freeing the checkers).
> > 
> > I'm not against this, but have you considered using an
> > atomic  refcount
> > for the DSO? With every tur thread starting, we could increase it,
> > and
> > decrease it in the cleanup function of the thread when it exits.
> > That
> > should be safe. If the refcount was positive when we exit, we could
> > refrain from unloading the DSO.
> 
> I tried exactly that, and I would still get crashes, even if it put
> the
> code that decrements the atomic variable in a function that's not
> part
> of the DSO, and put a pthread_exit() before the final
> pthread_cleanup_pop() that would decrement it in tur_thread, so that
> after the cleanup code is called the thread should never return to
> code
> that is in the DSO. I had to add sleeps in code to force various
> orderings, but I couldn't find any way that worked for all possible
> orderings.  I would love it if this worked, and you're free to try,
> but
> I could not get this method to work.

I've experimented a bit with a trivial test program, and I found that
it worked stably if decrementing the refcount is really the last thing
thread's cleanup function does. Could you provide some details about
the sleeps that you'd put in that made this approach fail?

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-11-04 22:39       ` Martin Wilck
@ 2020-11-04 23:27         ` Benjamin Marzinski
  2020-11-04 23:56           ` Martin Wilck
  2020-11-04 23:34         ` Martin Wilck
  1 sibling, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-04 23:27 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, Nov 04, 2020 at 10:39:39PM +0000, Martin Wilck wrote:
> On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > The multipathd tur checker thread is designed to be able to
> > > > finish at
> > > > any time, even after the tur checker itself has been freed. The
> > > > multipathd shutdown code makes sure all the checkers have been
> > > > freed
> > > > before freeing the checker_class and calling dlclose() to unload
> > > > the
> > > > DSO, but this doesn't guarantee that the checker threads have
> > > > finished.
> > > > If one hasn't, the DSO will get unloaded while the thread still
> > > > running
> > > > code from it, causing a segfault. Unfortunately, it's not
> > > > possible to
> > > > be
> > > > sure that all tur checker threads have ended during shutdown,
> > > > without
> > > > making them joinable.
> > > > 
> > > > However, since libmultipath will never be reinitialized after it
> > > > has
> > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > thread is
> > > > started has minimal cost (keeping the DSO code around until the
> > > > program
> > > > exits, which usually happens right after freeing the checkers).
> > > 
> > > I'm not against this, but have you considered using an
> > > atomic  refcount
> > > for the DSO? With every tur thread starting, we could increase it,
> > > and
> > > decrease it in the cleanup function of the thread when it exits.
> > > That
> > > should be safe. If the refcount was positive when we exit, we could
> > > refrain from unloading the DSO.
> > 
> > I tried exactly that, and I would still get crashes, even if it put
> > the
> > code that decrements the atomic variable in a function that's not
> > part
> > of the DSO, and put a pthread_exit() before the final
> > pthread_cleanup_pop() that would decrement it in tur_thread, so that
> > after the cleanup code is called the thread should never return to
> > code
> > that is in the DSO. I had to add sleeps in code to force various
> > orderings, but I couldn't find any way that worked for all possible
> > orderings.  I would love it if this worked, and you're free to try,
> > but
> > I could not get this method to work.
> 
> I've experimented a bit with a trivial test program, and I found that
> it worked stably if decrementing the refcount is really the last thing
> thread's cleanup function does. Could you provide some details about
> the sleeps that you'd put in that made this approach fail?

I believe the situation that continued to crash was where the
tur_thread() exitted naturally (so it set running to 0), although I'm
not sure that this is necessary, or if it would still crash when running
the cleanup function because of a cancel.  I put the cleanup function to
decrement the count in libmultipath, so that it wasn't part of the DSO,
and then I put a sleep(5) as the last line of the cleanup function, and
a sleep(10) as the last line of cleanup_checkers(). I also had to set
running to 0 before starting the thread, and then take out the code to
pause the thread if running was aleady 0, to make sure the thread acted
like it was the one to set running to 0. Then the idea is to stop
multipathd while there is a thread in its sleep period, so that
multipathd sees that the counter is 0, and closes the dso, and then
the thread finishes before multipathd shuts the rest of the way down. 

-Ben

> Martin
> 
> -- Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107 SUSE
> Software Solutions Germany GmbH HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-11-04 22:39       ` Martin Wilck
  2020-11-04 23:27         ` Benjamin Marzinski
@ 2020-11-04 23:34         ` Martin Wilck
  1 sibling, 0 replies; 25+ messages in thread
From: Martin Wilck @ 2020-11-04 23:34 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel

On Wed, 2020-11-04 at 22:39 +0000, Martin Wilck wrote:
> On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > The multipathd tur checker thread is designed to be able to
> > > > finish at
> > > > any time, even after the tur checker itself has been freed. The
> > > > multipathd shutdown code makes sure all the checkers have been
> > > > freed
> > > > before freeing the checker_class and calling dlclose() to
> > > > unload
> > > > the
> > > > DSO, but this doesn't guarantee that the checker threads have
> > > > finished.
> > > > If one hasn't, the DSO will get unloaded while the thread still
> > > > running
> > > > code from it, causing a segfault. Unfortunately, it's not
> > > > possible to
> > > > be
> > > > sure that all tur checker threads have ended during shutdown,
> > > > without
> > > > making them joinable.
> > > > 
> > > > However, since libmultipath will never be reinitialized after
> > > > it
> > > > has
> > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > thread is
> > > > started has minimal cost (keeping the DSO code around until the
> > > > program
> > > > exits, which usually happens right after freeing the checkers).
> > > 
> > > I'm not against this, but have you considered using an
> > > atomic  refcount
> > > for the DSO? With every tur thread starting, we could increase
> > > it,
> > > and
> > > decrease it in the cleanup function of the thread when it exits.
> > > That
> > > should be safe. If the refcount was positive when we exit, we
> > > could
> > > refrain from unloading the DSO.
> > 
> > I tried exactly that, and I would still get crashes, even if it put
> > the
> > code that decrements the atomic variable in a function that's not
> > part
> > of the DSO, and put a pthread_exit() before the final
> > pthread_cleanup_pop() that would decrement it in tur_thread, so
> > that
> > after the cleanup code is called the thread should never return to
> > code
> > that is in the DSO. I had to add sleeps in code to force various
> > orderings, but I couldn't find any way that worked for all possible
> > orderings.  I would love it if this worked, and you're free to try,
> > but
> > I could not get this method to work.
> 
> I've experimented a bit with a trivial test program, and I found that
> it worked stably if decrementing the refcount is really the last
> thing
> thread's cleanup function does. Could you provide some details about
> the sleeps that you'd put in that made this approach fail?

I've made some more experiments and if I repeat my test enough, I could
see SIGSEGV.

I was able to make it work by using a thread entrypoint outside the DSO
(calling the DSO's thread function), and decrementing the refcount in
the outermost cleanup function (which is also not in the DSO),
all using proper atomics and barriers.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-11-04 23:27         ` Benjamin Marzinski
@ 2020-11-04 23:56           ` Martin Wilck
  2020-11-05  0:41             ` Benjamin Marzinski
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2020-11-04 23:56 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel

On Wed, 2020-11-04 at 17:27 -0600, Benjamin Marzinski wrote:
> On Wed, Nov 04, 2020 at 10:39:39PM +0000, Martin Wilck wrote:
> > On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > > The multipathd tur checker thread is designed to be able to
> > > > > finish at
> > > > > any time, even after the tur checker itself has been freed.
> > > > > The
> > > > > multipathd shutdown code makes sure all the checkers have
> > > > > been
> > > > > freed
> > > > > before freeing the checker_class and calling dlclose() to
> > > > > unload
> > > > > the
> > > > > DSO, but this doesn't guarantee that the checker threads have
> > > > > finished.
> > > > > If one hasn't, the DSO will get unloaded while the thread
> > > > > still
> > > > > running
> > > > > code from it, causing a segfault. Unfortunately, it's not
> > > > > possible to
> > > > > be
> > > > > sure that all tur checker threads have ended during shutdown,
> > > > > without
> > > > > making them joinable.
> > > > > 
> > > > > However, since libmultipath will never be reinitialized after
> > > > > it
> > > > > has
> > > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > > thread is
> > > > > started has minimal cost (keeping the DSO code around until
> > > > > the
> > > > > program
> > > > > exits, which usually happens right after freeing the
> > > > > checkers).
> > > > 
> > > > I'm not against this, but have you considered using an
> > > > atomic  refcount
> > > > for the DSO? With every tur thread starting, we could increase
> > > > it,
> > > > and
> > > > decrease it in the cleanup function of the thread when it
> > > > exits.
> > > > That
> > > > should be safe. If the refcount was positive when we exit, we
> > > > could
> > > > refrain from unloading the DSO.
> > > 
> > > I tried exactly that, and I would still get crashes, even if it
> > > put
> > > the
> > > code that decrements the atomic variable in a function that's not
> > > part
> > > of the DSO, and put a pthread_exit() before the final
> > > pthread_cleanup_pop() that would decrement it in tur_thread, so
> > > that
> > > after the cleanup code is called the thread should never return
> > > to
> > > code
> > > that is in the DSO. I had to add sleeps in code to force various
> > > orderings, but I couldn't find any way that worked for all
> > > possible
> > > orderings.  I would love it if this worked, and you're free to
> > > try,
> > > but
> > > I could not get this method to work.
> > 
> > I've experimented a bit with a trivial test program, and I found
> > that
> > it worked stably if decrementing the refcount is really the last
> > thing
> > thread's cleanup function does. Could you provide some details
> > about
> > the sleeps that you'd put in that made this approach fail?
> 
> I believe the situation that continued to crash was where the
> tur_thread() exitted naturally (so it set running to 0), although I'm
> not sure that this is necessary, or if it would still crash when
> running
> the cleanup function because of a cancel.  I put the cleanup function
> to
> decrement the count in libmultipath, so that it wasn't part of the
> DSO,
> and then I put a sleep(5) as the last line of the cleanup function,
> and
> a sleep(10) as the last line of cleanup_checkers(). I also had to set
> running to 0 before starting the thread, and then take out the code
> to
> pause the thread if running was aleady 0, to make sure the thread
> acted
> like it was the one to set running to 0. Then the idea is to stop
> multipathd while there is a thread in its sleep period, so that
> multipathd sees that the counter is 0, and closes the dso, and then
> the thread finishes before multipathd shuts the rest of the way
> down. 

I guess the key is that the thread's entry point must also be in
libmultipath (i.e. outside the DSO). In pseudo-code:

entrypoint() {
   refcount++;
   pthread_cleanup_push(refcount--);
   tur_thread(ct);
   pthread_cleanup_pop(1);
}

This way the thread can't be in DSO code any more when refcount goes to
zero.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-11-04 23:56           ` Martin Wilck
@ 2020-11-05  0:41             ` Benjamin Marzinski
  2020-11-05 11:53               ` Martin Wilck
  0 siblings, 1 reply; 25+ messages in thread
From: Benjamin Marzinski @ 2020-11-05  0:41 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, Nov 04, 2020 at 11:56:07PM +0000, Martin Wilck wrote:
> On Wed, 2020-11-04 at 17:27 -0600, Benjamin Marzinski wrote:
> > On Wed, Nov 04, 2020 at 10:39:39PM +0000, Martin Wilck wrote:
> > > On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > > > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > > > The multipathd tur checker thread is designed to be able to
> > > > > > finish at
> > > > > > any time, even after the tur checker itself has been freed.
> > > > > > The
> > > > > > multipathd shutdown code makes sure all the checkers have
> > > > > > been
> > > > > > freed
> > > > > > before freeing the checker_class and calling dlclose() to
> > > > > > unload
> > > > > > the
> > > > > > DSO, but this doesn't guarantee that the checker threads have
> > > > > > finished.
> > > > > > If one hasn't, the DSO will get unloaded while the thread
> > > > > > still
> > > > > > running
> > > > > > code from it, causing a segfault. Unfortunately, it's not
> > > > > > possible to
> > > > > > be
> > > > > > sure that all tur checker threads have ended during shutdown,
> > > > > > without
> > > > > > making them joinable.
> > > > > > 
> > > > > > However, since libmultipath will never be reinitialized after
> > > > > > it
> > > > > > has
> > > > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > > > thread is
> > > > > > started has minimal cost (keeping the DSO code around until
> > > > > > the
> > > > > > program
> > > > > > exits, which usually happens right after freeing the
> > > > > > checkers).
> > > > > 
> > > > > I'm not against this, but have you considered using an
> > > > > atomic  refcount
> > > > > for the DSO? With every tur thread starting, we could increase
> > > > > it,
> > > > > and
> > > > > decrease it in the cleanup function of the thread when it
> > > > > exits.
> > > > > That
> > > > > should be safe. If the refcount was positive when we exit, we
> > > > > could
> > > > > refrain from unloading the DSO.
> > > > 
> > > > I tried exactly that, and I would still get crashes, even if it
> > > > put
> > > > the
> > > > code that decrements the atomic variable in a function that's not
> > > > part
> > > > of the DSO, and put a pthread_exit() before the final
> > > > pthread_cleanup_pop() that would decrement it in tur_thread, so
> > > > that
> > > > after the cleanup code is called the thread should never return
> > > > to
> > > > code
> > > > that is in the DSO. I had to add sleeps in code to force various
> > > > orderings, but I couldn't find any way that worked for all
> > > > possible
> > > > orderings.  I would love it if this worked, and you're free to
> > > > try,
> > > > but
> > > > I could not get this method to work.
> > > 
> > > I've experimented a bit with a trivial test program, and I found
> > > that
> > > it worked stably if decrementing the refcount is really the last
> > > thing
> > > thread's cleanup function does. Could you provide some details
> > > about
> > > the sleeps that you'd put in that made this approach fail?
> > 
> > I believe the situation that continued to crash was where the
> > tur_thread() exitted naturally (so it set running to 0), although I'm
> > not sure that this is necessary, or if it would still crash when
> > running
> > the cleanup function because of a cancel.  I put the cleanup function
> > to
> > decrement the count in libmultipath, so that it wasn't part of the
> > DSO,
> > and then I put a sleep(5) as the last line of the cleanup function,
> > and
> > a sleep(10) as the last line of cleanup_checkers(). I also had to set
> > running to 0 before starting the thread, and then take out the code
> > to
> > pause the thread if running was aleady 0, to make sure the thread
> > acted
> > like it was the one to set running to 0. Then the idea is to stop
> > multipathd while there is a thread in its sleep period, so that
> > multipathd sees that the counter is 0, and closes the dso, and then
> > the thread finishes before multipathd shuts the rest of the way
> > down. 
> 
> I guess the key is that the thread's entry point must also be in
> libmultipath (i.e. outside the DSO). In pseudo-code:
> 
> entrypoint() {
>    refcount++;
>    pthread_cleanup_push(refcount--);
>    tur_thread(ct);
>    pthread_cleanup_pop(1);
> }
> 
> This way the thread can't be in DSO code any more when refcount goes to
> zero.

Oh! I didn't think of solving it that way, but it makes sense. So, were
you planning on posting a patch?

-Ben


> 
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO
  2020-11-05  0:41             ` Benjamin Marzinski
@ 2020-11-05 11:53               ` Martin Wilck
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Wilck @ 2020-11-05 11:53 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel

On Wed, 2020-11-04 at 18:41 -0600, Benjamin Marzinski wrote:
> On Wed, Nov 04, 2020 at 11:56:07PM +0000, Martin Wilck wrote:
> > 
> > I guess the key is that the thread's entry point must also be in
> > libmultipath (i.e. outside the DSO). In pseudo-code:
> > 
> > entrypoint() {
> >    refcount++;
> >    pthread_cleanup_push(refcount--);
> >    tur_thread(ct);
> >    pthread_cleanup_pop(1);
> > }
> > 
> > This way the thread can't be in DSO code any more when refcount
> > goes to
> > zero.
> 
> Oh! I didn't think of solving it that way, but it makes sense. So,
> were
> you planning on posting a patch?

I just did ("libmultipath: prevent DSO unloading with astray checker
threads"). Please have a look and possibly test it using the setup
that failed for you before.

Regards
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE  Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix
Imendörffer



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-11-05 11:54 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-23 21:15 [dm-devel] [PATCH 0/5] Misc Multipath patches Benjamin Marzinski
2020-10-23 21:15 ` [dm-devel] [PATCH 1/5] libmultipath: move fast_io_fail defines to structs.h Benjamin Marzinski
2020-10-30 20:43   ` Martin Wilck
2020-10-23 21:15 ` [dm-devel] [PATCH 2/5] libmultipath: add eh_deadline multipath.conf parameter Benjamin Marzinski
2020-10-30 21:00   ` Martin Wilck
2020-11-03  3:05     ` Benjamin Marzinski
2020-10-23 21:15 ` [dm-devel] [PATCH 3/5] multipathd: remove redundant vector_free() int configure Benjamin Marzinski
2020-10-30 21:01   ` Martin Wilck
2020-10-23 21:15 ` [dm-devel] [PATCH 4/5] libmultipath: only read 0xc9 vpd page for devices with rdac checker Benjamin Marzinski
2020-10-26 15:52   ` Schremmer, Steven
2020-10-30 21:12   ` Martin Wilck
2020-11-03  1:11     ` Benjamin Marzinski
2020-11-03  5:09       ` Benjamin Marzinski
2020-11-03  8:59       ` Martin Wilck
2020-11-03 23:15         ` Schremmer, Steven
2020-10-23 21:15 ` [dm-devel] [PATCH 5/5] libmultipath: don't dlclose tur checker DSO Benjamin Marzinski
2020-10-30 21:15   ` Martin Wilck
2020-11-02 18:27     ` Benjamin Marzinski
2020-11-04 22:39       ` Martin Wilck
2020-11-04 23:27         ` Benjamin Marzinski
2020-11-04 23:56           ` Martin Wilck
2020-11-05  0:41             ` Benjamin Marzinski
2020-11-05 11:53               ` Martin Wilck
2020-11-04 23:34         ` Martin Wilck
2020-11-03 18:48     ` Benjamin Marzinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).