DM-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 00/23] libmultipath: improve cleanup on exit
@ 2020-09-24 13:40 mwilck
  2020-09-24 13:40 ` [PATCH 01/23] multipathd: uxlsnr: avoid deadlock " mwilck
                   ` (23 more replies)
  0 siblings, 24 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Hi Christophe, hi Ben, hi lixiaokeng,

this series was inspired by lixiaokeng's recent posting "[QUESTION] memory
leak in main (multipath)". It implements my first idea, registering
cleanup handlers with atexit(). However it turned out to be quite
complex. In particular multipathd has a lot of things to clean up.

This series is based on the previous series "multipath-tools: shutdown, 
libdevmapper races, globals".

While the bulk of the series is the cleanup handling, it also contains
some bug fixes for issues that I found while working on this.

Regards
Martin

Martin Wilck (23):
  multipathd: uxlsnr: avoid deadlock on exit
  multipathd: Fix liburcu memory leak
  multipathd: move handling of io_err_stat_attr into libmultipath
  multipathd: move vecs desctruction into cleanup function
  multipathd: make some globals static
  multipathd: move threads destruction into separate function
  multipathd: move conf destruction into separate function
  multipathd: move pid destruction into separate function
  multipathd: close pidfile on exit
  multipathd: add helper for systemd notification at exit
  multipathd: child(): call cleanups in failure case, too
  multipathd: unwatch_all_dmevents: check if waiter is initialized
  multipathd: print error message if config can't be loaded
  libmultipath: add libmp_dm_exit()
  multipathd: fixup libdm deinitialization
  libmultipath: log_thread_stop(): check if logarea is initialized
  multipathd: add cleanup_child() exit handler
  libmultipath: fix log_thread startup and teardown
  multipathd: move cleanup_{prio,checkers,foreign} to libmultipath_exit
  multipath: use atexit() for cleanup handlers
  mpathpersist: use atexit() for cleanup handlers
  multipath: fix leaks in check_path_valid()
  multipath-tools: mpath-tools.supp: file with valgrind suppressions

 libmpathpersist/mpath_persist.c       |   2 -
 libmultipath/config.c                 |   5 +
 libmultipath/config.h                 |   2 +
 libmultipath/devmapper.c              |  10 +
 libmultipath/devmapper.h              |   1 +
 libmultipath/io_err_stat.c            |   7 +-
 libmultipath/libmultipath.version     |  26 +--
 libmultipath/log_pthread.c            |  60 +++--
 mpathpersist/main.c                   |   5 +-
 multipath/main.c                      |  94 +++++---
 multipathd/dmevents.c                 |   2 +
 multipathd/main.c                     | 305 +++++++++++++++++---------
 multipathd/uxlsnr.c                   |  17 +-
 third-party/valgrind/mpath-tools.supp |  33 +++
 14 files changed, 389 insertions(+), 180 deletions(-)
 create mode 100644 third-party/valgrind/mpath-tools.supp

-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 01/23] multipathd: uxlsnr: avoid deadlock on exit
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-26  1:52   ` Benjamin Marzinski
  2020-09-24 13:40 ` [PATCH 02/23] multipathd: Fix liburcu memory leak mwilck
                   ` (22 subsequent siblings)
  23 siblings, 1 reply; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

The uxlsnr wouldn't always release the client lock when cancelled,
causing a deadlock in uxsock_cleanup(). While this hasn't been
caused by commit 3d611a2, the deadlock seems to have become much
more likely after that patch. Solving this means that we have to
treat reallocation failure of the pollfd array differently.
We will now just ignore any clients above the last valid pfd index.
That's a minor problem, as we're in an OOM situation anyway.

Moreover, client_lock is not a "struct lock", but a plain
pthread_mutex_t.

Fixes: 3d611a2 ("multipathd: cancel threads early during shutdown")
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/uxlsnr.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
index 1c5ce9d..d47ba1a 100644
--- a/multipathd/uxlsnr.c
+++ b/multipathd/uxlsnr.c
@@ -35,6 +35,7 @@
 #include "config.h"
 #include "mpath_cmd.h"
 #include "time-util.h"
+#include "util.h"
 
 #include "main.h"
 #include "cli.h"
@@ -116,7 +117,7 @@ static void _dead_client(struct client *c)
 
 static void dead_client(struct client *c)
 {
-	pthread_cleanup_push(cleanup_lock, &client_lock);
+	pthread_cleanup_push(cleanup_mutex, &client_lock);
 	pthread_mutex_lock(&client_lock);
 	_dead_client(c);
 	pthread_cleanup_pop(1);
@@ -306,6 +307,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 
 		/* setup for a poll */
 		pthread_mutex_lock(&client_lock);
+		pthread_cleanup_push(cleanup_mutex, &client_lock);
 		num_clients = 0;
 		list_for_each_entry(c, &clients, node) {
 			num_clients++;
@@ -322,14 +324,13 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 						sizeof(struct pollfd));
 			}
 			if (!new) {
-				pthread_mutex_unlock(&client_lock);
 				condlog(0, "%s: failed to realloc %d poll fds",
 					"uxsock", 2 + num_clients);
-				sched_yield();
-				continue;
+				num_clients = old_clients;
+			} else {
+				old_clients = num_clients;
+				polls = new;
 			}
-			old_clients = num_clients;
-			polls = new;
 		}
 		polls[0].fd = ux_sock;
 		polls[0].events = POLLIN;
@@ -347,8 +348,10 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
 			polls[i].fd = c->fd;
 			polls[i].events = POLLIN;
 			i++;
+			if (i >= 2 + num_clients)
+				break;
 		}
-		pthread_mutex_unlock(&client_lock);
+		pthread_cleanup_pop(1);
 
 		/* most of our life is spent in this call */
 		poll_count = ppoll(polls, i, &sleep_time, &mask);
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 02/23] multipathd: Fix liburcu memory leak
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
  2020-09-24 13:40 ` [PATCH 01/23] multipathd: uxlsnr: avoid deadlock " mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 03/23] multipathd: move handling of io_err_stat_attr into libmultipath mwilck
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Fix this leak in multipathd, reported by valgrind, that messes up
multipathd's otherwise clean leak report:

==23823== 336 bytes in 1 blocks are possibly lost in loss record 3 of 3
==23823==    at 0x483AB65: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==23823==    by 0x4012F16: _dl_allocate_tls (in /lib64/ld-2.31.so)
==23823==    by 0x493BB8E: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.31.so)
==23823==    by 0x492A9A9: call_rcu_data_init (urcu-call-rcu-impl.h:437)
==23823==    by 0x492AD2F: UnknownInlinedFun (urcu-call-rcu-impl.h:492)
==23823==    by 0x492AD2F: create_call_rcu_data_memb (urcu-call-rcu-impl.h:504)
==23823==    by 0x1164E3: child.constprop.0.isra.0 (main.c:2915)
==23823==    by 0x10F50C: main (main.c:3335)
==23823==
==23823== LEAK SUMMARY:
==23823==    definitely lost: 0 bytes in 0 blocks
==23823==    indirectly lost: 0 bytes in 0 blocks
==23823==      possibly lost: 336 bytes in 1 blocks

The problem is caused by using liburcu's default RCU call handler,
which liburcu refuses to stop/join. See comments in the code.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index c5c374b..be1b5ae 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2889,6 +2889,47 @@ set_oom_adj (void)
 	condlog(0, "couldn't adjust oom score");
 }
 
+/*
+ * Use a non-default call_rcu_data for child().
+ *
+ * We do this to avoid a memory leak from liburcu.
+ * liburcu never frees the default rcu handler (see comments on
+ * call_rcu_data_free() in urcu-call-rcu-impl.h), its thread
+ * can't be joined with pthread_join(), leaving a memory leak.
+ *
+ * Therefore we create our own, which can be destroyed and joined.
+ */
+static struct call_rcu_data *setup_rcu(void)
+{
+	struct call_rcu_data *crdp;
+
+	rcu_init();
+	rcu_register_thread();
+	crdp = create_call_rcu_data(0UL, -1);
+	if (crdp != NULL)
+		set_thread_call_rcu_data(crdp);
+	return crdp;
+}
+
+static void cleanup_rcu(int dummy __attribute__((unused)), void *arg)
+{
+	struct call_rcu_data *crdp = arg;
+	pthread_t rcu_thread;
+
+	/* Wait for any pending RCU calls */
+	rcu_barrier();
+	if (crdp != NULL) {
+		rcu_thread = get_call_rcu_thread(crdp);
+		/* detach this thread from the RCU thread */
+		set_thread_call_rcu_data(NULL);
+		synchronize_rcu();
+		/* tell RCU thread to exit */
+		call_rcu_data_free(crdp);
+		pthread_join(rcu_thread, NULL);
+	}
+	rcu_unregister_thread();
+}
+
 static int
 child (__attribute__((unused)) void *param)
 {
@@ -2903,10 +2944,12 @@ child (__attribute__((unused)) void *param)
 	char *envp;
 	int queue_without_daemon;
 	enum daemon_status state;
+	struct call_rcu_data *crdp;
 
 	mlockall(MCL_CURRENT | MCL_FUTURE);
 	signal_init();
-	rcu_init();
+	crdp = setup_rcu();
+	on_exit(cleanup_rcu, crdp);
 
 	setup_thread_attr(&misc_attr, 64 * 1024, 0);
 	setup_thread_attr(&uevent_attr, DEFAULT_UEVENT_STACKSIZE * 1024, 0);
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 03/23] multipathd: move handling of io_err_stat_attr into libmultipath
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
  2020-09-24 13:40 ` [PATCH 01/23] multipathd: uxlsnr: avoid deadlock " mwilck
  2020-09-24 13:40 ` [PATCH 02/23] multipathd: Fix liburcu memory leak mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 04/23] multipathd: move vecs desctruction into cleanup function mwilck
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This thread attribute can be dynamically initialized and destroyed.
No need to carry it along in multipathd. Removal of the symbol
requires a new ABI version.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/io_err_stat.c        |  7 +++++--
 libmultipath/libmultipath.version | 23 ++++++++---------------
 multipathd/main.c                 |  2 --
 3 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 58bc1dd..5363049 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -34,6 +34,7 @@
 #include "lock.h"
 #include "time-util.h"
 #include "io_err_stat.h"
+#include "util.h"
 
 #define TIMEOUT_NO_IO_NSEC		10000000 /*10ms = 10000000ns*/
 #define FLAKY_PATHFAIL_THRESHOLD	2
@@ -70,8 +71,7 @@ struct io_err_stat_path {
 	int		err_rate_threshold;
 };
 
-pthread_t		io_err_stat_thr;
-pthread_attr_t		io_err_stat_attr;
+static pthread_t	io_err_stat_thr;
 
 static pthread_mutex_t io_err_thread_lock = PTHREAD_MUTEX_INITIALIZER;
 static pthread_cond_t io_err_thread_cond = PTHREAD_COND_INITIALIZER;
@@ -727,6 +727,7 @@ static void *io_err_stat_loop(void *data)
 int start_io_err_stat_thread(void *data)
 {
 	int ret;
+	pthread_attr_t io_err_stat_attr;
 
 	if (uatomic_read(&io_err_thread_running) == 1)
 		return 0;
@@ -739,6 +740,7 @@ int start_io_err_stat_thread(void *data)
 	if (!paths)
 		goto destroy_ctx;
 
+	setup_thread_attr(&io_err_stat_attr, 32 * 1024, 0);
 	pthread_mutex_lock(&io_err_thread_lock);
 	pthread_cleanup_push(cleanup_unlock, &io_err_thread_lock);
 
@@ -750,6 +752,7 @@ int start_io_err_stat_thread(void *data)
 				 &io_err_thread_lock) == 0);
 
 	pthread_cleanup_pop(1);
+	pthread_attr_destroy(&io_err_stat_attr);
 
 	if (ret) {
 		io_err_stat_log(0, "cannot create io_error statistic thread");
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 2e531ef..9abdb22 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -1,4 +1,4 @@
-LIBMULTIPATH_0.8.4.1 {
+LIBMULTIPATH_0.8.4.5 {
 global:
 	/* symbols referenced by multipath and multipathd */
 	add_foreign;
@@ -88,7 +88,6 @@ global:
 	init_checkers;
 	init_foreign;
 	init_prio;
-	io_err_stat_attr;
 	io_err_stat_handle_pathfail;
 	is_path_valid;
 	is_quote;
@@ -209,30 +208,24 @@ global:
 	free_scandir_result;
 	sysfs_attr_get_value;
 
-local:
-	*;
-};
-
-LIBMULTIPATH_0.8.4.2 {
-global:
+	/* added in 0.8.4.2 */
 	libmp_dm_task_run;
 	cleanup_mutex;
-} LIBMULTIPATH_0.8.4.1;
 
-LIBMULTIPATH_0.8.4.3 {
-global:
+	/* added in 0.8.4.3 */
 	libmp_get_multipath_config;
 	get_multipath_config;
 	libmp_put_multipath_config;
 	put_multipath_config;
 	init_config;
 	uninit_config;
-} LIBMULTIPATH_0.8.4.2;
 
-LIBMULTIPATH_0.8.4.4 {
-global:
+	/* added in 0.8.4.4 */
 	udev;
 	logsink;
 	libmultipath_init;
 	libmultipath_exit;
-} LIBMULTIPATH_0.8.4.3;
+
+local:
+	*;
+};
diff --git a/multipathd/main.c b/multipathd/main.c
index be1b5ae..4d714e8 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2954,7 +2954,6 @@ child (__attribute__((unused)) void *param)
 	setup_thread_attr(&misc_attr, 64 * 1024, 0);
 	setup_thread_attr(&uevent_attr, DEFAULT_UEVENT_STACKSIZE * 1024, 0);
 	setup_thread_attr(&waiter_attr, 32 * 1024, 1);
-	setup_thread_attr(&io_err_stat_attr, 32 * 1024, 0);
 
 	if (logsink == 1) {
 		setup_thread_attr(&log_attr, 64 * 1024, 0);
@@ -3164,7 +3163,6 @@ child (__attribute__((unused)) void *param)
 	rcu_assign_pointer(multipath_conf, NULL);
 	call_rcu(&conf->rcu, rcu_free_config);
 	pthread_attr_destroy(&waiter_attr);
-	pthread_attr_destroy(&io_err_stat_attr);
 #ifdef _DEBUG_
 	dbg_free_final(NULL);
 #endif
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 04/23] multipathd: move vecs desctruction into cleanup function
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (2 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 03/23] multipathd: move handling of io_err_stat_attr into libmultipath mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 05/23] multipathd: make some globals static mwilck
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This will make it easer to move the stuff around later.
The only functional change is that map destuction now happens after
joining all threads, which should actually improve robustness.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 64 +++++++++++++++++++++++++++++------------------
 1 file changed, 40 insertions(+), 24 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 4d714e8..2642570 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -148,7 +148,7 @@ int should_exit(void)
 /*
  * global copy of vecs for use in sig handlers
  */
-struct vectors * gvecs;
+static struct vectors * gvecs;
 
 struct config *multipath_conf;
 
@@ -2889,6 +2889,44 @@ set_oom_adj (void)
 	condlog(0, "couldn't adjust oom score");
 }
 
+static void cleanup_maps(struct vectors *vecs)
+{
+	int queue_without_daemon, i;
+	struct multipath *mpp;
+	struct config *conf;
+
+	conf = get_multipath_config();
+	queue_without_daemon = conf->queue_without_daemon;
+	put_multipath_config(conf);
+	if (queue_without_daemon == QUE_NO_DAEMON_OFF)
+		vector_foreach_slot(vecs->mpvec, mpp, i)
+			dm_queue_if_no_path(mpp->alias, 0);
+	remove_maps_and_stop_waiters(vecs);
+	vecs->mpvec = NULL;
+}
+
+static void cleanup_paths(struct vectors *vecs)
+{
+	free_pathvec(vecs->pathvec, FREE_PATHS);
+	vecs->pathvec = NULL;
+}
+
+static void cleanup_vecs(void)
+{
+	if (!gvecs)
+		return;
+	/*
+	 * We can't take the vecs lock here, because exit() may
+	 * have been called from the child() thread, holding the lock already.
+	 * Anyway, by the time we get here, all threads that might access
+	 * vecs should have been joined already (in cleanup_threads).
+	 */
+	cleanup_maps(gvecs);
+	cleanup_paths(gvecs);
+	pthread_mutex_destroy(&gvecs->lock.mutex);
+	FREE(gvecs);
+}
+
 /*
  * Use a non-default call_rcu_data for child().
  *
@@ -2936,13 +2974,10 @@ child (__attribute__((unused)) void *param)
 	pthread_t check_thr, uevent_thr, uxlsnr_thr, uevq_thr, dmevent_thr;
 	pthread_attr_t log_attr, misc_attr, uevent_attr;
 	struct vectors * vecs;
-	struct multipath * mpp;
-	int i;
 	int rc;
 	int pid_fd = -1;
 	struct config *conf;
 	char *envp;
-	int queue_without_daemon;
 	enum daemon_status state;
 	struct call_rcu_data *crdp;
 
@@ -3108,17 +3143,6 @@ child (__attribute__((unused)) void *param)
 	if (poll_dmevents)
 		pthread_cancel(dmevent_thr);
 
-	conf = get_multipath_config();
-	queue_without_daemon = conf->queue_without_daemon;
-	put_multipath_config(conf);
-
-	lock(&vecs->lock);
-	if (queue_without_daemon == QUE_NO_DAEMON_OFF)
-		vector_foreach_slot(vecs->mpvec, mpp, i)
-			dm_queue_if_no_path(mpp->alias, 0);
-	remove_maps_and_stop_waiters(vecs);
-	unlock(&vecs->lock);
-
 	pthread_join(check_thr, NULL);
 	pthread_join(uevent_thr, NULL);
 	pthread_join(uxlsnr_thr, NULL);
@@ -3128,15 +3152,7 @@ child (__attribute__((unused)) void *param)
 
 	stop_io_err_stat_thread();
 
-	lock(&vecs->lock);
-	free_pathvec(vecs->pathvec, FREE_PATHS);
-	vecs->pathvec = NULL;
-	unlock(&vecs->lock);
-
-	pthread_mutex_destroy(&vecs->lock.mutex);
-	FREE(vecs);
-	vecs = NULL;
-
+	cleanup_vecs();
 	cleanup_foreign();
 	cleanup_checkers();
 	cleanup_prio();
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 05/23] multipathd: make some globals static
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (3 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 04/23] multipathd: move vecs desctruction into cleanup function mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 06/23] multipathd: move threads destruction into separate function mwilck
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 2642570..1cdbff1 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -116,19 +116,19 @@ struct mpath_event_param
 };
 
 int uxsock_timeout;
-int verbosity;
-int bindings_read_only;
+static int verbosity;
+static int bindings_read_only;
 int ignore_new_devs;
 #ifdef NO_DMEVENTS_POLL
-int poll_dmevents = 0;
+static int poll_dmevents = 0;
 #else
-int poll_dmevents = 1;
+static int poll_dmevents = 1;
 #endif
 /* Don't access this variable without holding config_lock */
-volatile enum daemon_status running_state = DAEMON_INIT;
+static volatile enum daemon_status running_state = DAEMON_INIT;
 pid_t daemon_pid;
-pthread_mutex_t config_lock = PTHREAD_MUTEX_INITIALIZER;
-pthread_cond_t config_cond;
+static pthread_mutex_t config_lock = PTHREAD_MUTEX_INITIALIZER;
+static pthread_cond_t config_cond;
 
 static inline enum daemon_status get_running_state(void)
 {
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 06/23] multipathd: move threads destruction into separate function
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (4 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 05/23] multipathd: make some globals static mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 07/23] multipathd: move conf " mwilck
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Also, introduce booleans that indicate a certain thread has
been started successfully. Using these booleans, we can avoid
crashing by cancelling threads that have never been started.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 76 +++++++++++++++++++++++++++++++----------------
 1 file changed, 51 insertions(+), 25 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 1cdbff1..8b9df55 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -129,6 +129,9 @@ static volatile enum daemon_status running_state = DAEMON_INIT;
 pid_t daemon_pid;
 static pthread_mutex_t config_lock = PTHREAD_MUTEX_INITIALIZER;
 static pthread_cond_t config_cond;
+static pthread_t check_thr, uevent_thr, uxlsnr_thr, uevq_thr, dmevent_thr;
+static bool check_thr_started, uevent_thr_started, uxlsnr_thr_started,
+	uevq_thr_started, dmevent_thr_started;
 
 static inline enum daemon_status get_running_state(void)
 {
@@ -2927,6 +2930,39 @@ static void cleanup_vecs(void)
 	FREE(gvecs);
 }
 
+static void cleanup_threads(void)
+{
+	stop_io_err_stat_thread();
+
+	if (check_thr_started)
+		pthread_cancel(check_thr);
+	if (uevent_thr_started)
+		pthread_cancel(uevent_thr);
+	if (uxlsnr_thr_started)
+		pthread_cancel(uxlsnr_thr);
+	if (uevq_thr_started)
+		pthread_cancel(uevq_thr);
+	if (dmevent_thr_started)
+		pthread_cancel(dmevent_thr);
+
+	if (check_thr_started)
+		pthread_join(check_thr, NULL);
+	if (uevent_thr_started)
+		pthread_join(uevent_thr, NULL);
+	if (uxlsnr_thr_started)
+		pthread_join(uxlsnr_thr, NULL);
+	if (uevq_thr_started)
+		pthread_join(uevq_thr, NULL);
+	if (dmevent_thr_started)
+		pthread_join(dmevent_thr, NULL);
+
+	/*
+	 * As all threads are joined now, and we're in DAEMON_SHUTDOWN
+	 * state, no new waiter threads will be created any more.
+	 */
+	pthread_attr_destroy(&waiter_attr);
+}
+
 /*
  * Use a non-default call_rcu_data for child().
  *
@@ -2971,7 +3007,6 @@ static void cleanup_rcu(int dummy __attribute__((unused)), void *arg)
 static int
 child (__attribute__((unused)) void *param)
 {
-	pthread_t check_thr, uevent_thr, uxlsnr_thr, uevq_thr, dmevent_thr;
 	pthread_attr_t log_attr, misc_attr, uevent_attr;
 	struct vectors * vecs;
 	int rc;
@@ -3070,9 +3105,12 @@ child (__attribute__((unused)) void *param)
 		condlog(0, "failed to create cli listener: %d", rc);
 		goto failed;
 	}
-	else if (state != DAEMON_CONFIGURE) {
-		condlog(0, "cli listener failed to start");
-		goto failed;
+	else {
+		uxlsnr_thr_started = true;
+		if (state != DAEMON_CONFIGURE) {
+			condlog(0, "cli listener failed to start");
+			goto failed;
+		}
 	}
 
 	if (poll_dmevents) {
@@ -3085,7 +3123,8 @@ child (__attribute__((unused)) void *param)
 			condlog(0, "failed to create dmevent waiter thread: %d",
 				rc);
 			goto failed;
-		}
+		} else
+			dmevent_thr_started = true;
 	}
 
 	/*
@@ -3094,7 +3133,8 @@ child (__attribute__((unused)) void *param)
 	if ((rc = pthread_create(&uevent_thr, &uevent_attr, ueventloop, udev))) {
 		condlog(0, "failed to create uevent thread: %d", rc);
 		goto failed;
-	}
+	} else
+		uevent_thr_started = true;
 	pthread_attr_destroy(&uevent_attr);
 
 	/*
@@ -3103,11 +3143,13 @@ child (__attribute__((unused)) void *param)
 	if ((rc = pthread_create(&check_thr, &misc_attr, checkerloop, vecs))) {
 		condlog(0,"failed to create checker loop thread: %d", rc);
 		goto failed;
-	}
+	} else
+		check_thr_started = true;
 	if ((rc = pthread_create(&uevq_thr, &misc_attr, uevqloop, vecs))) {
 		condlog(0, "failed to create uevent dispatcher: %d", rc);
 		goto failed;
-	}
+	} else
+		uevq_thr_started = true;
 	pthread_attr_destroy(&misc_attr);
 
 	while (1) {
@@ -3136,22 +3178,7 @@ child (__attribute__((unused)) void *param)
 		}
 	}
 
-	pthread_cancel(check_thr);
-	pthread_cancel(uevent_thr);
-	pthread_cancel(uxlsnr_thr);
-	pthread_cancel(uevq_thr);
-	if (poll_dmevents)
-		pthread_cancel(dmevent_thr);
-
-	pthread_join(check_thr, NULL);
-	pthread_join(uevent_thr, NULL);
-	pthread_join(uxlsnr_thr, NULL);
-	pthread_join(uevq_thr, NULL);
-	if (poll_dmevents)
-		pthread_join(dmevent_thr, NULL);
-
-	stop_io_err_stat_thread();
-
+	cleanup_threads();
 	cleanup_vecs();
 	cleanup_foreign();
 	cleanup_checkers();
@@ -3178,7 +3205,6 @@ child (__attribute__((unused)) void *param)
 	conf = rcu_dereference(multipath_conf);
 	rcu_assign_pointer(multipath_conf, NULL);
 	call_rcu(&conf->rcu, rcu_free_config);
-	pthread_attr_destroy(&waiter_attr);
 #ifdef _DEBUG_
 	dbg_free_final(NULL);
 #endif
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 07/23] multipathd: move conf destruction into separate function
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (5 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 06/23] multipathd: move threads destruction into separate function mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-28 18:03   ` Benjamin Marzinski
  2020-09-24 13:40 ` [PATCH 08/23] multipathd: move pid " mwilck
                   ` (16 subsequent siblings)
  23 siblings, 1 reply; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Also removing the comment about dlog() and dm_write_log().
dlog() can cope with get_multipath_config() returning NULL,
and dm_write_log() hasn't accessed the configuration for a while.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 8b9df55..4d5b40b 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2892,6 +2892,16 @@ set_oom_adj (void)
 	condlog(0, "couldn't adjust oom score");
 }
 
+static void cleanup_conf(void) {
+	struct config *conf;
+
+	conf = rcu_dereference(multipath_conf);
+	if (!conf)
+		return;
+	rcu_assign_pointer(multipath_conf, NULL);
+	call_rcu(&conf->rcu, rcu_free_config);
+}
+
 static void cleanup_maps(struct vectors *vecs)
 {
 	int queue_without_daemon, i;
@@ -3194,17 +3204,11 @@ child (__attribute__((unused)) void *param)
 
 	condlog(2, "--------shut down-------");
 
-	if (logsink == 1)
+	if (logsink == 1) {
+		logsink = 0;
 		log_thread_stop();
-
-	/*
-	 * Freeing config must be done after condlog() and dm_lib_exit(),
-	 * because logging functions like dlog() and dm_write_log()
-	 * reference the config.
-	 */
-	conf = rcu_dereference(multipath_conf);
-	rcu_assign_pointer(multipath_conf, NULL);
-	call_rcu(&conf->rcu, rcu_free_config);
+	}
+	cleanup_conf();
 #ifdef _DEBUG_
 	dbg_free_final(NULL);
 #endif
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 08/23] multipathd: move pid destruction into separate function
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (6 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 07/23] multipathd: move conf " mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 09/23] multipathd: close pidfile on exit mwilck
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 4d5b40b..3db051b 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -2892,6 +2892,12 @@ set_oom_adj (void)
 	condlog(0, "couldn't adjust oom score");
 }
 
+static void cleanup_pidfile(void)
+{
+	condlog(3, "unlink pidfile");
+	unlink(DEFAULT_PIDFILE);
+}
+
 static void cleanup_conf(void) {
 	struct config *conf;
 
@@ -3199,9 +3205,7 @@ child (__attribute__((unused)) void *param)
 	dm_lib_exit();
 
 	/* We're done here */
-	condlog(3, "unlink pidfile");
-	unlink(DEFAULT_PIDFILE);
-
+	cleanup_pidfile();
 	condlog(2, "--------shut down-------");
 
 	if (logsink == 1) {
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 09/23] multipathd: close pidfile on exit
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (7 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 08/23] multipathd: move pid " mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 10/23] multipathd: add helper for systemd notification at exit mwilck
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

It seems we've been doing this only in the failure case, for ages.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 3db051b..44c1bd1 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -132,6 +132,7 @@ static pthread_cond_t config_cond;
 static pthread_t check_thr, uevent_thr, uxlsnr_thr, uevq_thr, dmevent_thr;
 static bool check_thr_started, uevent_thr_started, uxlsnr_thr_started,
 	uevq_thr_started, dmevent_thr_started;
+static int pid_fd = -1;
 
 static inline enum daemon_status get_running_state(void)
 {
@@ -2894,6 +2895,8 @@ set_oom_adj (void)
 
 static void cleanup_pidfile(void)
 {
+	if (pid_fd >= 0)
+		close(pid_fd);
 	condlog(3, "unlink pidfile");
 	unlink(DEFAULT_PIDFILE);
 }
@@ -3026,7 +3029,6 @@ child (__attribute__((unused)) void *param)
 	pthread_attr_t log_attr, misc_attr, uevent_attr;
 	struct vectors * vecs;
 	int rc;
-	int pid_fd = -1;
 	struct config *conf;
 	char *envp;
 	enum daemon_status state;
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 10/23] multipathd: add helper for systemd notification at exit
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (8 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 09/23] multipathd: close pidfile on exit mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 11/23] multipathd: child(): call cleanups in failure case, too mwilck
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Add sd_notify_exit().

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 44c1bd1..e742aa5 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3023,6 +3023,17 @@ static void cleanup_rcu(int dummy __attribute__((unused)), void *arg)
 	rcu_unregister_thread();
 }
 
+static int sd_notify_exit(int err)
+{
+#ifdef USE_SYSTEMD
+	char msg[24];
+
+	snprintf(msg, sizeof(msg), "ERRNO=%d", err);
+	sd_notify(0, msg);
+#endif
+	return err;
+}
+
 static int
 child (__attribute__((unused)) void *param)
 {
@@ -3218,19 +3229,12 @@ child (__attribute__((unused)) void *param)
 #ifdef _DEBUG_
 	dbg_free_final(NULL);
 #endif
-
-#ifdef USE_SYSTEMD
-	sd_notify(0, "ERRNO=0");
-#endif
-	exit(0);
+	exit(sd_notify_exit(0));
 
 failed:
-#ifdef USE_SYSTEMD
-	sd_notify(0, "ERRNO=1");
-#endif
 	if (pid_fd >= 0)
 		close(pid_fd);
-	exit(1);
+	exit(sd_notify_exit(1));
 }
 
 static int
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 11/23] multipathd: child(): call cleanups in failure case, too
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (9 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 10/23] multipathd: add helper for systemd notification at exit mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 12/23] multipathd: unwatch_all_dmevents: check if waiter is initialized mwilck
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

So far we haven't called any cleanup code if child() failed.
Fix it.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index e742aa5..3c8f893 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3044,6 +3044,7 @@ child (__attribute__((unused)) void *param)
 	char *envp;
 	enum daemon_status state;
 	struct call_rcu_data *crdp;
+	int exit_code = 1;
 
 	mlockall(MCL_CURRENT | MCL_FUTURE);
 	signal_init();
@@ -3207,6 +3208,8 @@ child (__attribute__((unused)) void *param)
 		}
 	}
 
+	exit_code = 0;
+failed:
 	cleanup_threads();
 	cleanup_vecs();
 	cleanup_foreign();
@@ -3229,12 +3232,7 @@ child (__attribute__((unused)) void *param)
 #ifdef _DEBUG_
 	dbg_free_final(NULL);
 #endif
-	exit(sd_notify_exit(0));
-
-failed:
-	if (pid_fd >= 0)
-		close(pid_fd);
-	exit(sd_notify_exit(1));
+	return sd_notify_exit(exit_code);
 }
 
 static int
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 12/23] multipathd: unwatch_all_dmevents: check if waiter is initialized
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (10 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 11/23] multipathd: child(): call cleanups in failure case, too mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 13/23] multipathd: print error message if config can't be loaded mwilck
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/dmevents.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/multipathd/dmevents.c b/multipathd/dmevents.c
index b561cbf..f52f597 100644
--- a/multipathd/dmevents.c
+++ b/multipathd/dmevents.c
@@ -257,6 +257,8 @@ void unwatch_all_dmevents(void)
 	struct dev_event *dev_evt;
 	int i;
 
+	if (!waiter)
+		return;
 	pthread_mutex_lock(&waiter->events_lock);
 	vector_foreach_slot(waiter->events, dev_evt, i)
 		free(dev_evt);
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 13/23] multipathd: print error message if config can't be loaded
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (11 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 12/23] multipathd: unwatch_all_dmevents: check if waiter is initialized mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 14/23] libmultipath: add libmp_dm_exit() mwilck
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 3c8f893..483b3c7 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3074,8 +3074,10 @@ child (__attribute__((unused)) void *param)
 	condlog(2, "read " DEFAULT_CONFIGFILE);
 
 	conf = load_config(DEFAULT_CONFIGFILE);
-	if (!conf)
+	if (!conf) {
+		condlog(0, "failed to load configuration");
 		goto failed;
+	}
 
 	if (verbosity)
 		conf->verbosity = verbosity;
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 14/23] libmultipath: add libmp_dm_exit()
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (12 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 13/23] multipathd: print error message if config can't be loaded mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-28 18:41   ` Benjamin Marzinski
  2020-09-24 13:40 ` [PATCH 15/23] multipathd: fixup libdm deinitialization mwilck
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This function prepares for calling dm_lib_exit() on program exit.
It undoes changes to libdm internals done by libmultipath.
It doesn't call dm_lib_exit(), as the caller may want to keep
libdm active.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/config.c    |  1 +
 libmultipath/config.h    |  2 ++
 libmultipath/devmapper.c | 10 ++++++++++
 libmultipath/devmapper.h |  1 +
 4 files changed, 14 insertions(+)

diff --git a/libmultipath/config.c b/libmultipath/config.c
index fbb66b3..8097838 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -65,6 +65,7 @@ int libmultipath_init(void)
 static void _libmultipath_exit(void)
 {
 	libmultipath_exit_called = true;
+	libmp_dm_exit();
 	udev_unref(udev);
 }
 
diff --git a/libmultipath/config.h b/libmultipath/config.h
index dac4e8f..fc614f0 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -270,6 +270,8 @@ int libmultipath_init(void);
  *
  * This function un-initializes libmultipath data structures.
  * It is recommended to call this function at program exit.
+ * If the application also calls dm_lib_exit(), it should do so
+ * after libmultipath_exit().
  *
  * Calls to libmultipath_init() after libmultipath_exit() will fail
  * (in other words, libmultipath can't be re-initialized).
diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
index 2e3ae8a..5a2e1e7 100644
--- a/libmultipath/devmapper.c
+++ b/libmultipath/devmapper.c
@@ -333,6 +333,16 @@ void libmp_udev_set_sync_support(int on)
 	libmp_dm_udev_sync = !!on;
 }
 
+void libmp_dm_exit(void)
+{
+	/* switch back to default libdm logging */
+	dm_log_init(NULL);
+#ifdef LIBDM_API_HOLD_CONTROL
+	/* make sure control fd is closed in dm_lib_release() */
+	dm_hold_control_dev(0);
+#endif
+}
+
 static void libmp_dm_init(void)
 {
 	struct config *conf;
diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
index 158057e..ecf2e66 100644
--- a/libmultipath/devmapper.h
+++ b/libmultipath/devmapper.h
@@ -35,6 +35,7 @@ enum {
 
 int dm_prereq(unsigned int *v);
 void skip_libmp_dm_init(void);
+void libmp_dm_exit(void);
 void libmp_udev_set_sync_support(int on);
 struct dm_task *libmp_dm_task_create(int task);
 int dm_simplecmd_flush (int, const char *, uint16_t);
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 15/23] multipathd: fixup libdm deinitialization
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (13 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 14/23] libmultipath: add libmp_dm_exit() mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 16/23] libmultipath: log_thread_stop(): check if logarea is initialized mwilck
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

With libmp_dm_exit() in place, we can make sure that the
calls are made in the right order.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 483b3c7..f9749e5 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3220,8 +3220,6 @@ failed:
 	if (poll_dmevents)
 		cleanup_dmevent_waiter();
 
-	dm_lib_exit();
-
 	/* We're done here */
 	cleanup_pidfile();
 	condlog(2, "--------shut down-------");
@@ -3320,6 +3318,9 @@ main (int argc, char *argv[])
 
 	pthread_cond_init_mono(&config_cond);
 
+	if (atexit(dm_lib_exit))
+		condlog(3, "failed to register exit handler for libdm");
+
 	libmultipath_init();
 	if (atexit(libmultipath_exit))
 		condlog(3, "failed to register exit handler for libmultipath");
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 16/23] libmultipath: log_thread_stop(): check if logarea is initialized
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (14 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 15/23] multipathd: fixup libdm deinitialization mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 17/23] multipathd: add cleanup_child() exit handler mwilck
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/log_pthread.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libmultipath/log_pthread.c b/libmultipath/log_pthread.c
index 15baef8..0c327ff 100644
--- a/libmultipath/log_pthread.c
+++ b/libmultipath/log_pthread.c
@@ -112,6 +112,9 @@ void log_thread_reset (void)
 
 void log_thread_stop (void)
 {
+	if (!la)
+		return;
+
 	logdbg(stderr,"enter log_thread_stop\n");
 
 	pthread_mutex_lock(&logev_lock);
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 17/23] multipathd: add cleanup_child() exit handler
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (15 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 16/23] libmultipath: log_thread_stop(): check if logarea is initialized mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 18/23] libmultipath: fix log_thread startup and teardown mwilck
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

cleanup_child() calls all cleanups in the right order, in an
exit handler.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipathd/main.c | 48 +++++++++++++++++++++++++++--------------------
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index f9749e5..e7b479a 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3023,6 +3023,28 @@ static void cleanup_rcu(int dummy __attribute__((unused)), void *arg)
 	rcu_unregister_thread();
 }
 
+static void cleanup_child(void)
+{
+	cleanup_threads();
+	cleanup_vecs();
+	cleanup_foreign();
+	cleanup_checkers();
+	cleanup_prio();
+	if (poll_dmevents)
+		cleanup_dmevent_waiter();
+
+	cleanup_pidfile();
+	if (logsink == 1) {
+		logsink = 0;
+		log_thread_stop();
+	}
+	cleanup_conf();
+
+#ifdef _DEBUG_
+	dbg_free_final(NULL);
+#endif
+}
+
 static int sd_notify_exit(int err)
 {
 #ifdef USE_SYSTEMD
@@ -3049,7 +3071,11 @@ child (__attribute__((unused)) void *param)
 	mlockall(MCL_CURRENT | MCL_FUTURE);
 	signal_init();
 	crdp = setup_rcu();
-	on_exit(cleanup_rcu, crdp);
+
+	if (on_exit(cleanup_rcu, crdp) ||
+	    atexit(cleanup_child)) {
+		fprintf(stderr, "failed to register cleanup handlers\n");
+	}
 
 	setup_thread_attr(&misc_attr, 64 * 1024, 0);
 	setup_thread_attr(&uevent_attr, DEFAULT_UEVENT_STACKSIZE * 1024, 0);
@@ -3212,26 +3238,8 @@ child (__attribute__((unused)) void *param)
 
 	exit_code = 0;
 failed:
-	cleanup_threads();
-	cleanup_vecs();
-	cleanup_foreign();
-	cleanup_checkers();
-	cleanup_prio();
-	if (poll_dmevents)
-		cleanup_dmevent_waiter();
-
-	/* We're done here */
-	cleanup_pidfile();
 	condlog(2, "--------shut down-------");
-
-	if (logsink == 1) {
-		logsink = 0;
-		log_thread_stop();
-	}
-	cleanup_conf();
-#ifdef _DEBUG_
-	dbg_free_final(NULL);
-#endif
+	/* All cleanup is done in the cleanup_child() exit handler */
 	return sd_notify_exit(exit_code);
 }
 
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 18/23] libmultipath: fix log_thread startup and teardown
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (16 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 17/23] multipathd: add cleanup_child() exit handler mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-28 20:15   ` Benjamin Marzinski
  2020-09-24 13:40 ` [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit mwilck
                   ` (5 subsequent siblings)
  23 siblings, 1 reply; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

This fixes several issues with the log_thread. First, the running
flag logq_running should be set by the thread itself, not by
log_thread_start()/_stop(). Second, the thread was both cancelled and
terminated via a flag (again, logq_running). It's sufficient,
and better, to just cancel it and use logq_running as indication for
successful termination. Third, the locking wasn't cancel-safe in some
places. Forth, log_thread_start() and log_thread_stop() didn't wait for
startup/teardown properly. Fifth, using (pthread_t)0 is wrong (pthread_t is
opaque; there's no guarantee that 0 is not a valid pthread_t value). Finally,
pthread_cancel() was called under logq_lock, which doesn't make sense to
me at all.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmultipath/log_pthread.c | 59 ++++++++++++++++++++++++++++----------
 1 file changed, 44 insertions(+), 15 deletions(-)

diff --git a/libmultipath/log_pthread.c b/libmultipath/log_pthread.c
index 0c327ff..5825e66 100644
--- a/libmultipath/log_pthread.c
+++ b/libmultipath/log_pthread.c
@@ -13,6 +13,7 @@
 #include "log_pthread.h"
 #include "log.h"
 #include "lock.h"
+#include "util.h"
 
 static pthread_t log_thr;
 
@@ -56,28 +57,45 @@ static void flush_logqueue (void)
 	} while (empty == 0);
 }
 
+static void cleanup_log_thread(__attribute((unused)) void *arg)
+{
+	logdbg(stderr, "log thread exiting");
+	pthread_mutex_lock(&logev_lock);
+	logq_running = 0;
+	pthread_cond_signal(&logev_cond);
+	pthread_mutex_unlock(&logev_lock);
+}
+
 static void * log_thread (__attribute__((unused)) void * et)
 {
 	int running;
 
 	pthread_mutex_lock(&logev_lock);
-	logq_running = 1;
+	running = logq_running;
+	if (!running)
+		logq_running = 1;
+	pthread_cond_signal(&logev_cond);
 	pthread_mutex_unlock(&logev_lock);
+	if (running)
+		/* already started */
+		return NULL;
+	pthread_cleanup_push(cleanup_log_thread, NULL);
 
 	mlockall(MCL_CURRENT | MCL_FUTURE);
 	logdbg(stderr,"enter log_thread\n");
 
 	while (1) {
 		pthread_mutex_lock(&logev_lock);
-		if (logq_running && !log_messages_pending)
+		pthread_cleanup_push(cleanup_mutex, &logev_lock);
+		while (!log_messages_pending)
+			/* this is a cancellation point */
 			pthread_cond_wait(&logev_cond, &logev_lock);
 		log_messages_pending = 0;
-		running = logq_running;
-		pthread_mutex_unlock(&logev_lock);
-		if (!running)
-			break;
+		pthread_cleanup_pop(1);
+
 		flush_logqueue();
 	}
+	pthread_cleanup_pop(1);
 	return NULL;
 }
 
@@ -98,6 +116,12 @@ void log_thread_start (pthread_attr_t *attr)
 		exit(1);
 	}
 
+	pthread_mutex_lock(&logev_lock);
+	pthread_cleanup_push(cleanup_mutex, &logev_lock);
+	/* wait for thread startup */
+	while (!logq_running)
+		pthread_cond_wait(&logev_cond, &logev_lock);
+	pthread_cleanup_pop(1);
 	return;
 }
 
@@ -112,21 +136,26 @@ void log_thread_reset (void)
 
 void log_thread_stop (void)
 {
+	int running;
+
 	if (!la)
 		return;
-
 	logdbg(stderr,"enter log_thread_stop\n");
 
 	pthread_mutex_lock(&logev_lock);
-	logq_running = 0;
-	pthread_cond_signal(&logev_cond);
-	pthread_mutex_unlock(&logev_lock);
+	pthread_cleanup_push(cleanup_mutex, &logev_lock);
+	running = logq_running;
+	if (running) {
+		pthread_cancel(log_thr);
+		pthread_cond_signal(&logev_cond);
+	}
+	while (logq_running)
+		pthread_cond_wait(&logev_cond, &logev_lock);
+	pthread_cleanup_pop(1);
+
+	if (running)
+		pthread_join(log_thr, NULL);
 
-	pthread_mutex_lock(&logq_lock);
-	pthread_cancel(log_thr);
-	pthread_mutex_unlock(&logq_lock);
-	pthread_join(log_thr, NULL);
-	log_thr = (pthread_t)0;
 
 	flush_logqueue();
 
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (17 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 18/23] libmultipath: fix log_thread startup and teardown mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-28 20:26   ` Benjamin Marzinski
  2020-09-24 13:40 ` [PATCH 20/23] multipath: use atexit() for cleanup handlers mwilck
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 libmpathpersist/mpath_persist.c   | 2 --
 libmultipath/config.c             | 4 ++++
 libmultipath/libmultipath.version | 5 +----
 multipathd/main.c                 | 3 ---
 4 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
index 873b419..af56a95 100644
--- a/libmpathpersist/mpath_persist.c
+++ b/libmpathpersist/mpath_persist.c
@@ -78,8 +78,6 @@ mpath_lib_init (void)
 
 static void libmpathpersist_cleanup(void)
 {
-	cleanup_prio();
-	cleanup_checkers();
 	libmultipath_exit();
 	dm_lib_exit();
 }
diff --git a/libmultipath/config.c b/libmultipath/config.c
index 8097838..f115ac2 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -26,6 +26,7 @@
 #include "devmapper.h"
 #include "mpath_cmd.h"
 #include "propsel.h"
+#include "foreign.h"
 
 /*
  * We don't support re-initialization after
@@ -65,6 +66,9 @@ int libmultipath_init(void)
 static void _libmultipath_exit(void)
 {
 	libmultipath_exit_called = true;
+	cleanup_foreign();
+	cleanup_checkers();
+	cleanup_prio();
 	libmp_dm_exit();
 	udev_unref(udev);
 }
diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
index 9abdb22..80f1950 100644
--- a/libmultipath/libmultipath.version
+++ b/libmultipath/libmultipath.version
@@ -1,4 +1,4 @@
-LIBMULTIPATH_0.8.4.5 {
+LIBMULTIPATH_0.8.4.6 {
 global:
 	/* symbols referenced by multipath and multipathd */
 	add_foreign;
@@ -18,10 +18,7 @@ global:
 	checker_name;
 	checker_state_name;
 	check_foreign;
-	cleanup_checkers;
-	cleanup_foreign;
 	cleanup_lock;
-	cleanup_prio;
 	close_fd;
 	coalesce_paths;
 	convert_dev;
diff --git a/multipathd/main.c b/multipathd/main.c
index e7b479a..efed56e 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -3027,9 +3027,6 @@ static void cleanup_child(void)
 {
 	cleanup_threads();
 	cleanup_vecs();
-	cleanup_foreign();
-	cleanup_checkers();
-	cleanup_prio();
 	if (poll_dmevents)
 		cleanup_dmevent_waiter();
 
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 20/23] multipath: use atexit() for cleanup handlers
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (18 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 21/23] mpathpersist: " mwilck
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipath/main.c | 39 ++++++++++++++++++---------------------
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 9ae46ed..049a36f 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -452,13 +452,21 @@ static bool released_to_systemd(void)
 	return ret;
 }
 
+static void cleanup_vecs(__attribute__((unused)) int dummy, void *arg)
+{
+	struct vectors *vecs = arg;
+
+	free_multipathvec(vecs->mpvec, KEEP_PATHS);
+	free_pathvec(vecs->pathvec, FREE_PATHS);
+}
+
 static int
 configure (struct config *conf, enum mpath_cmds cmd,
 	   enum devtypes dev_type, char *devpath)
 {
 	vector curmp = NULL;
 	vector pathvec = NULL;
-	struct vectors vecs;
+	static struct vectors vecs;
 	int r = RTVL_FAIL, rc;
 	int di_flag = 0;
 	char * refwwid = NULL;
@@ -469,6 +477,7 @@ configure (struct config *conf, enum mpath_cmds cmd,
 	 */
 	curmp = vector_alloc();
 	pathvec = vector_alloc();
+	on_exit(cleanup_vecs, &vecs);
 
 	if (!curmp || !pathvec) {
 		condlog(0, "can not allocate memory");
@@ -580,9 +589,6 @@ out:
 	if (refwwid)
 		FREE(refwwid);
 
-	free_multipathvec(curmp, KEEP_PATHS);
-	free_pathvec(pathvec, FREE_PATHS);
-
 	return r;
 }
 
@@ -808,9 +814,13 @@ main (int argc, char *argv[])
 	bool enable_foreign = false;
 
 	libmultipath_init();
+	if (atexit(dm_lib_exit) || atexit(libmultipath_exit))
+		condlog(1, "failed to register cleanup handler for libmultipath: %m");
 	logsink = 0;
 	if (init_config(DEFAULT_CONFIGFILE))
 		exit(RTVL_FAIL);
+	if (atexit(uninit_config))
+		condlog(1, "failed to register cleanup handler for config: %m");
 	conf = get_multipath_config();
 	conf->retrigger_tries = 0;
 	conf->force_sync = 1;
@@ -887,7 +897,7 @@ main (int argc, char *argv[])
 			break;
 		case 't':
 			r = dump_config(conf, NULL, NULL) ? RTVL_FAIL : RTVL_OK;
-			goto out_free_config;
+			goto out;
 		case 'T':
 			cmd = CMD_DUMP_CONFIG;
 			break;
@@ -1048,26 +1058,13 @@ main (int argc, char *argv[])
 		condlog(3, "restart multipath configuration process");
 
 out:
-	dm_lib_exit();
-
-	cleanup_foreign();
-	cleanup_prio();
-	cleanup_checkers();
+	put_multipath_config(conf);
+	if (dev)
+		FREE(dev);
 
 	if (dev_type == DEV_UEVENT)
 		closelog();
 
-out_free_config:
-	/*
-	 * Freeing config must be done after dm_lib_exit(), because
-	 * the logging function (dm_write_log()), which is called there,
-	 * references the config.
-	 */
-	put_multipath_config(conf);
-	uninit_config();
-	libmultipath_exit();
-	if (dev)
-		FREE(dev);
 #ifdef _DEBUG_
 	dbg_free_final(NULL);
 #endif
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 21/23] mpathpersist: use atexit() for cleanup handlers
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (19 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 20/23] multipath: use atexit() for cleanup handlers mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 22/23] multipath: fix leaks in check_path_valid() mwilck
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 mpathpersist/main.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mpathpersist/main.c b/mpathpersist/main.c
index 3c2e657..14245cc 100644
--- a/mpathpersist/main.c
+++ b/mpathpersist/main.c
@@ -641,11 +641,10 @@ int main(int argc, char *argv[])
 	if (libmpathpersist_init()) {
 		exit(1);
 	}
+	if (atexit((void(*)(void))libmpathpersist_exit))
+		fprintf(stderr, "failed to register cleanup handler for libmpathpersist: %m");
 
 	ret = handle_args(argc, argv, 0);
-
-	libmpathpersist_exit();
-
 	return (ret >= 0) ? ret : MPATH_PR_OTHER;
 }
 
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 22/23] multipath: fix leaks in check_path_valid()
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (20 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 21/23] mpathpersist: " mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 13:40 ` [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions mwilck
  2020-09-28 20:51 ` [PATCH 00/23] libmultipath: improve cleanup on exit Benjamin Marzinski
  23 siblings, 0 replies; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

There were two leaks in check_path_valid(): if path status was
successfully determined before calling store_pathvec(), free_path()
wasn't called. Also, if an error exit occured, neither cleanup
function was called.

This patch fixes both, at the cost of using "static" for the pp and
pathvec variables.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 multipath/main.c | 55 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 43 insertions(+), 12 deletions(-)

diff --git a/multipath/main.c b/multipath/main.c
index 049a36f..9974993 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -93,7 +93,7 @@ void rcu_register_thread_memb(void) {}
 void rcu_unregister_thread_memb(void) {}
 
 static int
-filter_pathvec (vector pathvec, char * refwwid)
+filter_pathvec (vector pathvec, const char *refwwid)
 {
 	int i;
 	struct path * pp;
@@ -592,12 +592,37 @@ out:
 	return r;
 }
 
+static void cleanup_pathvec(__attribute__((unused)) int dummy, void *arg)
+{
+	vector *ppv = arg;
+
+	if (ppv && *ppv) {
+		free_pathvec(*ppv, FREE_PATHS);
+		*ppv = NULL;
+	}
+}
+
+static void cleanup_path(__attribute__((unused)) int dummy, void *arg)
+{
+	struct path **ppp = arg;
+
+	if (ppp && *ppp) {
+		free_path(*ppp);
+		*ppp = NULL;
+	}
+}
+
 static int
 check_path_valid(const char *name, struct config *conf, bool is_uevent)
 {
 	int fd, r = PATH_IS_ERROR;
-	struct path *pp = NULL;
-	vector pathvec = NULL;
+	static struct path *pp = NULL;
+	static vector pathvec = NULL;
+	const char *wwid;
+
+	/* register these as exit handlers in case we exit irregularly */
+	on_exit(cleanup_path, &pp);
+	on_exit(cleanup_pathvec, &pathvec);
 
 	pp = alloc_path();
 	if (!pp)
@@ -667,13 +692,17 @@ check_path_valid(const char *name, struct config *conf, bool is_uevent)
 	if (store_path(pathvec, pp) != 0) {
 		free_path(pp);
 		goto fail;
+	} else {
+		/* make sure path isn't freed twice */
+		wwid = pp->wwid;
+		pp = NULL;
 	}
 
 	/* For find_multipaths = SMART, if there is more than one path
 	 * matching the refwwid, then the path is valid */
 	if (path_discovery(pathvec, DI_SYSFS | DI_WWID) < 0)
 		goto fail;
-	filter_pathvec(pathvec, pp->wwid);
+	filter_pathvec(pathvec, wwid);
 	if (VECTOR_SIZE(pathvec) > 1)
 		r = PATH_IS_VALID;
 	else
@@ -681,21 +710,23 @@ check_path_valid(const char *name, struct config *conf, bool is_uevent)
 
 out:
 	r = print_cmd_valid(r, pathvec, conf);
-	free_pathvec(pathvec, FREE_PATHS);
 	/*
 	 * multipath -u must exit with status 0, otherwise udev won't
 	 * import its output.
 	 */
 	if (!is_uevent && r == PATH_IS_NOT_VALID)
-		return RTVL_FAIL;
-	return RTVL_OK;
+		r = RTVL_FAIL;
+	else
+		r = RTVL_OK;
+	goto cleanup;
 
 fail:
-	if (pathvec)
-		free_pathvec(pathvec, FREE_PATHS);
-	else
-		free_path(pp);
-	return RTVL_FAIL;
+	r = RTVL_FAIL;
+
+cleanup:
+	cleanup_path(0, &pp);
+	cleanup_pathvec(0, &pathvec);
+	return r;
 }
 
 static int
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (21 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 22/23] multipath: fix leaks in check_path_valid() mwilck
@ 2020-09-24 13:40 ` mwilck
  2020-09-24 20:58   ` Benjamin Marzinski
  2020-09-28 20:51 ` [PATCH 00/23] libmultipath: improve cleanup on exit Benjamin Marzinski
  23 siblings, 1 reply; 37+ messages in thread
From: mwilck @ 2020-09-24 13:40 UTC (permalink / raw)
  To: Christophe Varoqui, Benjamin Marzinski; +Cc: lixiaokeng, dm-devel, Martin Wilck

From: Martin Wilck <mwilck@suse.com>

These leaks are caused by other libraries (libsystemd, glibc,
libgcrypt) and should be ignored when debugging with valgrind

Usage example:

valgrind --suppressions=mpath-tools.supp \
    --leak-check=full --show-leak-kinds=all $COMMAND

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 third-party/valgrind/mpath-tools.supp | 33 +++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
 create mode 100644 third-party/valgrind/mpath-tools.supp

diff --git a/third-party/valgrind/mpath-tools.supp b/third-party/valgrind/mpath-tools.supp
new file mode 100644
index 0000000..621463a
--- /dev/null
+++ b/third-party/valgrind/mpath-tools.supp
@@ -0,0 +1,33 @@
+{
+   glibc _dlerror_run leak: https://stackoverflow.com/questions/1542457/memory-leak-reported-by-valgrind-in-dlopen
+   Memcheck:Leak
+   match-leak-kinds: reachable
+   fun:calloc
+   fun:_dlerror_run
+   fun:dlopen*
+}
+
+{
+   systemd mempools are never freed: https://bugzilla.redhat.com/show_bug.cgi?id=1215670
+   Memcheck:Leak
+   match-leak-kinds: reachable
+   fun:malloc
+   fun:mempool_alloc_tile
+   fun:mempool_alloc0_tile
+   fun:hashmap_base_new
+   fun:hashmap_base_ensure_allocated
+}
+
+{
+   libgcrypt library initialization
+   Memcheck:Leak
+   match-leak-kinds: reachable
+   fun:malloc
+   ...
+   fun:_gcry_xmalloc
+   ...
+   fun:global_init.*
+   ...
+   fun:_dl_init
+}
+
-- 
2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions
  2020-09-24 13:40 ` [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions mwilck
@ 2020-09-24 20:58   ` Benjamin Marzinski
  2020-09-25  9:53     ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-24 20:58 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:54PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 

Git complains about the extra blank line at the end of the file

third-party/valgrind/mpath-tools.supp:33: new blank line at EOF.

-Ben

> These leaks are caused by other libraries (libsystemd, glibc,
> libgcrypt) and should be ignored when debugging with valgrind
> 
> Usage example:
> 
> valgrind --suppressions=mpath-tools.supp \
>     --leak-check=full --show-leak-kinds=all $COMMAND
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  third-party/valgrind/mpath-tools.supp | 33 +++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>  create mode 100644 third-party/valgrind/mpath-tools.supp
> 
> diff --git a/third-party/valgrind/mpath-tools.supp b/third-party/valgrind/mpath-tools.supp
> new file mode 100644
> index 0000000..621463a
> --- /dev/null
> +++ b/third-party/valgrind/mpath-tools.supp
> @@ -0,0 +1,33 @@
> +{
> +   glibc _dlerror_run leak: https://stackoverflow.com/questions/1542457/memory-leak-reported-by-valgrind-in-dlopen
> +   Memcheck:Leak
> +   match-leak-kinds: reachable
> +   fun:calloc
> +   fun:_dlerror_run
> +   fun:dlopen*
> +}
> +
> +{
> +   systemd mempools are never freed: https://bugzilla.redhat.com/show_bug.cgi?id=1215670
> +   Memcheck:Leak
> +   match-leak-kinds: reachable
> +   fun:malloc
> +   fun:mempool_alloc_tile
> +   fun:mempool_alloc0_tile
> +   fun:hashmap_base_new
> +   fun:hashmap_base_ensure_allocated
> +}
> +
> +{
> +   libgcrypt library initialization
> +   Memcheck:Leak
> +   match-leak-kinds: reachable
> +   fun:malloc
> +   ...
> +   fun:_gcry_xmalloc
> +   ...
> +   fun:global_init.*
> +   ...
> +   fun:_dl_init
> +}
> +
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions
  2020-09-24 20:58   ` Benjamin Marzinski
@ 2020-09-25  9:53     ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-09-25  9:53 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: lixiaokeng, dm-devel

On Thu, 2020-09-24 at 15:58 -0500, Benjamin Marzinski wrote:
> On Thu, Sep 24, 2020 at 03:40:54PM +0200, mwilck@suse.com wrote:
> > From: Martin Wilck <mwilck@suse.com>
> > 
> 
> Git complains about the extra blank line at the end of the file

I'm sorry. At least this one will be easy to fix :-)

Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/23] multipathd: uxlsnr: avoid deadlock on exit
  2020-09-24 13:40 ` [PATCH 01/23] multipathd: uxlsnr: avoid deadlock " mwilck
@ 2020-09-26  1:52   ` Benjamin Marzinski
  2020-09-26 13:00     ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-26  1:52 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:32PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> The uxlsnr wouldn't always release the client lock when cancelled,
> causing a deadlock in uxsock_cleanup(). While this hasn't been
> caused by commit 3d611a2, the deadlock seems to have become much
> more likely after that patch. Solving this means that we have to
> treat reallocation failure of the pollfd array differently.
> We will now just ignore any clients above the last valid pfd index.
> That's a minor problem, as we're in an OOM situation anyway.
> 
> Moreover, client_lock is not a "struct lock", but a plain
> pthread_mutex_t.
> 
> Fixes: 3d611a2 ("multipathd: cancel threads early during shutdown")
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipathd/uxlsnr.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/multipathd/uxlsnr.c b/multipathd/uxlsnr.c
> index 1c5ce9d..d47ba1a 100644
> --- a/multipathd/uxlsnr.c
> +++ b/multipathd/uxlsnr.c
> @@ -35,6 +35,7 @@
>  #include "config.h"
>  #include "mpath_cmd.h"
>  #include "time-util.h"
> +#include "util.h"
>  
>  #include "main.h"
>  #include "cli.h"
> @@ -116,7 +117,7 @@ static void _dead_client(struct client *c)
>  
>  static void dead_client(struct client *c)
>  {
> -	pthread_cleanup_push(cleanup_lock, &client_lock);
> +	pthread_cleanup_push(cleanup_mutex, &client_lock);
>  	pthread_mutex_lock(&client_lock);
>  	_dead_client(c);
>  	pthread_cleanup_pop(1);
> @@ -306,6 +307,7 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
>  
>  		/* setup for a poll */
>  		pthread_mutex_lock(&client_lock);
> +		pthread_cleanup_push(cleanup_mutex, &client_lock);
>  		num_clients = 0;
>  		list_for_each_entry(c, &clients, node) {
>  			num_clients++;
> @@ -322,14 +324,13 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
>  						sizeof(struct pollfd));
>  			}
>  			if (!new) {
> -				pthread_mutex_unlock(&client_lock);
>  				condlog(0, "%s: failed to realloc %d poll fds",
>  					"uxsock", 2 + num_clients);
> -				sched_yield();
> -				continue;
> +				num_clients = old_clients;

O.k. I'm getting way into the theoretical weeds here, but I believe that
realloc() is technically allowed to return NULL when it shrinks
allocated memory. In this case num_clients would be too big.  Later in
this function, when we loop through num_clients

                for (i = 2; i < num_clients + 2; i++) {
                        if (polls[i].revents & POLLIN) {
 
We could look at an unused polls entry, since its revents doesn't get
cleared. It's also possible that the fd of this unused entry matches the
fd of an existing client. Then we could try to get a packet from a
client that isn't sending one, and kill that client. Yeah, this will
almost certainly never happen.  But we could just zero out the revents
field, or loop over the actual number of structures we polled, and then
it can't happen.

-Ben

> +			} else {
> +				old_clients = num_clients;
> +				polls = new;
>  			}
> -			old_clients = num_clients;
> -			polls = new;
>  		}
>  		polls[0].fd = ux_sock;
>  		polls[0].events = POLLIN;
> @@ -347,8 +348,10 @@ void * uxsock_listen(uxsock_trigger_fn uxsock_trigger, long ux_sock,
>  			polls[i].fd = c->fd;
>  			polls[i].events = POLLIN;
>  			i++;
> +			if (i >= 2 + num_clients)
> +				break;
>  		}
> -		pthread_mutex_unlock(&client_lock);
> +		pthread_cleanup_pop(1);
>  
>  		/* most of our life is spent in this call */
>  		poll_count = ppoll(polls, i, &sleep_time, &mask);
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/23] multipathd: uxlsnr: avoid deadlock on exit
  2020-09-26  1:52   ` Benjamin Marzinski
@ 2020-09-26 13:00     ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-09-26 13:00 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: lixiaokeng, dm-devel

On Fri, 2020-09-25 at 20:52 -0500, Benjamin Marzinski wrote:
> 
> O.k. I'm getting way into the theoretical weeds here, but I believe
> that
> realloc() is technically allowed to return NULL when it shrinks
> allocated memory. In this case num_clients would be too big.  Later
> in
> this function, when we loop through num_clients
> 
>                 for (i = 2; i < num_clients + 2; i++) {
>                         if (polls[i].revents & POLLIN) {
>  
> We could look at an unused polls entry, since its revents doesn't get
> cleared. It's also possible that the fd of this unused entry matches
> the
> fd of an existing client. Then we could try to get a packet from a
> client that isn't sending one, and kill that client. Yeah, this will
> almost certainly never happen.  But we could just zero out the
> revents
> field, or loop over the actual number of structures we polled, and
> then
> it can't happen.

I'll fix this up in this patch, and add another patch to sanitize this.
calling realloc() when the number of fds shrinks really isn't buying us
much.

Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 07/23] multipathd: move conf destruction into separate function
  2020-09-24 13:40 ` [PATCH 07/23] multipathd: move conf " mwilck
@ 2020-09-28 18:03   ` Benjamin Marzinski
  2020-09-29  9:12     ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-28 18:03 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:38PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Also removing the comment about dlog() and dm_write_log().
> dlog() can cope with get_multipath_config() returning NULL,
> and dm_write_log() hasn't accessed the configuration for a while.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  multipathd/main.c | 24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 8b9df55..4d5b40b 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -2892,6 +2892,16 @@ set_oom_adj (void)
>  	condlog(0, "couldn't adjust oom score");
>  }
>  
> +static void cleanup_conf(void) {
> +	struct config *conf;
> +
> +	conf = rcu_dereference(multipath_conf);
> +	if (!conf)
> +		return;
> +	rcu_assign_pointer(multipath_conf, NULL);
> +	call_rcu(&conf->rcu, rcu_free_config);
> +}
> +
>  static void cleanup_maps(struct vectors *vecs)
>  {
>  	int queue_without_daemon, i;
> @@ -3194,17 +3204,11 @@ child (__attribute__((unused)) void *param)
>  
>  	condlog(2, "--------shut down-------");
>  
> -	if (logsink == 1)
> +	if (logsink == 1) {
> +		logsink = 0;
>  		log_thread_stop();

It seems like log_thread_stop() could just do something like

pthread_t log_thr_save = log_thr;
log_thr = (pthread_t)0;

at the start, and then you would continue to get syslog logging, even
when the log thread stopped. It's racy, but all the other threads
(except the log_thread, obviously) should be stopped. Or am I not
understanding the purpose of doing this?

> -
> -	/*
> -	 * Freeing config must be done after condlog() and dm_lib_exit(),
> -	 * because logging functions like dlog() and dm_write_log()
> -	 * reference the config.
> -	 */
> -	conf = rcu_dereference(multipath_conf);
> -	rcu_assign_pointer(multipath_conf, NULL);
> -	call_rcu(&conf->rcu, rcu_free_config);
> +	}
> +	cleanup_conf();
>  #ifdef _DEBUG_
>  	dbg_free_final(NULL);
>  #endif
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 14/23] libmultipath: add libmp_dm_exit()
  2020-09-24 13:40 ` [PATCH 14/23] libmultipath: add libmp_dm_exit() mwilck
@ 2020-09-28 18:41   ` Benjamin Marzinski
  0 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-28 18:41 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:45PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> This function prepares for calling dm_lib_exit() on program exit.
> It undoes changes to libdm internals done by libmultipath.
> It doesn't call dm_lib_exit(), as the caller may want to keep
> libdm active.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/config.c    |  1 +
>  libmultipath/config.h    |  2 ++
>  libmultipath/devmapper.c | 10 ++++++++++
>  libmultipath/devmapper.h |  1 +
>  4 files changed, 14 insertions(+)
> 
> diff --git a/libmultipath/config.c b/libmultipath/config.c
> index fbb66b3..8097838 100644
> --- a/libmultipath/config.c
> +++ b/libmultipath/config.c
> @@ -65,6 +65,7 @@ int libmultipath_init(void)
>  static void _libmultipath_exit(void)
>  {
>  	libmultipath_exit_called = true;
> +	libmp_dm_exit();
>  	udev_unref(udev);
>  }

What if the caller wanted to control the dm initialization themselves,
and used skip_libmp_dm_init()? We need to track that and not undo
their changes.

-Ben

>  
> diff --git a/libmultipath/config.h b/libmultipath/config.h
> index dac4e8f..fc614f0 100644
> --- a/libmultipath/config.h
> +++ b/libmultipath/config.h
> @@ -270,6 +270,8 @@ int libmultipath_init(void);
>   *
>   * This function un-initializes libmultipath data structures.
>   * It is recommended to call this function at program exit.
> + * If the application also calls dm_lib_exit(), it should do so
> + * after libmultipath_exit().
>   *
>   * Calls to libmultipath_init() after libmultipath_exit() will fail
>   * (in other words, libmultipath can't be re-initialized).
> diff --git a/libmultipath/devmapper.c b/libmultipath/devmapper.c
> index 2e3ae8a..5a2e1e7 100644
> --- a/libmultipath/devmapper.c
> +++ b/libmultipath/devmapper.c
> @@ -333,6 +333,16 @@ void libmp_udev_set_sync_support(int on)
>  	libmp_dm_udev_sync = !!on;
>  }
>  
> +void libmp_dm_exit(void)
> +{
> +	/* switch back to default libdm logging */
> +	dm_log_init(NULL);
> +#ifdef LIBDM_API_HOLD_CONTROL
> +	/* make sure control fd is closed in dm_lib_release() */
> +	dm_hold_control_dev(0);
> +#endif
> +}
> +
>  static void libmp_dm_init(void)
>  {
>  	struct config *conf;
> diff --git a/libmultipath/devmapper.h b/libmultipath/devmapper.h
> index 158057e..ecf2e66 100644
> --- a/libmultipath/devmapper.h
> +++ b/libmultipath/devmapper.h
> @@ -35,6 +35,7 @@ enum {
>  
>  int dm_prereq(unsigned int *v);
>  void skip_libmp_dm_init(void);
> +void libmp_dm_exit(void);
>  void libmp_udev_set_sync_support(int on);
>  struct dm_task *libmp_dm_task_create(int task);
>  int dm_simplecmd_flush (int, const char *, uint16_t);
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 18/23] libmultipath: fix log_thread startup and teardown
  2020-09-24 13:40 ` [PATCH 18/23] libmultipath: fix log_thread startup and teardown mwilck
@ 2020-09-28 20:15   ` Benjamin Marzinski
  2020-09-29  9:18     ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-28 20:15 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:49PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> This fixes several issues with the log_thread. First, the running
> flag logq_running should be set by the thread itself, not by
> log_thread_start()/_stop(). Second, the thread was both cancelled and
> terminated via a flag (again, logq_running). It's sufficient,
> and better, to just cancel it and use logq_running as indication for
> successful termination. Third, the locking wasn't cancel-safe in some
> places. Forth, log_thread_start() and log_thread_stop() didn't wait for
> startup/teardown properly. Fifth, using (pthread_t)0 is wrong (pthread_t is
> opaque; there's no guarantee that 0 is not a valid pthread_t value). Finally,
> pthread_cancel() was called under logq_lock, which doesn't make sense to
> me at all.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/log_pthread.c | 59 ++++++++++++++++++++++++++++----------
>  1 file changed, 44 insertions(+), 15 deletions(-)
> 
> diff --git a/libmultipath/log_pthread.c b/libmultipath/log_pthread.c
> index 0c327ff..5825e66 100644
> --- a/libmultipath/log_pthread.c
> +++ b/libmultipath/log_pthread.c
> @@ -13,6 +13,7 @@
>  #include "log_pthread.h"
>  #include "log.h"
>  #include "lock.h"
> +#include "util.h"
>  
>  static pthread_t log_thr;
>  
> @@ -56,28 +57,45 @@ static void flush_logqueue (void)
>  	} while (empty == 0);
>  }
>  
> +static void cleanup_log_thread(__attribute((unused)) void *arg)
> +{
> +	logdbg(stderr, "log thread exiting");
> +	pthread_mutex_lock(&logev_lock);
> +	logq_running = 0;
> +	pthread_cond_signal(&logev_cond);
> +	pthread_mutex_unlock(&logev_lock);
> +}
> +
>  static void * log_thread (__attribute__((unused)) void * et)
>  {
>  	int running;
>  
>  	pthread_mutex_lock(&logev_lock);
> -	logq_running = 1;
> +	running = logq_running;
> +	if (!running)
> +		logq_running = 1;
> +	pthread_cond_signal(&logev_cond);
>  	pthread_mutex_unlock(&logev_lock);
> +	if (running)
> +		/* already started */
> +		return NULL;
> +	pthread_cleanup_push(cleanup_log_thread, NULL);
>  
>  	mlockall(MCL_CURRENT | MCL_FUTURE);
>  	logdbg(stderr,"enter log_thread\n");
>  
>  	while (1) {
>  		pthread_mutex_lock(&logev_lock);
> -		if (logq_running && !log_messages_pending)
> +		pthread_cleanup_push(cleanup_mutex, &logev_lock);
> +		while (!log_messages_pending)
> +			/* this is a cancellation point */
>  			pthread_cond_wait(&logev_cond, &logev_lock);
>  		log_messages_pending = 0;
> -		running = logq_running;
> -		pthread_mutex_unlock(&logev_lock);
> -		if (!running)
> -			break;
> +		pthread_cleanup_pop(1);
> +
>  		flush_logqueue();
>  	}
> +	pthread_cleanup_pop(1);
>  	return NULL;
>  }
>  
> @@ -98,6 +116,12 @@ void log_thread_start (pthread_attr_t *attr)
>  		exit(1);
>  	}
>  
> +	pthread_mutex_lock(&logev_lock);
> +	pthread_cleanup_push(cleanup_mutex, &logev_lock);
> +	/* wait for thread startup */
> +	while (!logq_running)
> +		pthread_cond_wait(&logev_cond, &logev_lock);
> +	pthread_cleanup_pop(1);
>  	return;
>  }
>  
> @@ -112,21 +136,26 @@ void log_thread_reset (void)
>  
>  void log_thread_stop (void)
>  {
> +	int running;
> +
>  	if (!la)
>  		return;
> -
>  	logdbg(stderr,"enter log_thread_stop\n");
>  
>  	pthread_mutex_lock(&logev_lock);
> -	logq_running = 0;
> -	pthread_cond_signal(&logev_cond);
> -	pthread_mutex_unlock(&logev_lock);
> +	pthread_cleanup_push(cleanup_mutex, &logev_lock);
> +	running = logq_running;
> +	if (running) {
> +		pthread_cancel(log_thr);
> +		pthread_cond_signal(&logev_cond);
> +	}
> +	while (logq_running)
> +		pthread_cond_wait(&logev_cond, &logev_lock);

If we're already going to join the thread, why do we have to wait for
the cleanup handler first? Won't the join do the waiting we need?

-Ben

> +	pthread_cleanup_pop(1);
> +
> +	if (running)
> +		pthread_join(log_thr, NULL);
>  
> -	pthread_mutex_lock(&logq_lock);
> -	pthread_cancel(log_thr);
> -	pthread_mutex_unlock(&logq_lock);
> -	pthread_join(log_thr, NULL);
> -	log_thr = (pthread_t)0;
>  
>  	flush_logqueue();
>  
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit
  2020-09-24 13:40 ` [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit mwilck
@ 2020-09-28 20:26   ` Benjamin Marzinski
  2020-09-29  9:31     ` Martin Wilck
  0 siblings, 1 reply; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-28 20:26 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:50PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmpathpersist/mpath_persist.c   | 2 --
>  libmultipath/config.c             | 4 ++++
>  libmultipath/libmultipath.version | 5 +----
>  multipathd/main.c                 | 3 ---
>  4 files changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/libmpathpersist/mpath_persist.c b/libmpathpersist/mpath_persist.c
> index 873b419..af56a95 100644
> --- a/libmpathpersist/mpath_persist.c
> +++ b/libmpathpersist/mpath_persist.c
> @@ -78,8 +78,6 @@ mpath_lib_init (void)
>  
>  static void libmpathpersist_cleanup(void)
>  {
> -	cleanup_prio();
> -	cleanup_checkers();
>  	libmultipath_exit();
>  	dm_lib_exit();
>  }
> diff --git a/libmultipath/config.c b/libmultipath/config.c
> index 8097838..f115ac2 100644
> --- a/libmultipath/config.c
> +++ b/libmultipath/config.c
> @@ -26,6 +26,7 @@
>  #include "devmapper.h"
>  #include "mpath_cmd.h"
>  #include "propsel.h"
> +#include "foreign.h"
>  
>  /*
>   * We don't support re-initialization after
> @@ -65,6 +66,9 @@ int libmultipath_init(void)
>  static void _libmultipath_exit(void)
>  {
>  	libmultipath_exit_called = true;
> +	cleanup_foreign();

I don't really feel too strongly about this, but it seems to me that
there is a difference between the checkers and prioritizers, which
it seems like most users of libmultipath would want, and the foreign
code, which doesn't seem that way. libmpathpersist, for instance,
will use the checkers and prioritizers, but not the foreign code.
On the other hand, if the caller isn't using the foreign code,
then grabbing the lock and checking the foreign pointer shouldn't
take much time.

-Ben

> +	cleanup_checkers();
> +	cleanup_prio();
>  	libmp_dm_exit();
>  	udev_unref(udev);
>  }
> diff --git a/libmultipath/libmultipath.version b/libmultipath/libmultipath.version
> index 9abdb22..80f1950 100644
> --- a/libmultipath/libmultipath.version
> +++ b/libmultipath/libmultipath.version
> @@ -1,4 +1,4 @@
> -LIBMULTIPATH_0.8.4.5 {
> +LIBMULTIPATH_0.8.4.6 {
>  global:
>  	/* symbols referenced by multipath and multipathd */
>  	add_foreign;
> @@ -18,10 +18,7 @@ global:
>  	checker_name;
>  	checker_state_name;
>  	check_foreign;
> -	cleanup_checkers;
> -	cleanup_foreign;
>  	cleanup_lock;
> -	cleanup_prio;
>  	close_fd;
>  	coalesce_paths;
>  	convert_dev;
> diff --git a/multipathd/main.c b/multipathd/main.c
> index e7b479a..efed56e 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -3027,9 +3027,6 @@ static void cleanup_child(void)
>  {
>  	cleanup_threads();
>  	cleanup_vecs();
> -	cleanup_foreign();
> -	cleanup_checkers();
> -	cleanup_prio();
>  	if (poll_dmevents)
>  		cleanup_dmevent_waiter();
>  
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/23] libmultipath: improve cleanup on exit
  2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
                   ` (22 preceding siblings ...)
  2020-09-24 13:40 ` [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions mwilck
@ 2020-09-28 20:51 ` Benjamin Marzinski
  23 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-28 20:51 UTC (permalink / raw)
  To: mwilck; +Cc: lixiaokeng, dm-devel

On Thu, Sep 24, 2020 at 03:40:31PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Hi Christophe, hi Ben, hi lixiaokeng,
> 
> this series was inspired by lixiaokeng's recent posting "[QUESTION] memory
> leak in main (multipath)". It implements my first idea, registering
> cleanup handlers with atexit(). However it turned out to be quite
> complex. In particular multipathd has a lot of things to clean up.
> 
> This series is based on the previous series "multipath-tools: shutdown, 
> libdevmapper races, globals".
> 
> While the bulk of the series is the cleanup handling, it also contains
> some bug fixes for issues that I found while working on this.
> 
> Regards
> Martin
> 
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
For all patches except 1, 7, 14, 18, & 23
> Martin Wilck (23):
>   multipathd: uxlsnr: avoid deadlock on exit
>   multipathd: Fix liburcu memory leak
>   multipathd: move handling of io_err_stat_attr into libmultipath
>   multipathd: move vecs desctruction into cleanup function
>   multipathd: make some globals static
>   multipathd: move threads destruction into separate function
>   multipathd: move conf destruction into separate function
>   multipathd: move pid destruction into separate function
>   multipathd: close pidfile on exit
>   multipathd: add helper for systemd notification at exit
>   multipathd: child(): call cleanups in failure case, too
>   multipathd: unwatch_all_dmevents: check if waiter is initialized
>   multipathd: print error message if config can't be loaded
>   libmultipath: add libmp_dm_exit()
>   multipathd: fixup libdm deinitialization
>   libmultipath: log_thread_stop(): check if logarea is initialized
>   multipathd: add cleanup_child() exit handler
>   libmultipath: fix log_thread startup and teardown
>   multipathd: move cleanup_{prio,checkers,foreign} to libmultipath_exit
>   multipath: use atexit() for cleanup handlers
>   mpathpersist: use atexit() for cleanup handlers
>   multipath: fix leaks in check_path_valid()
>   multipath-tools: mpath-tools.supp: file with valgrind suppressions
> 
>  libmpathpersist/mpath_persist.c       |   2 -
>  libmultipath/config.c                 |   5 +
>  libmultipath/config.h                 |   2 +
>  libmultipath/devmapper.c              |  10 +
>  libmultipath/devmapper.h              |   1 +
>  libmultipath/io_err_stat.c            |   7 +-
>  libmultipath/libmultipath.version     |  26 +--
>  libmultipath/log_pthread.c            |  60 +++--
>  mpathpersist/main.c                   |   5 +-
>  multipath/main.c                      |  94 +++++---
>  multipathd/dmevents.c                 |   2 +
>  multipathd/main.c                     | 305 +++++++++++++++++---------
>  multipathd/uxlsnr.c                   |  17 +-
>  third-party/valgrind/mpath-tools.supp |  33 +++
>  14 files changed, 389 insertions(+), 180 deletions(-)
>  create mode 100644 third-party/valgrind/mpath-tools.supp
> 
> -- 
> 2.28.0

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 07/23] multipathd: move conf destruction into separate function
  2020-09-28 18:03   ` Benjamin Marzinski
@ 2020-09-29  9:12     ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-09-29  9:12 UTC (permalink / raw)
  To: dm-devel

On Mon, 2020-09-28 at 13:03 -0500, Benjamin Marzinski wrote:
>  
> > -	if (logsink == 1)
> > +	if (logsink == 1) {
> > +		logsink = 0;
> >  		log_thread_stop();
> 
> It seems like log_thread_stop() could just do something like
> 
> pthread_t log_thr_save = log_thr;
> log_thr = (pthread_t)0;
> 
> at the start, and then you would continue to get syslog logging, even
> when the log thread stopped. It's racy, but all the other threads
> (except the log_thread, obviously) should be stopped. Or am I not
> understanding the purpose of doing this?

I guess we could do this, yes (although I think (pthread_t)0 should be
avoided, but that's a different issue. Let me have another look.

Regards,
Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 18/23] libmultipath: fix log_thread startup and teardown
  2020-09-28 20:15   ` Benjamin Marzinski
@ 2020-09-29  9:18     ` Martin Wilck
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Wilck @ 2020-09-29  9:18 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: lixiaokeng, dm-devel

On Mon, 2020-09-28 at 15:15 -0500, Benjamin Marzinski wrote:
> On Thu, Sep 24, 2020 at 03:40:49PM +0200, mwilck@suse.com wrote:
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > This fixes several issues with the log_thread. First, the running
> > flag logq_running should be set by the thread itself, not by
> > log_thread_start()/_stop(). Second, the thread was both cancelled
> > and
> > terminated via a flag (again, logq_running). It's sufficient,
> > and better, to just cancel it and use logq_running as indication
> > for
> > successful termination. Third, the locking wasn't cancel-safe in
> > some
> > places. Forth, log_thread_start() and log_thread_stop() didn't wait
> > for
> > startup/teardown properly. Fifth, using (pthread_t)0 is wrong
> > (pthread_t is
> > opaque; there's no guarantee that 0 is not a valid pthread_t
> > value). Finally,
> > pthread_cancel() was called under logq_lock, which doesn't make
> > sense to
> > me at all.
> > 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  
> >  	pthread_mutex_lock(&logev_lock);
> > -	logq_running = 0;
> > -	pthread_cond_signal(&logev_cond);
> > -	pthread_mutex_unlock(&logev_lock);
> > +	pthread_cleanup_push(cleanup_mutex, &logev_lock);
> > +	running = logq_running;
> > +	if (running) {
> > +		pthread_cancel(log_thr);
> > +		pthread_cond_signal(&logev_cond);
> > +	}
> > +	while (logq_running)
> > +		pthread_cond_wait(&logev_cond, &logev_lock);
> 
> If we're already going to join the thread, why do we have to wait for
> the cleanup handler first? Won't the join do the waiting we need?

Yes, I guess it would. I implemented it in a way analogous the startup
sequence, but it's really not necessary for shutdown. Thanks for
pointing it out.

Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit
  2020-09-28 20:26   ` Benjamin Marzinski
@ 2020-09-29  9:31     ` Martin Wilck
  2020-09-29 17:50       ` Benjamin Marzinski
  0 siblings, 1 reply; 37+ messages in thread
From: Martin Wilck @ 2020-09-29  9:31 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: lixiaokeng, dm-devel

On Mon, 2020-09-28 at 15:26 -0500, Benjamin Marzinski wrote:
> On Thu, Sep 24, 2020 at 03:40:50PM +0200, mwilck@suse.com wrote:
> > 
> >  /*
> >   * We don't support re-initialization after
> > @@ -65,6 +66,9 @@ int libmultipath_init(void)
> >  static void _libmultipath_exit(void)
> >  {
> >  	libmultipath_exit_called = true;
> > +	cleanup_foreign();
> 
> I don't really feel too strongly about this, but it seems to me that
> there is a difference between the checkers and prioritizers, which
> it seems like most users of libmultipath would want, and the foreign
> code, which doesn't seem that way. libmpathpersist, for instance,
> will use the checkers and prioritizers, but not the foreign code.
> On the other hand, if the caller isn't using the foreign code,
> then grabbing the lock and checking the foreign pointer shouldn't
> take much time.

It would just be a few cycles. I want callers to have to worry about
cleanup as little as possible. All else is error-prone IMO, and
although I agree that the foreign functions are less important than
checkers and prio, I thought it made sense to treat all our "plug-ins"
the same way.

Ideally I'd like to do checker/prio/foreign initialization completely
lazily too, in the sense that callers don't need to worry about calling
init_checkers() etc., either. But this series had to stop at some
point.

Either way, it's not a big issue, so please tell me if you feel
strongly enough about it to ask me to revert the change.

Regards,
Martin

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit
  2020-09-29  9:31     ` Martin Wilck
@ 2020-09-29 17:50       ` Benjamin Marzinski
  0 siblings, 0 replies; 37+ messages in thread
From: Benjamin Marzinski @ 2020-09-29 17:50 UTC (permalink / raw)
  To: Martin Wilck; +Cc: lixiaokeng, dm-devel

On Tue, Sep 29, 2020 at 11:31:06AM +0200, Martin Wilck wrote:
> On Mon, 2020-09-28 at 15:26 -0500, Benjamin Marzinski wrote:
> > On Thu, Sep 24, 2020 at 03:40:50PM +0200, mwilck@suse.com wrote:
> > > 
> > >  /*
> > >   * We don't support re-initialization after
> > > @@ -65,6 +66,9 @@ int libmultipath_init(void)
> > >  static void _libmultipath_exit(void)
> > >  {
> > >  	libmultipath_exit_called = true;
> > > +	cleanup_foreign();
> > 
> > I don't really feel too strongly about this, but it seems to me that
> > there is a difference between the checkers and prioritizers, which
> > it seems like most users of libmultipath would want, and the foreign
> > code, which doesn't seem that way. libmpathpersist, for instance,
> > will use the checkers and prioritizers, but not the foreign code.
> > On the other hand, if the caller isn't using the foreign code,
> > then grabbing the lock and checking the foreign pointer shouldn't
> > take much time.
> 
> It would just be a few cycles. I want callers to have to worry about
> cleanup as little as possible. All else is error-prone IMO, and
> although I agree that the foreign functions are less important than
> checkers and prio, I thought it made sense to treat all our "plug-ins"
> the same way.
> 
> Ideally I'd like to do checker/prio/foreign initialization completely
> lazily too, in the sense that callers don't need to worry about calling
> init_checkers() etc., either. But this series had to stop at some
> point.
> 
> Either way, it's not a big issue, so please tell me if you feel
> strongly enough about it to ask me to revert the change.

I already ACKed this bug in my reply to:

[PATCH 00/23] libmultipath: improve cleanup on exit

https://www.redhat.com/archives/dm-devel/2020-September/msg00635.html

Sorry for the confusion.
-Ben
 
> Regards,
> Martin
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, back to index

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-24 13:40 [PATCH 00/23] libmultipath: improve cleanup on exit mwilck
2020-09-24 13:40 ` [PATCH 01/23] multipathd: uxlsnr: avoid deadlock " mwilck
2020-09-26  1:52   ` Benjamin Marzinski
2020-09-26 13:00     ` Martin Wilck
2020-09-24 13:40 ` [PATCH 02/23] multipathd: Fix liburcu memory leak mwilck
2020-09-24 13:40 ` [PATCH 03/23] multipathd: move handling of io_err_stat_attr into libmultipath mwilck
2020-09-24 13:40 ` [PATCH 04/23] multipathd: move vecs desctruction into cleanup function mwilck
2020-09-24 13:40 ` [PATCH 05/23] multipathd: make some globals static mwilck
2020-09-24 13:40 ` [PATCH 06/23] multipathd: move threads destruction into separate function mwilck
2020-09-24 13:40 ` [PATCH 07/23] multipathd: move conf " mwilck
2020-09-28 18:03   ` Benjamin Marzinski
2020-09-29  9:12     ` Martin Wilck
2020-09-24 13:40 ` [PATCH 08/23] multipathd: move pid " mwilck
2020-09-24 13:40 ` [PATCH 09/23] multipathd: close pidfile on exit mwilck
2020-09-24 13:40 ` [PATCH 10/23] multipathd: add helper for systemd notification at exit mwilck
2020-09-24 13:40 ` [PATCH 11/23] multipathd: child(): call cleanups in failure case, too mwilck
2020-09-24 13:40 ` [PATCH 12/23] multipathd: unwatch_all_dmevents: check if waiter is initialized mwilck
2020-09-24 13:40 ` [PATCH 13/23] multipathd: print error message if config can't be loaded mwilck
2020-09-24 13:40 ` [PATCH 14/23] libmultipath: add libmp_dm_exit() mwilck
2020-09-28 18:41   ` Benjamin Marzinski
2020-09-24 13:40 ` [PATCH 15/23] multipathd: fixup libdm deinitialization mwilck
2020-09-24 13:40 ` [PATCH 16/23] libmultipath: log_thread_stop(): check if logarea is initialized mwilck
2020-09-24 13:40 ` [PATCH 17/23] multipathd: add cleanup_child() exit handler mwilck
2020-09-24 13:40 ` [PATCH 18/23] libmultipath: fix log_thread startup and teardown mwilck
2020-09-28 20:15   ` Benjamin Marzinski
2020-09-29  9:18     ` Martin Wilck
2020-09-24 13:40 ` [PATCH 19/23] multipathd: move cleanup_{prio, checkers, foreign} to libmultipath_exit mwilck
2020-09-28 20:26   ` Benjamin Marzinski
2020-09-29  9:31     ` Martin Wilck
2020-09-29 17:50       ` Benjamin Marzinski
2020-09-24 13:40 ` [PATCH 20/23] multipath: use atexit() for cleanup handlers mwilck
2020-09-24 13:40 ` [PATCH 21/23] mpathpersist: " mwilck
2020-09-24 13:40 ` [PATCH 22/23] multipath: fix leaks in check_path_valid() mwilck
2020-09-24 13:40 ` [PATCH 23/23] multipath-tools: mpath-tools.supp: file with valgrind suppressions mwilck
2020-09-24 20:58   ` Benjamin Marzinski
2020-09-25  9:53     ` Martin Wilck
2020-09-28 20:51 ` [PATCH 00/23] libmultipath: improve cleanup on exit Benjamin Marzinski

DM-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/dm-devel/0 dm-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dm-devel dm-devel/ https://lore.kernel.org/dm-devel \
		dm-devel@redhat.com
	public-inbox-index dm-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/com.redhat.dm-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git