All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
To: lttng-dev@lists.lttng.org
Cc: jgalar@efficios.com
Subject: [RFC PATCH v2 06/13] Fix: unregister app notify socket on sessiond tear down
Date: Mon, 18 Sep 2017 18:51:59 -0400	[thread overview]
Message-ID: <20170918225206.17725-7-jonathan.rajotte-julien__899.454745422368$1505775336$gmane$org@efficios.com> (raw)
In-Reply-To: <20170918225206.17725-1-jonathan.rajotte-julien@efficios.com>

A race between the sessiond tear down and applications initialization
can lead to a deadlock.

Applications try to communicate via the notify sockets while sessiond
does not listen anymore on these sockets since the thread responsible
for reception/response is terminated (ust_thread_manage_notify). These
sockets are never closed hence an application could hang on
communication.

Sessiond hang happen during call to cmd_destroy_session during
sessiond_cleanup. Sessiond is trying to communicate with the app while
the app is waiting for a response on the app notification socket.

To prevent this situation a call to ust_app_notify_sock_unregister is
performed on all entry of the ust_app_ht_by_notify_sock hash table at
the time of termination. This ensure that any pending communication
initiated by the application will be terminated since all sockets will
be closed at the end of the grace period via call_rcu inside
ust_app_notify_sock_unregister. The use of ust_app_ht_by_notify_sock
instead of the ust_app_ht prevent a double call_rcu since entries are
removed from ust_app_ht_by_notify_sock during ust_app_notify_sock_unregister.

This can be reproduced using the sessiond_teardown_active_session
scenario provided by [1].

[1] https://github.com/PSRCode/lttng-stress

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
---
 src/bin/lttng-sessiond/ust-thread.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/bin/lttng-sessiond/ust-thread.c b/src/bin/lttng-sessiond/ust-thread.c
index 1e7a8229..8f11133a 100644
--- a/src/bin/lttng-sessiond/ust-thread.c
+++ b/src/bin/lttng-sessiond/ust-thread.c
@@ -27,6 +27,19 @@
 #include "health-sessiond.h"
 #include "testpoint.h"
 
+
+static
+void notify_sock_unregister_all()
+{
+	struct lttng_ht_iter iter;
+	struct ust_app *app;
+	rcu_read_lock();
+	cds_lfht_for_each_entry(ust_app_ht_by_notify_sock->ht, &iter.iter, app, notify_sock_n.node) {
+		ust_app_notify_sock_unregister(app->notify_sock);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * This thread manage application notify communication.
  */
@@ -53,7 +66,7 @@ void *ust_thread_manage_notify(void *data)
 
 	ret = lttng_poll_create(&events, 2, LTTNG_CLOEXEC);
 	if (ret < 0) {
-		goto error;
+		goto error_poll_create;
 	}
 
 	/* Add quit pipe */
@@ -197,6 +210,8 @@ error_poll_create:
 error_testpoint:
 	utils_close_pipe(apps_cmd_notify_pipe);
 	apps_cmd_notify_pipe[0] = apps_cmd_notify_pipe[1] = -1;
+	notify_sock_unregister_all();
+
 	DBG("Application notify communication apps thread cleanup complete");
 	if (err) {
 		health_error();
-- 
2.11.0

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

  parent reply	other threads:[~2017-09-18 22:52 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20170918225206.17725-1-jonathan.rajotte-julien@efficios.com>
2017-09-18 22:51 ` [RFC PATCH v2 01/13] Extend health thread lifetime Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 02/13] Reorder pthread_join for easier ordering later on Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 03/13] Terminate dispatch thread after reg_apps_thread is terminated Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 04/13] Order termination of thread_manage_apps after dispatch_thread Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 05/13] Control thread_apps_notify lifetime with specialized quit pipe Jonathan Rajotte
2017-09-18 22:51 ` Jonathan Rajotte [this message]
2017-09-18 22:52 ` [RFC PATCH v2 07/13] Always reply to an inquiring app Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 08/13] Fix: perform lookup then test for condition Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 09/13] Perform ust_app_unregister on thread_manage_apps teardown Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 10/13] Teardown apps_notify_thread before thread_manage_apps Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 11/13] Comments update Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 12/13] Fix: delay termination on consumerd to allow metadata flushing Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 13/13] Fix: quit early if instructed to Jonathan Rajotte
     [not found] ` <20170918225206.17725-2-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:54   ` [RFC PATCH v2 01/13] Extend health thread lifetime Jérémie Galarneau
     [not found] ` <20170918225206.17725-3-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:54   ` [RFC PATCH v2 02/13] Reorder pthread_join for easier ordering later on Jérémie Galarneau
     [not found] ` <20170918225206.17725-4-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:55   ` [RFC PATCH v2 03/13] Terminate dispatch thread after reg_apps_thread is terminated Jérémie Galarneau
     [not found] ` <20170918225206.17725-5-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:55   ` [RFC PATCH v2 04/13] Order termination of thread_manage_apps after dispatch_thread Jérémie Galarneau
     [not found] ` <20170918225206.17725-6-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:55   ` [RFC PATCH v2 05/13] Control thread_apps_notify lifetime with specialized quit pipe Jérémie Galarneau
     [not found] ` <20170918225206.17725-7-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:56   ` [RFC PATCH v2 06/13] Fix: unregister app notify socket on sessiond tear down Jérémie Galarneau
     [not found] ` <20170918225206.17725-8-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:56   ` [RFC PATCH v2 07/13] Always reply to an inquiring app Jérémie Galarneau
     [not found] ` <20170918225206.17725-9-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:56   ` [RFC PATCH v2 08/13] Fix: perform lookup then test for condition Jérémie Galarneau
     [not found] ` <20170918225206.17725-10-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:57   ` [RFC PATCH v2 09/13] Perform ust_app_unregister on thread_manage_apps teardown Jérémie Galarneau
     [not found] ` <20170918225206.17725-11-jonathan.rajotte-julien@efficios.com>
2017-12-13 17:58   ` [RFC PATCH v2 10/13] Teardown apps_notify_thread before thread_manage_apps Jérémie Galarneau
2017-12-14  1:57   ` Jérémie Galarneau
     [not found] ` <20170918225206.17725-12-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:57   ` [RFC PATCH v2 11/13] Comments update Jérémie Galarneau
     [not found] ` <20170918225206.17725-13-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:57   ` [RFC PATCH v2 12/13] Fix: delay termination on consumerd to allow metadata flushing Jérémie Galarneau
     [not found] ` <20170918225206.17725-14-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:58   ` [RFC PATCH v2 13/13] Fix: quit early if instructed to Jérémie Galarneau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='20170918225206.17725-7-jonathan.rajotte-julien__899.454745422368$1505775336$gmane$org@efficios.com' \
    --to=jonathan.rajotte-julien@efficios.com \
    --cc=jgalar@efficios.com \
    --cc=lttng-dev@lists.lttng.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.