All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jérémie Galarneau" <jeremie.galarneau@efficios.com>
To: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Cc: "lttng-dev@lists.lttng.org" <lttng-dev@lists.lttng.org>,
	Jeremie Galarneau <jgalar@efficios.com>
Subject: Re: [RFC PATCH v2 10/13] Teardown apps_notify_thread before thread_manage_apps
Date: Wed, 13 Dec 2017 12:58:28 -0500	[thread overview]
Message-ID: <CA+jJMxvhReP8v9wf0EbBuSZ=KYDriV1f1E3c5P_AmCMHGBAc9g__24501.9897855402$1513187924$gmane$org@mail.gmail.com> (raw)
In-Reply-To: <20170918225206.17725-11-jonathan.rajotte-julien@efficios.com>

On 18 September 2017 at 18:52, Jonathan Rajotte
<jonathan.rajotte-julien@efficios.com> wrote:
> RFC:
> This is necessary since we use the ust_app_ht_by_notify_sock to
> perform the cleanup operation. Since both ust_app_unregister and
> ust_app_ust_app_notify_unregister perform a remove of the app on the
> ust_ust_app_by_notify_sock but only ust_app_notify_unregister actually

There are some extra ust_ and ust_app_ on the last two lines ;)

> end up performing a close(call_rcu) on the socket.

a close(call_rcu) ?

>

It took me a while to wrap my head around what the problem is given this
description. Please, correct me if I'm wrong.

Two threads are performing a clean-up step based on the content of the
ust_app_ht_by_notify_sock hash table.

The apps_notify_thread, on shutdown:
  - invokes notify_sock_unregister_all()
    - iterates through the ust_app_ht_by_notify_sock hash table
      - calls ust_app_notify_sock_unregister() on every application
        notification socket
        - removes the app from the ust_app_ht_by_notify_sock hash table
        - closes the notify socket through a deferred call_rcu()


The apps_thread, on shutdown:
  - iterates through the ust_app_ht hash table
    - calls ust_app_unregister() on every application notification
      socket
      - This function's header comment mentions "The socket is already
        closed at this point so no close to sock.", by which the author
        most likely meant that "there is no need to close the socket".

        This is no longer true with your patch (#09). This thread
        only reacts to errors on the application sockets and then
 tears
        down

      - Flushes that app's metadata and buffers
      - Removes the app from the ust_app_ht_by_sock hash table
      - Removes the app from the ust_app_ht_by_notify_sock hash table
      - Removes the app from the ust_app_ht hash table


Given this, it already seems weird that both threads try to remove
the entry from the ust_app_ht_by_notify_sock hash table.

In the case where we are not shutting down,


> Other way to do fix this problem?
>
> Could we simply not remove it on a ust_app_unregister? And always defer
> to the apps_notify_thread for cleanup?
>
> Update the value in the hash table to -1 and emit a close and remove
> from the hash table if the value is -1?
>
> We could also keep a local list of fd in apps_notify_thread and use it for
> cleanup instead of relying on ust_ust_app_by_notify_sock.
>
> I'm not sure what is the best/elegant solution here. I am not a fan of
> the current solution but it working.
>
> Obviously this commit will be reworded and modified accordingly.
>
> Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
> ---
>  src/bin/lttng-sessiond/main.c | 55 ++++++++++++++++++++++++-------------------
>  1 file changed, 31 insertions(+), 24 deletions(-)
>
> diff --git a/src/bin/lttng-sessiond/main.c b/src/bin/lttng-sessiond/main.c
> index 4a2a661f..216d0da6 100644
> --- a/src/bin/lttng-sessiond/main.c
> +++ b/src/bin/lttng-sessiond/main.c
> @@ -6209,16 +6209,6 @@ int main(int argc, char **argv)
>         }
>         notification_thread_running = true;
>
> -       /* Create thread to manage application notify socket */
> -       ret = pthread_create(&apps_notify_thread, default_pthread_attr(),
> -                       ust_thread_manage_notify, (void *) NULL);
> -       if (ret) {
> -               errno = ret;
> -               PERROR("pthread_create notify");
> -               retval = -1;
> -               stop_threads();
> -               goto exit_apps_notify;
> -       }
>
>         /* Create thread to manage application socket */
>         ret = pthread_create(&apps_thread, default_pthread_attr(),
> @@ -6231,6 +6221,17 @@ int main(int argc, char **argv)
>                 goto exit_apps;
>         }
>
> +       /* Create thread to manage application notify socket */
> +       ret = pthread_create(&apps_notify_thread, default_pthread_attr(),
> +                       ust_thread_manage_notify, (void *) NULL);
> +       if (ret) {
> +               errno = ret;
> +               PERROR("pthread_create notify");
> +               retval = -1;
> +               stop_threads();
> +               goto exit_apps_notify;
> +       }
> +
>         /* Create thread to dispatch registration */
>         ret = pthread_create(&dispatch_thread, default_pthread_attr(),
>                         thread_dispatch_ust_registration, (void *) NULL);
> @@ -6358,20 +6359,6 @@ exit_reg_apps:
>         }
>
>  exit_dispatch:
> -       /* Instruct the apps thread to quit */
> -       ret = notify_thread_pipe(thread_apps_teardown_trigger_pipe[1]);
> -       if (ret < 0) {
> -               ERR("write error on thread quit pipe");
> -       }
> -
> -       ret = pthread_join(apps_thread, &status);
> -       if (ret) {
> -               errno = ret;
> -               PERROR("pthread_join apps");
> -               retval = -1;
> -       }
> -
> -exit_apps:
>         /* Instruct the apps_notify thread to quit */
>         ret = notify_thread_pipe(thread_apps_notify_teardown_trigger_pipe[1]);
>         if (ret < 0) {
> @@ -6386,6 +6373,26 @@ exit_apps:
>         }
>
>  exit_apps_notify:
> +       /*
> +        * The barrier ensure that all previous resources, notify sockets in
> +        * particular, are freed/closed.
> +        */
> +       rcu_barrier();
> +
> +       /* Instruct the apps thread to quit */
> +       ret = notify_thread_pipe(thread_apps_teardown_trigger_pipe[1]);
> +       if (ret < 0) {
> +               ERR("write error on thread quit pipe");
> +       }
> +
> +       ret = pthread_join(apps_thread, &status);
> +       if (ret) {
> +               errno = ret;
> +               PERROR("pthread_join apps");
> +               retval = -1;
> +       }
> +
> +exit_apps:
>  exit_notification:
>  exit_health:
>  exit_init_data:
> --
> 2.11.0
>



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

  parent reply	other threads:[~2017-12-13 17:58 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20170918225206.17725-1-jonathan.rajotte-julien@efficios.com>
2017-09-18 22:51 ` [RFC PATCH v2 01/13] Extend health thread lifetime Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 02/13] Reorder pthread_join for easier ordering later on Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 03/13] Terminate dispatch thread after reg_apps_thread is terminated Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 04/13] Order termination of thread_manage_apps after dispatch_thread Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 05/13] Control thread_apps_notify lifetime with specialized quit pipe Jonathan Rajotte
2017-09-18 22:51 ` [RFC PATCH v2 06/13] Fix: unregister app notify socket on sessiond tear down Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 07/13] Always reply to an inquiring app Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 08/13] Fix: perform lookup then test for condition Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 09/13] Perform ust_app_unregister on thread_manage_apps teardown Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 10/13] Teardown apps_notify_thread before thread_manage_apps Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 11/13] Comments update Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 12/13] Fix: delay termination on consumerd to allow metadata flushing Jonathan Rajotte
2017-09-18 22:52 ` [RFC PATCH v2 13/13] Fix: quit early if instructed to Jonathan Rajotte
     [not found] ` <20170918225206.17725-2-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:54   ` [RFC PATCH v2 01/13] Extend health thread lifetime Jérémie Galarneau
     [not found] ` <20170918225206.17725-3-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:54   ` [RFC PATCH v2 02/13] Reorder pthread_join for easier ordering later on Jérémie Galarneau
     [not found] ` <20170918225206.17725-4-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:55   ` [RFC PATCH v2 03/13] Terminate dispatch thread after reg_apps_thread is terminated Jérémie Galarneau
     [not found] ` <20170918225206.17725-5-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:55   ` [RFC PATCH v2 04/13] Order termination of thread_manage_apps after dispatch_thread Jérémie Galarneau
     [not found] ` <20170918225206.17725-6-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:55   ` [RFC PATCH v2 05/13] Control thread_apps_notify lifetime with specialized quit pipe Jérémie Galarneau
     [not found] ` <20170918225206.17725-7-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:56   ` [RFC PATCH v2 06/13] Fix: unregister app notify socket on sessiond tear down Jérémie Galarneau
     [not found] ` <20170918225206.17725-8-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:56   ` [RFC PATCH v2 07/13] Always reply to an inquiring app Jérémie Galarneau
     [not found] ` <20170918225206.17725-9-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:56   ` [RFC PATCH v2 08/13] Fix: perform lookup then test for condition Jérémie Galarneau
     [not found] ` <20170918225206.17725-10-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:57   ` [RFC PATCH v2 09/13] Perform ust_app_unregister on thread_manage_apps teardown Jérémie Galarneau
     [not found] ` <20170918225206.17725-11-jonathan.rajotte-julien@efficios.com>
2017-12-13 17:58   ` Jérémie Galarneau [this message]
2017-12-14  1:57   ` [RFC PATCH v2 10/13] Teardown apps_notify_thread before thread_manage_apps Jérémie Galarneau
     [not found] ` <20170918225206.17725-12-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:57   ` [RFC PATCH v2 11/13] Comments update Jérémie Galarneau
     [not found] ` <20170918225206.17725-13-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:57   ` [RFC PATCH v2 12/13] Fix: delay termination on consumerd to allow metadata flushing Jérémie Galarneau
     [not found] ` <20170918225206.17725-14-jonathan.rajotte-julien@efficios.com>
2017-12-14  1:58   ` [RFC PATCH v2 13/13] Fix: quit early if instructed to Jérémie Galarneau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+jJMxvhReP8v9wf0EbBuSZ=KYDriV1f1E3c5P_AmCMHGBAc9g__24501.9897855402$1513187924$gmane$org@mail.gmail.com' \
    --to=jeremie.galarneau@efficios.com \
    --cc=jgalar@efficios.com \
    --cc=jonathan.rajotte-julien@efficios.com \
    --cc=lttng-dev@lists.lttng.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.