* [PATCH lttng-tools 3/3] Fix: sessiond: ust: deadlock with per-pid buffers
[not found] <20191101202305.21496-1-mathieu.desnoyers@efficios.com>
2019-11-01 20:23 ` [PATCH lttng-tools 2/3] Fix: relayd: put chunk reference when closing stream Mathieu Desnoyers
@ 2019-11-01 20:23 ` Mathieu Desnoyers
2019-11-22 23:07 ` [PATCH lttng-tools 1/3] Fix: relayd: tracefile rotation: viewer opening missing index file Jérémie Galarneau
2 siblings, 0 replies; 3+ messages in thread
From: Mathieu Desnoyers @ 2019-11-01 20:23 UTC (permalink / raw)
To: jgalar; +Cc: lttng-dev
Do not hold the registry lock while communicating with the consumerd,
because doing so causes inter-process deadlocks between consumerd and
sessiond with the metadata request notification.
The deadlock involves both sessiond and consumerd:
* lttng-sessiond:
thread 11 - thread_application_management
close_metadata()
pthread_mutex_lock(®istry->lock);
consumer_close_metadata()
pthread_mutex_lock(socket->lock);
thread 15 - thread_consumer_management
ust_consumer_metadata_request()
pthread_mutex_lock(&ust_reg->lock);
thread 8 - thread_manage_clients
consumer_is_data_pending
pthread_mutex_lock(socket->lock);
consumer_socket_recv()
* lttng-consumerd:
thread 4 - consumer_timer_thread
sample_channel_positions()
pthread_mutex_lock(&stream->lock);
thread 8 - consumer_thread_sessiond_poll
consumer_data_pending
pthread_mutex_lock(&consumer_data.lock);
pthread_mutex_lock(&stream->lock);
thread 7 - consumer_thread_data_poll
lttng_consumer_read_subbuffer
pthread_mutex_lock(&stream->chan->lock);
pthread_mutex_lock(&stream->lock);
do_sync_metadata
pthread_mutex_lock(&metadata->lock);
lttng_ustconsumer_sync_metadata
pthread_mutex_unlock(&metadata_stream->lock);
lttng_ustconsumer_request_metadata()
pthread_mutex_lock(&ctx->metadata_socket_lock);
lttcomm_recv_unix_sock()
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
src/bin/lttng-sessiond/ust-app.c | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/src/bin/lttng-sessiond/ust-app.c b/src/bin/lttng-sessiond/ust-app.c
index 1731c368..1ecff438 100644
--- a/src/bin/lttng-sessiond/ust-app.c
+++ b/src/bin/lttng-sessiond/ust-app.c
@@ -741,6 +741,10 @@ error:
* nullified. The session lock MUST be held unless the application is
* in the destroy path.
*
+ * Do not hold the registry lock while communicating with the consumerd, because
+ * doing so causes inter-process deadlocks between consumerd and sessiond with
+ * the metadata request notification.
+ *
* Return 0 on success else a negative value.
*/
static int close_metadata(struct ust_registry_session *registry,
@@ -748,6 +752,7 @@ static int close_metadata(struct ust_registry_session *registry,
{
int ret;
struct consumer_socket *socket;
+ uint64_t metadata_key;
assert(registry);
assert(consumer);
@@ -755,34 +760,34 @@ static int close_metadata(struct ust_registry_session *registry,
rcu_read_lock();
pthread_mutex_lock(®istry->lock);
-
- if (!registry->metadata_key || registry->metadata_closed) {
+ metadata_key = registry->metadata_key;
+ if (!metadata_key || registry->metadata_closed) {
ret = 0;
+ pthread_mutex_unlock(®istry->lock);
goto end;
}
+ /*
+ * Metadata closed. Even on error this means that the consumer is not
+ * responding or not found so either way a second close should NOT be emit
+ * for this registry.
+ */
+ registry->metadata_closed = 1;
+ pthread_mutex_unlock(®istry->lock);
/* Get consumer socket to use to push the metadata.*/
socket = consumer_find_socket_by_bitness(registry->bits_per_long,
consumer);
if (!socket) {
ret = -1;
- goto error;
+ goto end;
}
- ret = consumer_close_metadata(socket, registry->metadata_key);
+ ret = consumer_close_metadata(socket, metadata_key);
if (ret < 0) {
- goto error;
+ goto end;
}
-error:
- /*
- * Metadata closed. Even on error this means that the consumer is not
- * responding or not found so either way a second close should NOT be emit
- * for this registry.
- */
- registry->metadata_closed = 1;
end:
- pthread_mutex_unlock(®istry->lock);
rcu_read_unlock();
return ret;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH lttng-tools 1/3] Fix: relayd: tracefile rotation: viewer opening missing index file
[not found] <20191101202305.21496-1-mathieu.desnoyers@efficios.com>
2019-11-01 20:23 ` [PATCH lttng-tools 2/3] Fix: relayd: put chunk reference when closing stream Mathieu Desnoyers
2019-11-01 20:23 ` [PATCH lttng-tools 3/3] Fix: sessiond: ust: deadlock with per-pid buffers Mathieu Desnoyers
@ 2019-11-22 23:07 ` Jérémie Galarneau
2 siblings, 0 replies; 3+ messages in thread
From: Jérémie Galarneau @ 2019-11-22 23:07 UTC (permalink / raw)
To: Mathieu Desnoyers; +Cc: lttng-dev, jgalar
All three patches of this series were merged in master and
stable-2.11.
Thanks!
Jérémie
On Fri, Nov 01, 2019 at 04:23:03PM -0400, Mathieu Desnoyers wrote:
> Moving the head position of the tracefile array when the data is
> received opens a window where a viewer attaching to the session could
> try to open a missing index file (which has not been received yet).
>
> However, we want to bump the tail position as soon as we receive
> data, because the prior tail is not valid anymore.
>
> Solve this by introducing two head positions: the "read" head
> and the "write" head. The "write" head is the position of the
> newest data file (equivalent to the prior "head" position). We
> also introduce a "read" head position, which is only moved
> forward when the index is received.
>
> The viewer now uses the "read" head position as upper bound, which
> ensures it never attempts to open a non-existing index file.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
> src/bin/lttng-relayd/stream.c | 4 +-
> src/bin/lttng-relayd/tracefile-array.c | 58 ++++++++++++++++----------
> src/bin/lttng-relayd/tracefile-array.h | 21 ++++++++--
> src/bin/lttng-relayd/viewer-stream.c | 2 +-
> 4 files changed, 58 insertions(+), 27 deletions(-)
>
> diff --git a/src/bin/lttng-relayd/stream.c b/src/bin/lttng-relayd/stream.c
> index 94698f8d..4d3d37a2 100644
> --- a/src/bin/lttng-relayd/stream.c
> +++ b/src/bin/lttng-relayd/stream.c
> @@ -958,7 +958,7 @@ int stream_init_packet(struct relay_stream *stream, size_t packet_size,
> stream->stream_handle,
> stream->tracefile_size_current, packet_size,
> stream->tracefile_current_index, new_file_index);
> - tracefile_array_file_rotate(stream->tfa);
> + tracefile_array_file_rotate(stream->tfa, TRACEFILE_ROTATE_WRITE);
> stream->tracefile_current_index = new_file_index;
>
> if (stream->stream_fd) {
> @@ -1095,6 +1095,7 @@ int stream_update_index(struct relay_stream *stream, uint64_t net_seq_num,
>
> ret = relay_index_try_flush(index);
> if (ret == 0) {
> + tracefile_array_file_rotate(stream->tfa, TRACEFILE_ROTATE_READ);
> tracefile_array_commit_seq(stream->tfa);
> stream->index_received_seqcount++;
> *flushed = true;
> @@ -1188,6 +1189,7 @@ int stream_add_index(struct relay_stream *stream,
> }
> ret = relay_index_try_flush(index);
> if (ret == 0) {
> + tracefile_array_file_rotate(stream->tfa, TRACEFILE_ROTATE_READ);
> tracefile_array_commit_seq(stream->tfa);
> stream->index_received_seqcount++;
> stream->pos_after_last_complete_data_index += index->total_size;
> diff --git a/src/bin/lttng-relayd/tracefile-array.c b/src/bin/lttng-relayd/tracefile-array.c
> index 20b760c0..3d62317a 100644
> --- a/src/bin/lttng-relayd/tracefile-array.c
> +++ b/src/bin/lttng-relayd/tracefile-array.c
> @@ -62,7 +62,8 @@ void tracefile_array_destroy(struct tracefile_array *tfa)
> free(tfa);
> }
>
> -void tracefile_array_file_rotate(struct tracefile_array *tfa)
> +void tracefile_array_file_rotate(struct tracefile_array *tfa,
> + enum tracefile_rotate_type type)
> {
> uint64_t *headp, *tailp;
>
> @@ -70,24 +71,37 @@ void tracefile_array_file_rotate(struct tracefile_array *tfa)
> /* Not in tracefile rotation mode. */
> return;
> }
> - /* Rotate to next file. */
> - tfa->file_head = (tfa->file_head + 1) % tfa->count;
> - if (tfa->file_head == tfa->file_tail) {
> - /* Move tail. */
> - tfa->file_tail = (tfa->file_tail + 1) % tfa->count;
> - }
> - headp = &tfa->tf[tfa->file_head].seq_head;
> - tailp = &tfa->tf[tfa->file_head].seq_tail;
> - /*
> - * If we overwrite a file with content, we need to push the tail
> - * to the position following the content we are overwriting.
> - */
> - if (*headp != -1ULL) {
> - tfa->seq_tail = tfa->tf[tfa->file_tail].seq_tail;
> + switch (type) {
> + case TRACEFILE_ROTATE_READ:
> + /*
> + * Rotate read head to write head position, thus allowing
> + * reader to consume the newly rotated head file.
> + */
> + tfa->file_head_read = tfa->file_head_write;
> + break;
> + case TRACEFILE_ROTATE_WRITE:
> + /* Rotate write head to next file, pushing tail if needed. */
> + tfa->file_head_write = (tfa->file_head_write + 1) % tfa->count;
> + if (tfa->file_head_write == tfa->file_tail) {
> + /* Move tail. */
> + tfa->file_tail = (tfa->file_tail + 1) % tfa->count;
> + }
> + headp = &tfa->tf[tfa->file_head_write].seq_head;
> + tailp = &tfa->tf[tfa->file_head_write].seq_tail;
> + /*
> + * If we overwrite a file with content, we need to push the tail
> + * to the position following the content we are overwriting.
> + */
> + if (*headp != -1ULL) {
> + tfa->seq_tail = tfa->tf[tfa->file_tail].seq_tail;
> + }
> + /* Reset this file head/tail (overwrite). */
> + *headp = -1ULL;
> + *tailp = -1ULL;
> + break;
> + default:
> + abort();
> }
> - /* Reset this file head/tail (overwrite). */
> - *headp = -1ULL;
> - *tailp = -1ULL;
> }
>
> void tracefile_array_commit_seq(struct tracefile_array *tfa)
> @@ -104,8 +118,8 @@ void tracefile_array_commit_seq(struct tracefile_array *tfa)
> /* Not in tracefile rotation mode. */
> return;
> }
> - headp = &tfa->tf[tfa->file_head].seq_head;
> - tailp = &tfa->tf[tfa->file_head].seq_tail;
> + headp = &tfa->tf[tfa->file_head_write].seq_head;
> + tailp = &tfa->tf[tfa->file_head_write].seq_tail;
> /* Update head tracefile seq_head. */
> *headp = tfa->seq_head;
> /*
> @@ -117,9 +131,9 @@ void tracefile_array_commit_seq(struct tracefile_array *tfa)
> }
> }
>
> -uint64_t tracefile_array_get_file_index_head(struct tracefile_array *tfa)
> +uint64_t tracefile_array_get_read_file_index_head(struct tracefile_array *tfa)
> {
> - return tfa->file_head;
> + return tfa->file_head_read;
> }
>
> uint64_t tracefile_array_get_seq_head(struct tracefile_array *tfa)
> diff --git a/src/bin/lttng-relayd/tracefile-array.h b/src/bin/lttng-relayd/tracefile-array.h
> index 9158f4fe..04d9123d 100644
> --- a/src/bin/lttng-relayd/tracefile-array.h
> +++ b/src/bin/lttng-relayd/tracefile-array.h
> @@ -29,15 +29,30 @@ struct tracefile {
> uint64_t seq_tail; /* Oldest seqcount. Inclusive. */
> };
>
> +enum tracefile_rotate_type {
> + TRACEFILE_ROTATE_READ,
> + TRACEFILE_ROTATE_WRITE,
> +};
> +
> /*
> * Represents an array of trace files in a stream.
> + * head is the most recent file/trace packet.
> + * tail is the oldest file/trace packet.
> + *
> + * There are two heads: a "read" head and a "write" head. The "write" head is
> + * the position of the newest data file. The "read" head position is only moved
> + * forward when the index is received.
> + *
> + * The viewer uses the "read" head position as upper bound, which
> + * ensures it never attempts to open a non-existing index file.
> */
> struct tracefile_array {
> struct tracefile *tf;
> size_t count;
>
> /* Current head/tail files. */
> - uint64_t file_head;
> + uint64_t file_head_read;
> + uint64_t file_head_write;
> uint64_t file_tail;
>
> /* Overall head/tail seq for the entire array. Inclusive. */
> @@ -48,10 +63,10 @@ struct tracefile_array {
> struct tracefile_array *tracefile_array_create(size_t count);
> void tracefile_array_destroy(struct tracefile_array *tfa);
>
> -void tracefile_array_file_rotate(struct tracefile_array *tfa);
> +void tracefile_array_file_rotate(struct tracefile_array *tfa, enum tracefile_rotate_type type);
> void tracefile_array_commit_seq(struct tracefile_array *tfa);
>
> -uint64_t tracefile_array_get_file_index_head(struct tracefile_array *tfa);
> +uint64_t tracefile_array_get_read_file_index_head(struct tracefile_array *tfa);
> /* May return -1ULL in the case where we have not received any indexes yet. */
> uint64_t tracefile_array_get_seq_head(struct tracefile_array *tfa);
>
> diff --git a/src/bin/lttng-relayd/viewer-stream.c b/src/bin/lttng-relayd/viewer-stream.c
> index f41bbe1a..f3baf105 100644
> --- a/src/bin/lttng-relayd/viewer-stream.c
> +++ b/src/bin/lttng-relayd/viewer-stream.c
> @@ -106,7 +106,7 @@ struct relay_viewer_stream *viewer_stream_create(struct relay_stream *stream,
> }
> case LTTNG_VIEWER_SEEK_LAST:
> vstream->current_tracefile_id =
> - tracefile_array_get_file_index_head(stream->tfa);
> + tracefile_array_get_read_file_index_head(stream->tfa);
> /*
> * We seek at the very end of each stream, awaiting for
> * a future packet to eventually come in.
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 3+ messages in thread