lttng-dev.lists.lttng.org archive mirror
* [lttng-dev] [PATCH lttng-tools] Fix: cleanup stream on snapshot failure
@ 2022-05-30 14:10 Marcel Hamer via lttng-dev
  2022-05-30 15:27 ` Jonathan Rajotte-Julien via lttng-dev
  0 siblings, 1 reply; 4+ messages in thread
From: Marcel Hamer via lttng-dev @ 2022-05-30 14:10 UTC (permalink / raw)
  To: lttng-dev

When a channel snapshot creation fails, the stream should be cleaned up
properly. If the stream is not closed and cleaned up on failure, the
next snapshot triggers the following assert inside the snapshot_channel
function:

	assert(!stream->trace_chunk);

This happens because stream->trace_chunk was not reset to NULL. The
reset to NULL happens inside the consumer_stream_close function.

Fixes #1352

Signed-off-by: Marcel Hamer <marcel.hamer@windriver.com>
---
 src/common/ust-consumer/ust-consumer.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/common/ust-consumer/ust-consumer.c b/src/common/ust-consumer/ust-consumer.c
index f176ca40a..f43216829 100644
--- a/src/common/ust-consumer/ust-consumer.c
+++ b/src/common/ust-consumer/ust-consumer.c
@@ -1147,13 +1147,13 @@ static int snapshot_channel(struct lttng_consumer_channel *channel,
 		if (use_relayd) {
 			ret = consumer_send_relayd_stream(stream, path);
 			if (ret < 0) {
-				goto error_unlock;
+				goto error_close_stream;
 			}
 		} else {
 			ret = consumer_stream_create_output_files(stream,
 					false);
 			if (ret < 0) {
-				goto error_unlock;
+				goto error_close_stream;
 			}
 			DBG("UST consumer snapshot stream (%" PRIu64 ")",
 					stream->key);
@@ -1170,19 +1170,19 @@ static int snapshot_channel(struct lttng_consumer_channel *channel,
 		ret = lttng_ustconsumer_take_snapshot(stream);
 		if (ret < 0) {
 			ERR("Taking UST snapshot");
-			goto error_unlock;
+			goto error_close_stream;
 		}
 
 		ret = lttng_ustconsumer_get_produced_snapshot(stream, &produced_pos);
 		if (ret < 0) {
 			ERR("Produced UST snapshot position");
-			goto error_unlock;
+			goto error_close_stream;
 		}
 
 		ret = lttng_ustconsumer_get_consumed_snapshot(stream, &consumed_pos);
 		if (ret < 0) {
 			ERR("Consumerd UST snapshot position");
-			goto error_unlock;
+			goto error_close_stream;
 		}
 
 		/*
-- 
2.25.1

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

* Re: [lttng-dev] [PATCH lttng-tools] Fix: cleanup stream on snapshot failure
  2022-05-30 14:10 [lttng-dev] [PATCH lttng-tools] Fix: cleanup stream on snapshot failure Marcel Hamer via lttng-dev
@ 2022-05-30 15:27 ` Jonathan Rajotte-Julien via lttng-dev
  2022-05-31 11:28   ` Marcel Hamer via lttng-dev
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Rajotte-Julien via lttng-dev @ 2022-05-30 15:27 UTC (permalink / raw)
  To: Marcel Hamer; +Cc: lttng-dev

Hi Marcel,

Thanks for sending this patch.

Looks sensible to me. Still, do you have a reproducer for it? I went back to bug 1352 and, even with https://bugs.lttng.org/attachments/546, was unable to force the assert failure.

Cheers

* Re: [lttng-dev] [PATCH lttng-tools] Fix: cleanup stream on snapshot failure
  2022-05-30 15:27 ` Jonathan Rajotte-Julien via lttng-dev
@ 2022-05-31 11:28   ` Marcel Hamer via lttng-dev
  2022-05-31 13:11     ` Jonathan Rajotte-Julien via lttng-dev
  0 siblings, 1 reply; 4+ messages in thread
From: Marcel Hamer via lttng-dev @ 2022-05-31 11:28 UTC (permalink / raw)
  To: Jonathan Rajotte-Julien; +Cc: lttng-dev

Hello Jonathan,

On Mon, May 30, 2022 at 11:27:55AM -0400, Jonathan Rajotte-Julien wrote:
> Hi Marcel,
> 
> Thanks for sending this patch.
> 
> Looks sensible to me. Still, do you have a reproducer for it? I went back to bug 1352 and, even with https://bugs.lttng.org/attachments/546, was unable to force the assert failure.

I can only reproduce it when running lttng-consumerd under a debugger,
in my case gdb. My reproduction scenario is:

1. Set a breakpoint on snapshot_channel() inside
   src/common/ust-consumer/ust-consumer.c.
2. When the breakpoint hits, remove the complete lttng directory
   containing the session data.
3. Continue the lttng-consumerd process from gdb.
4. consumer_stream_create_output_files() then returns -1 inside
   snapshot_channel().
5. Take another snapshot and lttng-consumerd crashes because of the
   assert(!stream->trace_chunk); inside snapshot_channel(). This last
   step does not require any breakpoint intervention.
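
As a side note on steps 2 and 4, here is a minimal sketch of the general
POSIX behaviour that I assume is behind the -1: creating a file inside a
directory that has just been removed fails with ENOENT. It makes no
claim about what consumer_stream_create_output_files() does internally.
The function name, the /tmp template and the "stream_0" file name are
made up for illustration.

```c
#define _XOPEN_SOURCE 700 /* for mkdtemp() */

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of lttng-tools.
 *
 * Creates a scratch directory, removes it again, then tries to create a
 * file inside it. Returns the errno reported by open(), 0 if the file
 * was unexpectedly created, or -1 if the setup itself failed.
 */
static int create_file_in_removed_dir(void)
{
	char dir[] = "/tmp/snapshot-demo-XXXXXX";
	char path[64];
	int fd;

	if (!mkdtemp(dir)) {
		return -1;
	}

	/* Plays the role of step 2: the output directory disappears. */
	if (rmdir(dir) < 0) {
		return -1;
	}

	snprintf(path, sizeof(path), "%s/stream_0", dir);

	/* Plays the role of step 4: output file creation now fails. */
	fd = open(path, O_CREAT | O_WRONLY, 0600);
	if (fd >= 0) {
		close(fd);
		unlink(path);
		return 0;
	}

	return errno;
}
```

On a POSIX system with a writable /tmp, calling
create_file_in_removed_dir() should return ENOENT.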

The scenario seems to be very timing-sensitive. I do not have a plain
command sequence, without a debugger, that triggers the same error.

The proposed patch prevents lttng-consumerd from crashing in step 5.
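
To make the control flow easier to follow, here is a small stand-alone
model of the two error paths. It is only a sketch: struct model_stream,
model_stream_close() and model_snapshot() are hypothetical stand-ins,
not the lttng-tools code, and the return value -2 marks the point where
the real snapshot_channel() would hit assert(!stream->trace_chunk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the consumer stream: only the field that matters. */
struct model_stream {
	const void *trace_chunk; /* non-NULL while a snapshot is in progress */
};

static const int dummy_chunk = 0;

/* Models the reset to NULL that happens in consumer_stream_close(). */
static void model_stream_close(struct model_stream *stream)
{
	stream->trace_chunk = NULL;
}

/*
 * Models one snapshot attempt on one stream.
 *
 * Returns 0 on success, -1 on a simulated failure, and -2 where the
 * real code would trigger assert(!stream->trace_chunk).
 * 'patched' selects the error label used on failure.
 */
static int model_snapshot(struct model_stream *stream, bool fail, bool patched)
{
	int ret;

	if (stream->trace_chunk) {
		return -2; /* stale chunk left over from a failed attempt */
	}
	stream->trace_chunk = &dummy_chunk;

	if (fail) {
		ret = -1;
		if (patched) {
			goto error_close_stream;
		}
		goto error_unlock;
	}

	/* Success path: the stream is closed before returning. */
	model_stream_close(stream);
	return 0;

error_close_stream:
	model_stream_close(stream);
error_unlock:
	return ret;
}
```

With patched set to false, a failed attempt followed by a second attempt
on the same stream returns -1 and then -2, which corresponds to steps 4
and 5. With patched set to true, the second attempt returns 0.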

Kind regards,

Marcel

* Re: [lttng-dev] [PATCH lttng-tools] Fix: cleanup stream on snapshot failure
  2022-05-31 11:28   ` Marcel Hamer via lttng-dev
@ 2022-05-31 13:11     ` Jonathan Rajotte-Julien via lttng-dev
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Rajotte-Julien via lttng-dev @ 2022-05-31 13:11 UTC (permalink / raw)
  To: Marcel Hamer; +Cc: lttng-dev

Hi Marcel,

This is exactly the kind of reproducer we are looking for.

Thanks for providing it. I'll try it out and check if we need anything more in that patch.

Cheers
