All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: snapshot data corrupted after the output buffer is filled
       [not found] <CAF2baFfMNzRQnx-FC7kFRr5m4NpZBEcEohnwf50uOs2vv9=GBw@mail.gmail.com>
@ 2014-11-14 19:59 ` Julien Desfossez
  2014-11-15 17:59 ` Mathieu Desnoyers
       [not found] ` <734976879.13957.1416074394667.JavaMail.zimbra@efficios.com>
  2 siblings, 0 replies; 6+ messages in thread
From: Julien Desfossez @ 2014-11-14 19:59 UTC (permalink / raw)
  To: Anders Wallin, lttng-dev

That's indeed really bad, would you mind filling a bug report on
https://bugs.lttng.org/projects/lttng-tools so we can keep track of this
issue ?

Thanks,

Julien

On 14-11-14 02:28 AM, Anders Wallin wrote:
> Hi,
> 
> it looks like snapshot events became faulty after the output max size is
> reached.
> The next snapshot get mixed events from different buffers. I have tested
> it on 2.3, 2.4 and 2.5
> and it's in all of them, but it hits earlier in 2.5 due to the  problem
> reported in
> http://lists.lttng.org/pipermail/lttng-dev/2014-November/023773.html
> 
> There should never be more then 2 HUNKS in a diff between 2 consecutive
> snapshots, here I got 
> 11 as the worst case.
> 
> TESTCASE;
> 
> tracetest code:
> ---------------------------------------------------------------
> #include <stdio.h>
> #include "tracetest-tp.h"
> 
> int main(int argc, char* argv[])
> {
>         int i;
>         for (i = 0; i < 1000; i++) {
>                 tracepoint(tracetest, first_tp, i, "test");
>         }
>         return 0;
> }
> ---------------------------------------------------------------
> 
> test script:
> ---------------------------------------------------------------
> #!/bin/bash 
> : ${NO_OF_SUBBUF:=6}
> : ${SUBBUF_SIZE:=4096}
> : ${KERNEL_TRACE:=NO}
> : ${LOOPS:=10}
> 
> SESSION=test
> CHANNEL1=ch1
> CHANNEL2=ch2
> 
> CH2_SUBBUF_SIZE=$(($SUBBUF_SIZE*$NO_OF_SUBBUF))
> 
> 
> LTTNG='lttng -n'
> killall lttng-consumerd > /dev/null 2>&1
> killall lttng-sessiond > /dev/null 2>&1
> if [ -z $DEBUG ]; then
>     lttng-sessiond --no-kernel -d
> else
>     lttng-sessiond -v --no-kernel --verbose-consumer -b
> fi
> sleep 1
> 
> if [ ! -z $DEBUG ]; then
>     export LTTNG_UST_DEBUG=1
> fi
> 
> rm -rf $HOME/lttng-traces
> rm snapshot/*.bt
> 
> $LTTNG --version
> 
> $LTTNG create $SESSION --snapshot
> $LTTNG enable-channel -u --subbuf-size $SUBBUF_SIZE --num-subbuf
> $NO_OF_SUBBUF --overwrite $CHANNEL1
> $LTTNG enable-channel -u --subbuf-size $CH2_SUBBUF_SIZE --num-subbuf 2
> --overwrite $CHANNEL2
> $LTTNG enable-event -u -c $CHANNEL1 "*"
> $LTTNG start
> $LTTNG list $SESSION
> 
> mkdir -p snapshot/
> 
> h=1
> for i in $(seq 1 $LOOPS); do
>     $APP
>     $LTTNG snapshot record
>     # print no of events and first and last event                      
>                                                                        
>                                    
>     last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
>     babeltrace $last > snapshot/$i.bt <http://i.bt>
>     cat snapshot/$i.bt <http://i.bt> | wc -l
>     cat snapshot/$i.bt <http://i.bt> | head -1
>     cat snapshot/$i.bt <http://i.bt> | tail -1
>     diff -u snapshot/$i.bt <http://i.bt> snapshot/$h.bt <http://h.bt> >
> snapshot/$i_$h.diff
>     cat snapshot/$i_$h.diff | grep "@@" | wc -l | xargs echo "NUMBER OF
> DIFF HUNKS = "
>     h=$i
> done
> 
> $LTTNG stop
> $LTTNG destroy $SESSION
> 
> ---------------------------------------------------------------
> 
> result from run (removed some output lines for readability)
> ---------------------------------------------------------------
> lttng (LTTng Trace Control) 2.4.2 - Époque Opaque
> Snapshot mode set. Every channel enabled for that session will be set in
> overwrite mode and mmap output.
> Warning: The number of subbuffers (6) is rounded to the next power of 2 (8)
> UST channel ch1 enabled for session test
> Warning: The subbuf size (24576) is rounded to the next power of 2 (32768)
> UST channel ch2 enabled for session test
> UST event * created in channel ch1
> Tracing started for session test
> Tracing session test: [active snapshot]
>     Trace path: 
> 
> === Domain: UST global ===
> 
> Buffer type: per UID
> 
> Channels:
> -------------
> - ch1: [enabled]
> 
>     Attributes:
>       overwrite mode: 1
>       subbufers size: 4096
>       number of subbufers: 8
>       switch timer interval: 0
>       read timer interval: 0
>       output: mmap()
> 
>     Events:
>       * (type: tracepoint) [enabled]
> - ch2: [enabled]
> 
>     Attributes:
>       overwrite mode: 1
>       subbufers size: 32768
>       number of subbufers: 2
>       switch timer interval: 0
>       read timer interval: 0
>       output: mmap()
> 
>     Events:
>       None
> 
> Snapshot recorded successfully for session test
> 1011
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.279975100] (+0.000000975) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  0
> Snapshot recorded successfully for session test
> 2022
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.337703508] (+0.000001238) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 3033
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.432827051] (+0.000000308) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 4044
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.475784825] (+0.000000501) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 4735
> [07:09:43.279031320] (+?.?????????) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 0 }
> [07:09:43.549945106] (+0.000000517) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  12
> Snapshot recorded successfully for session test
> 4437
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.654229030] (+0.000000436) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  3
> Snapshot recorded successfully for session test
> 4746
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.747509349] (+0.000000502) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  11
> Snapshot recorded successfully for session test
> 5757
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.814938128] (+0.000001371) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 6066
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.919112586] (+0.000000990) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  11
> Snapshot recorded successfully for session test
> 6077
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:44.004234933] (+0.000001213) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  2
> Waiting for data availability
> Tracing stopped for session test
> Session test destroyed
> ---------------------------------------------------------------
> 
> Regards
> 
> Anders Wallin
> 
> 
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: snapshot data corrupted after the output buffer is filled
       [not found] <CAF2baFfMNzRQnx-FC7kFRr5m4NpZBEcEohnwf50uOs2vv9=GBw@mail.gmail.com>
  2014-11-14 19:59 ` snapshot data corrupted after the output buffer is filled Julien Desfossez
@ 2014-11-15 17:59 ` Mathieu Desnoyers
       [not found] ` <734976879.13957.1416074394667.JavaMail.zimbra@efficios.com>
  2 siblings, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2014-11-15 17:59 UTC (permalink / raw)
  To: Anders Wallin; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 10416 bytes --]

----- Original Message -----

> From: "Anders Wallin" <wallinux@gmail.com>
> To: lttng-dev@lists.lttng.org
> Sent: Friday, November 14, 2014 8:28:18 AM
> Subject: [lttng-dev] snapshot data corrupted after the output buffer is
> filled

> Hi,

> it looks like snapshot events became faulty after the output max size is
> reached.
> The next snapshot get mixed events from different buffers. I have tested it
> on 2.3, 2.4 and 2.5
> and it's in all of them, but it hits earlier in 2.5 due to the problem
> reported in
> http://lists.lttng.org/pipermail/lttng-dev/2014-November/023773.html

> There should never be more then 2 HUNKS in a diff between 2 consecutive
> snapshots, here I got
> 11 as the worst case.

Hi Anders, 

This is not a bug, rather just per-cpu buffering at work.It does have some 
rather non-obvious consequences in terms of snapshot event flow. 

To understand this, we need to keep in mind that LTTng does everything 
in per-cpu buffers, and that overwrite of oldest events in buffer full conditions 
happens per-cpu. 

The use-case you show below really makes sense when you start to consider 
on which CPU (and thus per-cpu buffer) the application is running. Initially, 
the first app runs on CPU 6. As long as the following applications run on 
other cpus, we keep the same first event. Then when enough events are 
written in cpu 6 buffer to overwrite the first event, we see a first event change. 
Then, depending on the scheduling of applications, older parts of per-cpu 
buffers are overwritten, and others are not (if there are less events in those 
buffer). This means some per-cpu buffers will have events going further back 
in time compared to other higher-throughput CPUs. 

This is why you see more than 2 hunks in your diff. Babeltrace merges those 
per-cpu buffers back into a single text output, and the fact that some events 
have been overwritten for a time-range for high-throughput buffers, but not 
for low-throughput buffers, make it appear as if events are missing in the 
stream if your model is a single system-wide buffer, but the missing events 
make sense if you see it as a merge of per-cpu buffers. 

Our current behavior is to gather all available data when taking a snapshot, 
and babeltrace behavior is to show all events available in the streams. We 
could perhaps consider a different babeltrace behavior (a flag) that would 
allow it to only show the events for the intersection of time range of all 
streams, based on the packet header timestamp begin/end. This would 
take care of hiding events that happen in low-throughput streams when 
there is no intersection with snapshotted high-throughput streams. 

And perhaps this should be the default ? 

Thanks! 

Mathieu 

> TESTCASE;

> tracetest code:
> ---------------------------------------------------------------
> #include <stdio.h>
> #include "tracetest-tp.h"

> int main(int argc, char* argv[])
> {
> int i;
> for (i = 0; i < 1000; i++) {
> tracepoint(tracetest, first_tp, i, "test");
> }
> return 0;
> }
> ---------------------------------------------------------------

> test script:
> ---------------------------------------------------------------
> #!/bin/bash
> : ${NO_OF_SUBBUF:=6}
> : ${SUBBUF_SIZE:=4096}
> : ${KERNEL_TRACE:=NO}
> : ${LOOPS:=10}

> SESSION=test
> CHANNEL1=ch1
> CHANNEL2=ch2

> CH2_SUBBUF_SIZE=$(($SUBBUF_SIZE*$NO_OF_SUBBUF))

> LTTNG='lttng -n'
> killall lttng-consumerd > /dev/null 2>&1
> killall lttng-sessiond > /dev/null 2>&1
> if [ -z $DEBUG ]; then
> lttng-sessiond --no-kernel -d
> else
> lttng-sessiond -v --no-kernel --verbose-consumer -b
> fi
> sleep 1

> if [ ! -z $DEBUG ]; then
> export LTTNG_UST_DEBUG=1
> fi

> rm -rf $HOME/lttng-traces
> rm snapshot/*.bt

> $LTTNG --version

> $LTTNG create $SESSION --snapshot
> $LTTNG enable-channel -u --subbuf-size $SUBBUF_SIZE --num-subbuf
> $NO_OF_SUBBUF --overwrite $CHANNEL1
> $LTTNG enable-channel -u --subbuf-size $CH2_SUBBUF_SIZE --num-subbuf 2
> --overwrite $CHANNEL2
> $LTTNG enable-event -u -c $CHANNEL1 "*"
> $LTTNG start
> $LTTNG list $SESSION

> mkdir -p snapshot/

> h=1
> for i in $(seq 1 $LOOPS); do
> $APP
> $LTTNG snapshot record
> # print no of events and first and last event
> last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
> babeltrace $last > snapshot/$ i.bt
> cat snapshot/$ i.bt | wc -l
> cat snapshot/$ i.bt | head -1
> cat snapshot/$ i.bt | tail -1
> diff -u snapshot/$ i.bt snapshot/$ h.bt > snapshot/$i_$h.diff
> cat snapshot/$i_$h.diff | grep "@@" | wc -l | xargs echo "NUMBER OF DIFF
> HUNKS = "
> h=$i
> done

> $LTTNG stop
> $LTTNG destroy $SESSION

> ---------------------------------------------------------------

> result from run (removed some output lines for readability)
> ---------------------------------------------------------------
> lttng (LTTng Trace Control) 2.4.2 - Époque Opaque
> Snapshot mode set. Every channel enabled for that session will be set in
> overwrite mode and mmap output.
> Warning: The number of subbuffers (6) is rounded to the next power of 2 (8)
> UST channel ch1 enabled for session test
> Warning: The subbuf size (24576) is rounded to the next power of 2 (32768)
> UST channel ch2 enabled for session test
> UST event * created in channel ch1
> Tracing started for session test
> Tracing session test: [active snapshot]
> Trace path:

> === Domain: UST global ===

> Buffer type: per UID

> Channels:
> -------------
> - ch1: [enabled]

> Attributes:
> overwrite mode: 1
> subbufers size: 4096
> number of subbufers: 8
> switch timer interval: 0
> read timer interval: 0
> output: mmap()

> Events:
> * (type: tracepoint) [enabled]
> - ch2: [enabled]

> Attributes:
> overwrite mode: 1
> subbufers size: 32768
> number of subbufers: 2
> switch timer interval: 0
> read timer interval: 0
> output: mmap()

> Events:
> None

> Snapshot recorded successfully for session test
> 1011
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.279975100] (+0.000000975) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 0
> Snapshot recorded successfully for session test
> 2022
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.337703508] (+0.000001238) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 4 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 1
> Snapshot recorded successfully for session test
> 3033
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.432827051] (+0.000000308) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 1
> Snapshot recorded successfully for session test
> 4044
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.475784825] (+0.000000501) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 1
> Snapshot recorded successfully for session test
> 4735
> [07:09:43.279031320] (+?.?????????) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 5 }, { my_string_field = "test", my_integer_field = 0 }
> [07:09:43.549945106] (+0.000000517) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 12
> Snapshot recorded successfully for session test
> 4437
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.654229030] (+0.000000436) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 3
> Snapshot recorded successfully for session test
> 4746
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.747509349] (+0.000000502) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 11
> Snapshot recorded successfully for session test
> 5757
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.814938128] (+0.000001371) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 4 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 1
> Snapshot recorded successfully for session test
> 6066
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:43.919112586] (+0.000000990) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 11
> Snapshot recorded successfully for session test
> 6077
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f ust_baddr_statedump:soinfo:
> { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0, mtime
> = -1 }
> [07:09:44.004234933] (+0.000001213) 07f8cf84a38f tracetest:first_tp: { cpu_id
> = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS = 2
> Waiting for data availability
> Tracing stopped for session test
> Session test destroyed
> ---------------------------------------------------------------

> Regards

> Anders Wallin

> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 

[-- Attachment #1.2: Type: text/html, Size: 18108 bytes --]

[-- Attachment #2: Type: text/plain, Size: 155 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: snapshot data corrupted after the output buffer is filled
       [not found] ` <734976879.13957.1416074394667.JavaMail.zimbra@efficios.com>
@ 2014-11-17  9:45   ` Anders Wallin
       [not found]   ` <CAF2baFcfh8h769xaxLQpW4=9nmWRBcmi7tP79mW=w1DyqJm9Hw@mail.gmail.com>
  1 sibling, 0 replies; 6+ messages in thread
From: Anders Wallin @ 2014-11-17  9:45 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 11118 bytes --]

Thx Mathieu,

if I'm running my test on one cpu (taskset -c 1 ./test.run) it works as
expected and described by you.
The babeltrace flag would be a good solution! Do you want me to file a
bug/enhancement on this?

Regards



Anders Wallin

On Sat, Nov 15, 2014 at 6:59 PM, Mathieu Desnoyers <
mathieu.desnoyers@efficios.com> wrote:

> ------------------------------
>
> *From: *"Anders Wallin" <wallinux@gmail.com>
> *To: *lttng-dev@lists.lttng.org
> *Sent: *Friday, November 14, 2014 8:28:18 AM
> *Subject: *[lttng-dev] snapshot data corrupted after the output buffer
> is        filled
>
> Hi,
>
> it looks like snapshot events became faulty after the output max size is
> reached.
> The next snapshot get mixed events from different buffers. I have tested
> it on 2.3, 2.4 and 2.5
> and it's in all of them, but it hits earlier in 2.5 due to the  problem
> reported in
> http://lists.lttng.org/pipermail/lttng-dev/2014-November/023773.html
>
> There should never be more then 2 HUNKS in a diff between 2 consecutive
> snapshots, here I got
> 11 as the worst case.
>
> Hi Anders,
>
> This is not a bug, rather just per-cpu buffering at work.It does have some
> rather non-obvious consequences in terms of snapshot event flow.
>
> To understand this, we need to keep in mind that LTTng does everything
> in per-cpu buffers, and that overwrite of oldest events in buffer full
> conditions
> happens per-cpu.
>
> The use-case you show below really makes sense when you start to consider
> on which CPU (and thus per-cpu buffer) the application is running.
> Initially,
> the first app runs on CPU 6. As long as the following applications run on
> other cpus, we keep the same first event. Then when enough events are
> written in cpu 6 buffer to overwrite the first event, we see a first event
> change.
> Then, depending on the scheduling of applications, older parts of per-cpu
> buffers are overwritten, and others are not (if there are less events in
> those
> buffer). This means some per-cpu buffers will have events going further
> back
> in time compared to other higher-throughput CPUs.
>
> This is why you see more than 2 hunks in your diff. Babeltrace merges those
> per-cpu buffers back into a single text output, and the fact that some
> events
> have been overwritten for a time-range for high-throughput buffers, but not
> for low-throughput buffers, make it appear as if events are missing in the
> stream if your model is a single system-wide buffer, but the missing events
> make sense if you see it as a merge of per-cpu buffers.
>
> Our current behavior is to gather all available data when taking a
> snapshot,
> and babeltrace behavior is to show all events available in the streams. We
> could perhaps consider a different babeltrace behavior (a flag) that would
> allow it to only show the events for the intersection of time range of all
> streams, based on the packet header timestamp begin/end. This would
> take care of hiding events that happen in low-throughput streams when
> there is no intersection with snapshotted high-throughput streams.
>
> And perhaps this should be the default ?
>
> Thanks!
>
> Mathieu
>
>
> TESTCASE;
>
> tracetest code:
> ---------------------------------------------------------------
> #include <stdio.h>
> #include "tracetest-tp.h"
>
> int main(int argc, char* argv[])
> {
>         int i;
>         for (i = 0; i < 1000; i++) {
>                 tracepoint(tracetest, first_tp, i, "test");
>         }
>         return 0;
> }
> ---------------------------------------------------------------
>
> test script:
> ---------------------------------------------------------------
> #!/bin/bash
> : ${NO_OF_SUBBUF:=6}
> : ${SUBBUF_SIZE:=4096}
> : ${KERNEL_TRACE:=NO}
> : ${LOOPS:=10}
>
> SESSION=test
> CHANNEL1=ch1
> CHANNEL2=ch2
>
> CH2_SUBBUF_SIZE=$(($SUBBUF_SIZE*$NO_OF_SUBBUF))
>
>
> LTTNG='lttng -n'
> killall lttng-consumerd > /dev/null 2>&1
> killall lttng-sessiond > /dev/null 2>&1
> if [ -z $DEBUG ]; then
>     lttng-sessiond --no-kernel -d
> else
>     lttng-sessiond -v --no-kernel --verbose-consumer -b
> fi
> sleep 1
>
> if [ ! -z $DEBUG ]; then
>     export LTTNG_UST_DEBUG=1
> fi
>
> rm -rf $HOME/lttng-traces
> rm snapshot/*.bt
>
> $LTTNG --version
>
> $LTTNG create $SESSION --snapshot
> $LTTNG enable-channel -u --subbuf-size $SUBBUF_SIZE --num-subbuf
> $NO_OF_SUBBUF --overwrite $CHANNEL1
> $LTTNG enable-channel -u --subbuf-size $CH2_SUBBUF_SIZE --num-subbuf 2
> --overwrite $CHANNEL2
> $LTTNG enable-event -u -c $CHANNEL1 "*"
> $LTTNG start
> $LTTNG list $SESSION
>
> mkdir -p snapshot/
>
> h=1
> for i in $(seq 1 $LOOPS); do
>     $APP
>     $LTTNG snapshot record
>     # print no of events and first and last event
>
>
>     last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
>     babeltrace $last > snapshot/$i.bt
>     cat snapshot/$i.bt | wc -l
>     cat snapshot/$i.bt | head -1
>     cat snapshot/$i.bt | tail -1
>     diff -u snapshot/$i.bt snapshot/$h.bt > snapshot/$i_$h.diff
>     cat snapshot/$i_$h.diff | grep "@@" | wc -l | xargs echo "NUMBER OF
> DIFF HUNKS = "
>     h=$i
> done
>
> $LTTNG stop
> $LTTNG destroy $SESSION
>
> ---------------------------------------------------------------
>
> result from run (removed some output lines for readability)
> ---------------------------------------------------------------
> lttng (LTTng Trace Control) 2.4.2 - Époque Opaque
> Snapshot mode set. Every channel enabled for that session will be set in
> overwrite mode and mmap output.
> Warning: The number of subbuffers (6) is rounded to the next power of 2 (8)
> UST channel ch1 enabled for session test
> Warning: The subbuf size (24576) is rounded to the next power of 2 (32768)
> UST channel ch2 enabled for session test
> UST event * created in channel ch1
> Tracing started for session test
> Tracing session test: [active snapshot]
>     Trace path:
>
> === Domain: UST global ===
>
> Buffer type: per UID
>
> Channels:
> -------------
> - ch1: [enabled]
>
>     Attributes:
>       overwrite mode: 1
>       subbufers size: 4096
>       number of subbufers: 8
>       switch timer interval: 0
>       read timer interval: 0
>       output: mmap()
>
>     Events:
>       * (type: tracepoint) [enabled]
> - ch2: [enabled]
>
>     Attributes:
>       overwrite mode: 1
>       subbufers size: 32768
>       number of subbufers: 2
>       switch timer interval: 0
>       read timer interval: 0
>       output: mmap()
>
>     Events:
>       None
>
> Snapshot recorded successfully for session test
> 1011
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.279975100] (+0.000000975) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  0
> Snapshot recorded successfully for session test
> 2022
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.337703508] (+0.000001238) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 3033
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.432827051] (+0.000000308) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 4044
> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.475784825] (+0.000000501) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 4735
> [07:09:43.279031320] (+?.?????????) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 0 }
> [07:09:43.549945106] (+0.000000517) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  12
> Snapshot recorded successfully for session test
> 4437
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.654229030] (+0.000000436) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  3
> Snapshot recorded successfully for session test
> 4746
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.747509349] (+0.000000502) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  11
> Snapshot recorded successfully for session test
> 5757
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.814938128] (+0.000001371) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  1
> Snapshot recorded successfully for session test
> 6066
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:43.919112586] (+0.000000990) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  11
> Snapshot recorded successfully for session test
> 6077
> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
> sopath = "[vdso]", size = 0, mtime = -1 }
> [07:09:44.004234933] (+0.000001213) 07f8cf84a38f tracetest:first_tp: {
> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
> NUMBER OF DIFF HUNKS =  2
> Waiting for data availability
> Tracing stopped for session test
> Session test destroyed
> ---------------------------------------------------------------
>
> Regards
>
> Anders Wallin
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
>
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>

[-- Attachment #1.2: Type: text/html, Size: 15504 bytes --]

[-- Attachment #2: Type: text/plain, Size: 155 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: snapshot data corrupted after the output buffer is filled
       [not found]   ` <CAF2baFcfh8h769xaxLQpW4=9nmWRBcmi7tP79mW=w1DyqJm9Hw@mail.gmail.com>
@ 2014-11-17 11:45     ` Mathieu Desnoyers
       [not found]     ` <2137471544.14585.1416224714944.JavaMail.zimbra@efficios.com>
  1 sibling, 0 replies; 6+ messages in thread
From: Mathieu Desnoyers @ 2014-11-17 11:45 UTC (permalink / raw)
  To: Anders Wallin; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 14357 bytes --]

Hi Anders, 

Yes, please file a feature request against Babeltrace on 
the bug tracker pointing the this email thread and 
describing the feature. 

Thanks! 

Mathieu 

----- Original Message -----

> From: "Anders Wallin" <wallinux@gmail.com>
> To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> Cc: lttng-dev@lists.lttng.org
> Sent: Monday, November 17, 2014 10:45:37 AM
> Subject: Re: [lttng-dev] snapshot data corrupted after the output buffer is
> filled

> Thx Mathieu,

> if I'm running my test on one cpu (taskset -c 1 ./test.run) it works as
> expected and described by you.
> The babeltrace flag would be a good solution! Do you want me to file a
> bug/enhancement on this?

> Regards

> Anders Wallin

> On Sat, Nov 15, 2014 at 6:59 PM, Mathieu Desnoyers <
> mathieu.desnoyers@efficios.com > wrote:

> > > From: "Anders Wallin" < wallinux@gmail.com >
> > 
> 
> > > To: lttng-dev@lists.lttng.org
> > 
> 
> > > Sent: Friday, November 14, 2014 8:28:18 AM
> > 
> 
> > > Subject: [lttng-dev] snapshot data corrupted after the output buffer is
> > > filled
> > 
> 

> > > Hi,
> > 
> 

> > > it looks like snapshot events became faulty after the output max size is
> > > reached.
> > 
> 
> > > The next snapshot get mixed events from different buffers. I have tested
> > > it
> > > on 2.3, 2.4 and 2.5
> > 
> 
> > > and it's in all of them, but it hits earlier in 2.5 due to the problem
> > > reported in
> > 
> 
> > > http://lists.lttng.org/pipermail/lttng-dev/2014-November/023773.html
> > 
> 

> > > There should never be more then 2 HUNKS in a diff between 2 consecutive
> > > snapshots, here I got
> > 
> 
> > > 11 as the worst case.
> > 
> 

> > Hi Anders,
> 

> > This is not a bug, rather just per-cpu buffering at work.It does have some
> 
> > rather non-obvious consequences in terms of snapshot event flow.
> 

> > To understand this, we need to keep in mind that LTTng does everything
> 
> > in per-cpu buffers, and that overwrite of oldest events in buffer full
> > conditions
> 
> > happens per-cpu.
> 

> > The use-case you show below really makes sense when you start to consider
> 
> > on which CPU (and thus per-cpu buffer) the application is running.
> > Initially,
> 
> > the first app runs on CPU 6. As long as the following applications run on
> 
> > other cpus, we keep the same first event. Then when enough events are
> 
> > written in cpu 6 buffer to overwrite the first event, we see a first event
> > change.
> 
> > Then, depending on the scheduling of applications, older parts of per-cpu
> 
> > buffers are overwritten, and others are not (if there are less events in
> > those
> 
> > buffer). This means some per-cpu buffers will have events going further
> > back
> 
> > in time compared to other higher-throughput CPUs.
> 

> > This is why you see more than 2 hunks in your diff. Babeltrace merges those
> 
> > per-cpu buffers back into a single text output, and the fact that some
> > events
> 
> > have been overwritten for a time-range for high-throughput buffers, but not
> 
> > for low-throughput buffers, make it appear as if events are missing in the
> 
> > stream if your model is a single system-wide buffer, but the missing events
> 
> > make sense if you see it as a merge of per-cpu buffers.
> 

> > Our current behavior is to gather all available data when taking a
> > snapshot,
> 
> > and babeltrace behavior is to show all events available in the streams. We
> 
> > could perhaps consider a different babeltrace behavior (a flag) that would
> 
> > allow it to only show the events for the intersection of time range of all
> 
> > streams, based on the packet header timestamp begin/end. This would
> 
> > take care of hiding events that happen in low-throughput streams when
> 
> > there is no intersection with snapshotted high-throughput streams.
> 

> > And perhaps this should be the default ?
> 

> > Thanks!
> 

> > Mathieu
> 

> > > TESTCASE;
> > 
> 

> > > tracetest code:
> > 
> 
> > > ---------------------------------------------------------------
> > 
> 
> > > #include <stdio.h>
> > 
> 
> > > #include "tracetest-tp.h"
> > 
> 

> > > int main(int argc, char* argv[])
> > 
> 
> > > {
> > 
> 
> > > int i;
> > 
> 
> > > for (i = 0; i < 1000; i++) {
> > 
> 
> > > tracepoint(tracetest, first_tp, i, "test");
> > 
> 
> > > }
> > 
> 
> > > return 0;
> > 
> 
> > > }
> > 
> 
> > > ---------------------------------------------------------------
> > 
> 

> > > test script:
> > 
> 
> > > ---------------------------------------------------------------
> > 
> 
> > > #!/bin/bash
> > 
> 
> > > : ${NO_OF_SUBBUF:=6}
> > 
> 
> > > : ${SUBBUF_SIZE:=4096}
> > 
> 
> > > : ${KERNEL_TRACE:=NO}
> > 
> 
> > > : ${LOOPS:=10}
> > 
> 

> > > SESSION=test
> > 
> 
> > > CHANNEL1=ch1
> > 
> 
> > > CHANNEL2=ch2
> > 
> 

> > > CH2_SUBBUF_SIZE=$(($SUBBUF_SIZE*$NO_OF_SUBBUF))
> > 
> 

> > > LTTNG='lttng -n'
> > 
> 
> > > killall lttng-consumerd > /dev/null 2>&1
> > 
> 
> > > killall lttng-sessiond > /dev/null 2>&1
> > 
> 
> > > if [ -z $DEBUG ]; then
> > 
> 
> > > lttng-sessiond --no-kernel -d
> > 
> 
> > > else
> > 
> 
> > > lttng-sessiond -v --no-kernel --verbose-consumer -b
> > 
> 
> > > fi
> > 
> 
> > > sleep 1
> > 
> 

> > > if [ ! -z $DEBUG ]; then
> > 
> 
> > > export LTTNG_UST_DEBUG=1
> > 
> 
> > > fi
> > 
> 

> > > rm -rf $HOME/lttng-traces
> > 
> 
> > > rm snapshot/*.bt
> > 
> 

> > > $LTTNG --version
> > 
> 

> > > $LTTNG create $SESSION --snapshot
> > 
> 
> > > $LTTNG enable-channel -u --subbuf-size $SUBBUF_SIZE --num-subbuf
> > > $NO_OF_SUBBUF --overwrite $CHANNEL1
> > 
> 
> > > $LTTNG enable-channel -u --subbuf-size $CH2_SUBBUF_SIZE --num-subbuf 2
> > > --overwrite $CHANNEL2
> > 
> 
> > > $LTTNG enable-event -u -c $CHANNEL1 "*"
> > 
> 
> > > $LTTNG start
> > 
> 
> > > $LTTNG list $SESSION
> > 
> 

> > > mkdir -p snapshot/
> > 
> 

> > > h=1
> > 
> 
> > > for i in $(seq 1 $LOOPS); do
> > 
> 
> > > $APP
> > 
> 
> > > $LTTNG snapshot record
> > 
> 
> > > # print no of events and first and last event
> > 
> 
> > > last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
> > 
> 
> > > babeltrace $last > snapshot/$ i.bt
> > 
> 
> > > cat snapshot/$ i.bt | wc -l
> > 
> 
> > > cat snapshot/$ i.bt | head -1
> > 
> 
> > > cat snapshot/$ i.bt | tail -1
> > 
> 
> > > diff -u snapshot/$ i.bt snapshot/$ h.bt > snapshot/$i_$h.diff
> > 
> 
> > > cat snapshot/$i_$h.diff | grep "@@" | wc -l | xargs echo "NUMBER OF DIFF
> > > HUNKS = "
> > 
> 
> > > h=$i
> > 
> 
> > > done
> > 
> 

> > > $LTTNG stop
> > 
> 
> > > $LTTNG destroy $SESSION
> > 
> 

> > > ---------------------------------------------------------------
> > 
> 

> > > result from run (removed some output lines for readability)
> > 
> 
> > > ---------------------------------------------------------------
> > 
> 
> > > lttng (LTTng Trace Control) 2.4.2 - Époque Opaque
> > 
> 
> > > Snapshot mode set. Every channel enabled for that session will be set in
> > > overwrite mode and mmap output.
> > 
> 
> > > Warning: The number of subbuffers (6) is rounded to the next power of 2
> > > (8)
> > 
> 
> > > UST channel ch1 enabled for session test
> > 
> 
> > > Warning: The subbuf size (24576) is rounded to the next power of 2
> > > (32768)
> > 
> 
> > > UST channel ch2 enabled for session test
> > 
> 
> > > UST event * created in channel ch1
> > 
> 
> > > Tracing started for session test
> > 
> 
> > > Tracing session test: [active snapshot]
> > 
> 
> > > Trace path:
> > 
> 

> > > === Domain: UST global ===
> > 
> 

> > > Buffer type: per UID
> > 
> 

> > > Channels:
> > 
> 
> > > -------------
> > 
> 
> > > - ch1: [enabled]
> > 
> 

> > > Attributes:
> > 
> 
> > > overwrite mode: 1
> > 
> 
> > > subbufers size: 4096
> > 
> 
> > > number of subbufers: 8
> > 
> 
> > > switch timer interval: 0
> > 
> 
> > > read timer interval: 0
> > 
> 
> > > output: mmap()
> > 
> 

> > > Events:
> > 
> 
> > > * (type: tracepoint) [enabled]
> > 
> 
> > > - ch2: [enabled]
> > 
> 

> > > Attributes:
> > 
> 
> > > overwrite mode: 1
> > 
> 
> > > subbufers size: 32768
> > 
> 
> > > number of subbufers: 2
> > 
> 
> > > switch timer interval: 0
> > 
> 
> > > read timer interval: 0
> > 
> 
> > > output: mmap()
> > 
> 

> > > Events:
> > 
> 
> > > None
> > 
> 

> > > Snapshot recorded successfully for session test
> > 
> 
> > > 1011
> > 
> 
> > > [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.279975100] (+0.000000975) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 5 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 0
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 2022
> > 
> 
> > > [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.337703508] (+0.000001238) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 4 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 1
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 3033
> > 
> 
> > > [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.432827051] (+0.000000308) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 1
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 4044
> > 
> 
> > > [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 6 }, { baddr = 0x7FFF645EE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.475784825] (+0.000000501) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 5 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 1
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 4735
> > 
> 
> > > [07:09:43.279031320] (+?.?????????) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 5 }, { my_string_field = "test", my_integer_field = 0 }
> > 
> 
> > > [07:09:43.549945106] (+0.000000517) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 12
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 4437
> > 
> 
> > > [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.654229030] (+0.000000436) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 5 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 3
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 4746
> > 
> 
> > > [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.747509349] (+0.000000502) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 11
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 5757
> > 
> 
> > > [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.814938128] (+0.000001371) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 4 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 1
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 6066
> > 
> 
> > > [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:43.919112586] (+0.000000990) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 11
> > 
> 
> > > Snapshot recorded successfully for session test
> > 
> 
> > > 6077
> > 
> 
> > > [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
> > > ust_baddr_statedump:soinfo:
> > > { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000, sopath = "[vdso]", size = 0,
> > > mtime
> > > = -1 }
> > 
> 
> > > [07:09:44.004234933] (+0.000001213) 07f8cf84a38f tracetest:first_tp: {
> > > cpu_id
> > > = 6 }, { my_string_field = "test", my_integer_field = 999 }
> > 
> 
> > > NUMBER OF DIFF HUNKS = 2
> > 
> 
> > > Waiting for data availability
> > 
> 
> > > Tracing stopped for session test
> > 
> 
> > > Session test destroyed
> > 
> 
> > > ---------------------------------------------------------------
> > 
> 

> > > Regards
> > 
> 

> > > Anders Wallin
> > 
> 

> > > _______________________________________________
> > 
> 
> > > lttng-dev mailing list
> > 
> 
> > > lttng-dev@lists.lttng.org
> > 
> 
> > > http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> > 
> 

> > --
> 
> > Mathieu Desnoyers
> 
> > EfficiOS Inc.
> 
> > http://www.efficios.com
> 

-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 

[-- Attachment #1.2: Type: text/html, Size: 17236 bytes --]

[-- Attachment #2: Type: text/plain, Size: 155 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: snapshot data corrupted after the output buffer is filled
       [not found]     ` <2137471544.14585.1416224714944.JavaMail.zimbra@efficios.com>
@ 2014-11-19 10:31       ` Anders Wallin
  0 siblings, 0 replies; 6+ messages in thread
From: Anders Wallin @ 2014-11-19 10:31 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 12230 bytes --]

Filed as https://bugs.lttng.org/issues/862



Anders Wallin

On Mon, Nov 17, 2014 at 12:45 PM, Mathieu Desnoyers <
mathieu.desnoyers@efficios.com> wrote:

> Hi Anders,
>
> Yes, please file a feature request against Babeltrace on
> the bug tracker pointing the this email thread and
> describing the feature.
>
> Thanks!
>
> Mathieu
>
>
> ------------------------------
>
> *From: *"Anders Wallin" <wallinux@gmail.com>
> *To: *"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
> *Cc: *lttng-dev@lists.lttng.org
> *Sent: *Monday, November 17, 2014 10:45:37 AM
> *Subject: *Re: [lttng-dev] snapshot data corrupted after the output
> buffer is filled
>
>
> Thx Mathieu,
>
> if I'm running my test on one cpu (taskset -c 1 ./test.run) it works as
> expected and described by you.
> The babeltrace flag would be a good solution! Do you want me to file a
> bug/enhancement on this?
>
> Regards
>
>
>
> Anders Wallin
>
> On Sat, Nov 15, 2014 at 6:59 PM, Mathieu Desnoyers <
> mathieu.desnoyers@efficios.com> wrote:
>
>> ------------------------------
>>
>> *From: *"Anders Wallin" <wallinux@gmail.com>
>> *To: *lttng-dev@lists.lttng.org
>> *Sent: *Friday, November 14, 2014 8:28:18 AM
>> *Subject: *[lttng-dev] snapshot data corrupted after the output buffer
>> is        filled
>>
>> Hi,
>>
>> it looks like snapshot events became faulty after the output max size is
>> reached.
>> The next snapshot get mixed events from different buffers. I have tested
>> it on 2.3, 2.4 and 2.5
>> and it's in all of them, but it hits earlier in 2.5 due to the  problem
>> reported in
>> http://lists.lttng.org/pipermail/lttng-dev/2014-November/023773.html
>>
>> There should never be more then 2 HUNKS in a diff between 2 consecutive
>> snapshots, here I got
>> 11 as the worst case.
>>
>> Hi Anders,
>>
>> This is not a bug, rather just per-cpu buffering at work.It does have some
>> rather non-obvious consequences in terms of snapshot event flow.
>>
>> To understand this, we need to keep in mind that LTTng does everything
>> in per-cpu buffers, and that overwrite of oldest events in buffer full
>> conditions
>> happens per-cpu.
>>
>> The use-case you show below really makes sense when you start to consider
>> on which CPU (and thus per-cpu buffer) the application is running.
>> Initially,
>> the first app runs on CPU 6. As long as the following applications run on
>> other cpus, we keep the same first event. Then when enough events are
>> written in cpu 6 buffer to overwrite the first event, we see a first
>> event change.
>> Then, depending on the scheduling of applications, older parts of per-cpu
>> buffers are overwritten, and others are not (if there are less events in
>> those
>> buffer). This means some per-cpu buffers will have events going further
>> back
>> in time compared to other higher-throughput CPUs.
>>
>> This is why you see more than 2 hunks in your diff. Babeltrace merges
>> those
>> per-cpu buffers back into a single text output, and the fact that some
>> events
>> have been overwritten for a time-range for high-throughput buffers, but
>> not
>> for low-throughput buffers, make it appear as if events are missing in the
>> stream if your model is a single system-wide buffer, but the missing
>> events
>> make sense if you see it as a merge of per-cpu buffers.
>>
>> Our current behavior is to gather all available data when taking a
>> snapshot,
>> and babeltrace behavior is to show all events available in the streams. We
>> could perhaps consider a different babeltrace behavior (a flag) that would
>> allow it to only show the events for the intersection of time range of all
>> streams, based on the packet header timestamp begin/end. This would
>> take care of hiding events that happen in low-throughput streams when
>> there is no intersection with snapshotted high-throughput streams.
>>
>> And perhaps this should be the default ?
>>
>> Thanks!
>>
>> Mathieu
>>
>>
>> TESTCASE;
>>
>> tracetest code:
>> ---------------------------------------------------------------
>> #include <stdio.h>
>> #include "tracetest-tp.h"
>>
>> int main(int argc, char* argv[])
>> {
>>         int i;
>>         for (i = 0; i < 1000; i++) {
>>                 tracepoint(tracetest, first_tp, i, "test");
>>         }
>>         return 0;
>> }
>> ---------------------------------------------------------------
>>
>> test script:
>> ---------------------------------------------------------------
>> #!/bin/bash
>> : ${NO_OF_SUBBUF:=6}
>> : ${SUBBUF_SIZE:=4096}
>> : ${KERNEL_TRACE:=NO}
>> : ${LOOPS:=10}
>>
>> SESSION=test
>> CHANNEL1=ch1
>> CHANNEL2=ch2
>>
>> CH2_SUBBUF_SIZE=$(($SUBBUF_SIZE*$NO_OF_SUBBUF))
>>
>>
>> LTTNG='lttng -n'
>> killall lttng-consumerd > /dev/null 2>&1
>> killall lttng-sessiond > /dev/null 2>&1
>> if [ -z $DEBUG ]; then
>>     lttng-sessiond --no-kernel -d
>> else
>>     lttng-sessiond -v --no-kernel --verbose-consumer -b
>> fi
>> sleep 1
>>
>> if [ ! -z $DEBUG ]; then
>>     export LTTNG_UST_DEBUG=1
>> fi
>>
>> rm -rf $HOME/lttng-traces
>> rm snapshot/*.bt
>>
>> $LTTNG --version
>>
>> $LTTNG create $SESSION --snapshot
>> $LTTNG enable-channel -u --subbuf-size $SUBBUF_SIZE --num-subbuf
>> $NO_OF_SUBBUF --overwrite $CHANNEL1
>> $LTTNG enable-channel -u --subbuf-size $CH2_SUBBUF_SIZE --num-subbuf 2
>> --overwrite $CHANNEL2
>> $LTTNG enable-event -u -c $CHANNEL1 "*"
>> $LTTNG start
>> $LTTNG list $SESSION
>>
>> mkdir -p snapshot/
>>
>> h=1
>> for i in $(seq 1 $LOOPS); do
>>     $APP
>>     $LTTNG snapshot record
>>     # print no of events and first and last event
>>
>>
>>     last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
>>     babeltrace $last > snapshot/$i.bt
>>     cat snapshot/$i.bt | wc -l
>>     cat snapshot/$i.bt | head -1
>>     cat snapshot/$i.bt | tail -1
>>     diff -u snapshot/$i.bt snapshot/$h.bt > snapshot/$i_$h.diff
>>     cat snapshot/$i_$h.diff | grep "@@" | wc -l | xargs echo "NUMBER OF
>> DIFF HUNKS = "
>>     h=$i
>> done
>>
>> $LTTNG stop
>> $LTTNG destroy $SESSION
>>
>> ---------------------------------------------------------------
>>
>> result from run (removed some output lines for readability)
>> ---------------------------------------------------------------
>> lttng (LTTng Trace Control) 2.4.2 - Époque Opaque
>> Snapshot mode set. Every channel enabled for that session will be set in
>> overwrite mode and mmap output.
>> Warning: The number of subbuffers (6) is rounded to the next power of 2
>> (8)
>> UST channel ch1 enabled for session test
>> Warning: The subbuf size (24576) is rounded to the next power of 2 (32768)
>> UST channel ch2 enabled for session test
>> UST event * created in channel ch1
>> Tracing started for session test
>> Tracing session test: [active snapshot]
>>     Trace path:
>>
>> === Domain: UST global ===
>>
>> Buffer type: per UID
>>
>> Channels:
>> -------------
>> - ch1: [enabled]
>>
>>     Attributes:
>>       overwrite mode: 1
>>       subbufers size: 4096
>>       number of subbufers: 8
>>       switch timer interval: 0
>>       read timer interval: 0
>>       output: mmap()
>>
>>     Events:
>>       * (type: tracepoint) [enabled]
>> - ch2: [enabled]
>>
>>     Attributes:
>>       overwrite mode: 1
>>       subbufers size: 32768
>>       number of subbufers: 2
>>       switch timer interval: 0
>>       read timer interval: 0
>>       output: mmap()
>>
>>     Events:
>>       None
>>
>> Snapshot recorded successfully for session test
>> 1011
>> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.279975100] (+0.000000975) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  0
>> Snapshot recorded successfully for session test
>> 2022
>> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.337703508] (+0.000001238) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  1
>> Snapshot recorded successfully for session test
>> 3033
>> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.432827051] (+0.000000308) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  1
>> Snapshot recorded successfully for session test
>> 4044
>> [07:09:43.278933559] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.475784825] (+0.000000501) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  1
>> Snapshot recorded successfully for session test
>> 4735
>> [07:09:43.279031320] (+?.?????????) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 0 }
>> [07:09:43.549945106] (+0.000000517) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  12
>> Snapshot recorded successfully for session test
>> 4437
>> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.654229030] (+0.000000436) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  3
>> Snapshot recorded successfully for session test
>> 4746
>> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.747509349] (+0.000000502) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  11
>> Snapshot recorded successfully for session test
>> 5757
>> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.814938128] (+0.000001371) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  1
>> Snapshot recorded successfully for session test
>> 6066
>> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:43.919112586] (+0.000000990) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  11
>> Snapshot recorded successfully for session test
>> 6077
>> [07:09:43.335916076] (+?.?????????) 07f8cf84a38f
>> ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
>> sopath = "[vdso]", size = 0, mtime = -1 }
>> [07:09:44.004234933] (+0.000001213) 07f8cf84a38f tracetest:first_tp: {
>> cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
>> NUMBER OF DIFF HUNKS =  2
>> Waiting for data availability
>> Tracing stopped for session test
>> Session test destroyed
>> ---------------------------------------------------------------
>>
>> Regards
>>
>> Anders Wallin
>>
>> _______________________________________________
>> lttng-dev mailing list
>> lttng-dev@lists.lttng.org
>> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>
>>
>>
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
>>
>
>
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>

[-- Attachment #1.2: Type: text/html, Size: 17531 bytes --]

[-- Attachment #2: Type: text/plain, Size: 155 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 6+ messages in thread

* snapshot data corrupted after the output buffer is filled
@ 2014-11-14  7:28 Anders Wallin
  0 siblings, 0 replies; 6+ messages in thread
From: Anders Wallin @ 2014-11-14  7:28 UTC (permalink / raw)
  To: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 7692 bytes --]

Hi,

it looks like snapshot events became faulty after the output max size is
reached.
The next snapshot get mixed events from different buffers. I have tested it
on 2.3, 2.4 and 2.5
and it's in all of them, but it hits earlier in 2.5 due to the  problem
reported in
http://lists.lttng.org/pipermail/lttng-dev/2014-November/023773.html

There should never be more then 2 HUNKS in a diff between 2 consecutive
snapshots, here I got
11 as the worst case.

TESTCASE;

tracetest code:
---------------------------------------------------------------
#include <stdio.h>
#include "tracetest-tp.h"

int main(int argc, char* argv[])
{
        int i;
        for (i = 0; i < 1000; i++) {
                tracepoint(tracetest, first_tp, i, "test");
        }
        return 0;
}
---------------------------------------------------------------

test script:
---------------------------------------------------------------
#!/bin/bash
: ${NO_OF_SUBBUF:=6}
: ${SUBBUF_SIZE:=4096}
: ${KERNEL_TRACE:=NO}
: ${LOOPS:=10}

SESSION=test
CHANNEL1=ch1
CHANNEL2=ch2

CH2_SUBBUF_SIZE=$(($SUBBUF_SIZE*$NO_OF_SUBBUF))


LTTNG='lttng -n'
killall lttng-consumerd > /dev/null 2>&1
killall lttng-sessiond > /dev/null 2>&1
if [ -z $DEBUG ]; then
    lttng-sessiond --no-kernel -d
else
    lttng-sessiond -v --no-kernel --verbose-consumer -b
fi
sleep 1

if [ ! -z $DEBUG ]; then
    export LTTNG_UST_DEBUG=1
fi

rm -rf $HOME/lttng-traces
rm snapshot/*.bt

$LTTNG --version

$LTTNG create $SESSION --snapshot
$LTTNG enable-channel -u --subbuf-size $SUBBUF_SIZE --num-subbuf
$NO_OF_SUBBUF --overwrite $CHANNEL1
$LTTNG enable-channel -u --subbuf-size $CH2_SUBBUF_SIZE --num-subbuf 2
--overwrite $CHANNEL2
$LTTNG enable-event -u -c $CHANNEL1 "*"
$LTTNG start
$LTTNG list $SESSION

mkdir -p snapshot/

h=1
for i in $(seq 1 $LOOPS); do
    $APP
    $LTTNG snapshot record
    # print no of events and first and last event


    last=$(ls -1drt $HOME/lttng-traces/$SESSION*/* | tail -1)
    babeltrace $last > snapshot/$i.bt
    cat snapshot/$i.bt | wc -l
    cat snapshot/$i.bt | head -1
    cat snapshot/$i.bt | tail -1
    diff -u snapshot/$i.bt snapshot/$h.bt > snapshot/$i_$h.diff
    cat snapshot/$i_$h.diff | grep "@@" | wc -l | xargs echo "NUMBER OF
DIFF HUNKS = "
    h=$i
done

$LTTNG stop
$LTTNG destroy $SESSION

---------------------------------------------------------------

result from run (removed some output lines for readability)
---------------------------------------------------------------
lttng (LTTng Trace Control) 2.4.2 - Époque Opaque
Snapshot mode set. Every channel enabled for that session will be set in
overwrite mode and mmap output.
Warning: The number of subbuffers (6) is rounded to the next power of 2 (8)
UST channel ch1 enabled for session test
Warning: The subbuf size (24576) is rounded to the next power of 2 (32768)
UST channel ch2 enabled for session test
UST event * created in channel ch1
Tracing started for session test
Tracing session test: [active snapshot]
    Trace path:

=== Domain: UST global ===

Buffer type: per UID

Channels:
-------------
- ch1: [enabled]

    Attributes:
      overwrite mode: 1
      subbufers size: 4096
      number of subbufers: 8
      switch timer interval: 0
      read timer interval: 0
      output: mmap()

    Events:
      * (type: tracepoint) [enabled]
- ch2: [enabled]

    Attributes:
      overwrite mode: 1
      subbufers size: 32768
      number of subbufers: 2
      switch timer interval: 0
      read timer interval: 0
      output: mmap()

    Events:
      None

Snapshot recorded successfully for session test
1011
[07:09:43.278933559] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.279975100] (+0.000000975) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  0
Snapshot recorded successfully for session test
2022
[07:09:43.278933559] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.337703508] (+0.000001238) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  1
Snapshot recorded successfully for session test
3033
[07:09:43.278933559] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.432827051] (+0.000000308) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  1
Snapshot recorded successfully for session test
4044
[07:09:43.278933559] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 6 }, { baddr = 0x7FFF645EE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.475784825] (+0.000000501) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  1
Snapshot recorded successfully for session test
4735
[07:09:43.279031320] (+?.?????????) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 5 }, { my_string_field = "test", my_integer_field = 0 }
[07:09:43.549945106] (+0.000000517) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  12
Snapshot recorded successfully for session test
4437
[07:09:43.335916076] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.654229030] (+0.000000436) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 5 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  3
Snapshot recorded successfully for session test
4746
[07:09:43.335916076] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.747509349] (+0.000000502) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  11
Snapshot recorded successfully for session test
5757
[07:09:43.335916076] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.814938128] (+0.000001371) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 4 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  1
Snapshot recorded successfully for session test
6066
[07:09:43.335916076] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:43.919112586] (+0.000000990) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  11
Snapshot recorded successfully for session test
6077
[07:09:43.335916076] (+?.?????????) 07f8cf84a38f
ust_baddr_statedump:soinfo: { cpu_id = 2 }, { baddr = 0x7FFFAD1FE000,
sopath = "[vdso]", size = 0, mtime = -1 }
[07:09:44.004234933] (+0.000001213) 07f8cf84a38f tracetest:first_tp: {
cpu_id = 6 }, { my_string_field = "test", my_integer_field = 999 }
NUMBER OF DIFF HUNKS =  2
Waiting for data availability
Tracing stopped for session test
Session test destroyed
---------------------------------------------------------------

Regards

Anders Wallin

[-- Attachment #1.2: Type: text/html, Size: 12986 bytes --]

[-- Attachment #2: Type: text/plain, Size: 155 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-11-19 10:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAF2baFfMNzRQnx-FC7kFRr5m4NpZBEcEohnwf50uOs2vv9=GBw@mail.gmail.com>
2014-11-14 19:59 ` snapshot data corrupted after the output buffer is filled Julien Desfossez
2014-11-15 17:59 ` Mathieu Desnoyers
     [not found] ` <734976879.13957.1416074394667.JavaMail.zimbra@efficios.com>
2014-11-17  9:45   ` Anders Wallin
     [not found]   ` <CAF2baFcfh8h769xaxLQpW4=9nmWRBcmi7tP79mW=w1DyqJm9Hw@mail.gmail.com>
2014-11-17 11:45     ` Mathieu Desnoyers
     [not found]     ` <2137471544.14585.1416224714944.JavaMail.zimbra@efficios.com>
2014-11-19 10:31       ` Anders Wallin
2014-11-14  7:28 Anders Wallin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.