* [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
@ 2015-01-23  9:35 Wang Nan
  2015-01-23 16:30 ` Alexandre Montplaisir
                   ` (5 more replies)
  0 siblings, 6 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-23  9:35 UTC (permalink / raw)
  To: diamon-discuss, lttng-dev

Hi folks,

I'd like to share my first tuning experience with perf, CTF and
TraceCompass here, and I hope it is helpful to diamon.org. Most of this
mail describes my work; if you are not interested in the details, you can
jump directly to the conclusion.

*My Task*

What I'm working on is finding out why the CPU idle rate is high when
we benchmark a database. I thought it would be a very simple task: trace
scheduling and system calls, find the last syscall issued before each
idle period, then, based on those statistics, collect some user-space
call stacks, and I can give an answer. I use perf to collect the trace,
perf's convert-to-CTF support to get CTF output, and TraceCompass for
visualization.


*My Experience*

First of all, I use perf to collect a trace:

 # perf record -a -e sched:* -e raw_syscalls:* sleep 1

then

 # perf data convert --to-ctf out.ctf

This is simple. However, the raw_syscalls:* tracepoints export less
information than the syscalls:* tracepoints: without the latter I have to
manually map each syscall id to its name. I would prefer to use:

 # perf record -a -e sched:* -e syscalls:* sleep 1

However, there are some bugs and I had to make some patches. They are
posted and being discussed at the moment; the bugs still exist
upstream.
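
As an aside, when only raw_syscalls:* is usable, the id-to-name mapping
can be scripted. Below is a minimal sketch that builds the table by
parsing the kernel UAPI header; the header path (and the x86_64 layout it
assumes) will differ between architectures and distributions:

  import re

  # Build a syscall-number -> name table from the UAPI header.
  # NOTE: the path is an assumption (x86_64, typical distro layout);
  # adjust it for your architecture / distribution.
  UNISTD_H = "/usr/include/asm/unistd_64.h"

  def load_syscall_table(path=UNISTD_H):
      table = {}
      pattern = re.compile(r"#define\s+__NR_(\w+)\s+(\d+)")
      with open(path) as header:
          for line in header:
              match = pattern.match(line.strip())
              if match:
                  table[int(match.group(2))] = match.group(1)
      return table

  if __name__ == "__main__":
      syscalls = load_syscall_table()
      print(syscalls.get(0), syscalls.get(1))  # 'read', 'write' on x86_64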

Then I need to convert perf.data to CTF. It took 140.57s to convert
2598513 samples, which were collected during only 1 second of execution.
My working server has 64 2.0GHz Intel Xeon cores, but the perf conversion
utilizes only one of them. I think this is another thing that can be
improved.

The next step is visualization. The output CTF trace can be opened with
TraceCompass without problems. The most important views for me are the
Resources view (I use it to check CPU usage) and the Control Flow view (I
use it to check thread activity).

The first uncomfortable thing is TraceCompass's slow response time. For
the trace mentioned above, in the Resources view, after I click on a CPU
idle area I have to wait more than 10 seconds for the event list to
update before I can see the event preceding the idle area.

Then I found through the Resources view that perf itself takes a lot of
CPU time. In my case 33.5% of the samples are generated by perf itself.
One core is dedicated to perf and is never idle or taken by others. I
think this is another thing that needs to be improved: perf should
provide a way to blacklist itself when tracing all CPUs.
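
(A rough way to estimate this kind of per-task CPU share from the
converted trace itself is sketched below, using the legacy babeltrace
Python bindings. The trace path is a placeholder, and the cpu_id and
sched_switch field names are assumptions based on the standard tracepoint
format, so they may need adjusting for the converted trace:)

  from collections import defaultdict
  import babeltrace

  col = babeltrace.TraceCollection()
  col.add_trace("out.ctf", "ctf")        # placeholder path

  running = {}                 # cpu_id -> (comm, time of last switch-in)
  cpu_time = defaultdict(int)  # comm -> accumulated ns

  for event in col.events:
      if event.name != "sched:sched_switch":
          continue
      cpu, ts = event["cpu_id"], event.timestamp
      if cpu in running:
          comm, since = running[cpu]
          cpu_time[comm] += ts - since
      running[cpu] = (event["next_comm"], ts)

  total = sum(cpu_time.values()) or 1
  for comm, ns in sorted(cpu_time.items(), key=lambda kv: -kv[1])[:10]:
      print("%-16s %6.2f%%" % (comm, 100.0 * ns / total))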

TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
change points. I also have to capture raw_syscalls:*, which doubles
the number of samples.

Finally I found the syscall which causes the idle. However, I needed to
write a script to do the statistics: TraceCompass itself lacks a way to
count different events the way I want.
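
(For illustration, such a counting script could look roughly like the
sketch below, again with the legacy babeltrace Python bindings. The
trace path is a placeholder, the event and field names assume the
perf-converted sched:* and raw_syscalls:* tracepoints, and a switch to
pid 0 is taken as the start of an idle period:)

  from collections import Counter
  import babeltrace

  col = babeltrace.TraceCollection()
  col.add_trace("out.ctf", "ctf")        # placeholder path

  last_syscall = {}        # cpu_id -> id of the last sys_enter seen
  idle_causes = Counter()  # syscall id -> idle periods it preceded

  for event in col.events:
      cpu = event["cpu_id"]
      if event.name == "raw_syscalls:sys_enter":
          last_syscall[cpu] = event["id"]
      elif event.name == "sched:sched_switch" and event["next_pid"] == 0:
          # Switching to the swapper task: this CPU is going idle.
          if cpu in last_syscall:
              idle_causes[last_syscall[cpu]] += 1

  for syscall_id, count in idle_causes.most_common(10):
      print(syscall_id, count)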

The next thing I need to do is find the call trace which issues the
syscall. This time TraceCompass won't help, mostly because the perf
conversion doesn't yet support converting call traces.

*Conclusion*

I suggest that perf and TraceCompass consider the following improvements:

 1. Reduce the cost of perf recording. One third of the events are
    generated by perf itself in my case. Could perf provide a way to
    blacklist itself while still collecting all other events?

 2. Improve perf conversion performance. Converting perf.data to CTF is
    slow, but it is an offline step most of the time. We could utilize the
    capacity of multi-core servers by making it work in parallel.

 3. Improve TraceCompass response times, especially when
    synchronizing different views.

 4. Support converting user-space call traces. I think the perf side
    already has a plan for this.

 5. Ad-hoc visualization and statistics. Currently TraceCompass only
    supports drawing pre-defined events and processes. When I capture
    syscalls:*, I get no benefit from TraceCompass because it doesn't know
    those events. I believe that during system tuning we will eventually
    reach a point that cannot be pre-defined by the TraceCompass
    designers. Therefore, giving users the ability to define their own
    events and models would be very helpful.

Thank you.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
  2015-01-23  9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions Wang Nan
  2015-01-23 16:30 ` Alexandre Montplaisir
@ 2015-01-23 16:30 ` Alexandre Montplaisir
  2015-01-23 20:12 ` Alexandre Montplaisir
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-01-23 16:30 UTC (permalink / raw)
  To: Wang Nan; +Cc: diamon-discuss, lttng-dev, tracecompass developer discussions

Hi Wang,

First of all, thank you very much for posting this use case. This is 
exactly the type of user feedback that will help make the toolchain 
better and more useful for users!

Some comments and questions below,


On 01/23/2015 04:35 AM, Wang Nan wrote:
> [...]
>
> Then I need to convert perf.data to CTF. It took 140.57s to convert
> 2598513 samples, which were collected during only 1 second of execution.
> My working server has 64 2.0GHz Intel Xeon cores, but the perf conversion
> utilizes only one of them. I think this is another thing that can be improved.

Out of curiosity, approximately how big (in bytes) is the generated CTF 
trace directory?

>
> The next step is visualization. The output CTF trace can be opened with
> TraceCompass without problems. The most important views for me are the
> Resources view (I use it to check CPU usage) and the Control Flow view
> (I use it to check thread activity).
>
> The first uncomfortable thing is TraceCompass's slow response time. For
> the trace mentioned above, in the Resources view, after I click on a CPU
> idle area I have to wait more than 10 seconds for the event list to
> update before I can see the event preceding the idle area.

Interesting. It is expected that opening a very large trace will take a
long time the first time, as everything gets indexed. But once
that step is done, seeking within the trace should be relatively quick
(log n with respect to the trace size). In theory ;)

The perf-to-CTF conversion brings a completely new type of CTF trace
that has not been seen before. It is possible that the CTF parser in Trace
Compass has some inefficiencies that were not exposed by other trace
types. Are you able to share that trace publicly? Or a trace taken in
the same environment, with no sensitive information in it? It could be
very helpful in finding such problems.

> Then I found through the Resources view that perf itself takes a lot of
> CPU time. In my case 33.5% of the samples are generated by perf itself.
> One core is dedicated to perf and is never idle or taken by others. I
> think this is another thing that needs to be improved: perf should
> provide a way to blacklist itself when tracing all CPUs.

I don't want to start a tracer-war here :) but have you investigated 
using LTTng for recording syscall/sched events? Compared to perf, LTTng
is only about "getting trace events", and is a bit more involved to set 
up, but it is more focused on performance and minimizing the impact on 
the traced applications. And it outputs in CTF format too.

I remember when testing the perf-CTF patches and comparing a perf trace to
an LTTng one, perf would be doing system calls continuously on one of
the CPUs for the whole duration of the trace, whereas in LTTng traces
the session daemon would be a bit active at the beginning and at the
end, but otherwise completely invisible from the trace.

> TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
> change points. I also have to capture raw_syscalls:*, which doubles
> the number of samples.

This seems to be a gap in the definition of the analysis. I don't
remember implementing two types of "syscall" events in the perf 
analysis, so it should just be a matter of getting the exact event name 
and adding it to the list. I will take a look and keep you posted!
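
(If it helps, the exact event names present in a converted trace can be
listed with a few lines of the babeltrace Python bindings; a sketch, with
a placeholder trace path:)

  import babeltrace

  # Print the distinct event names found in the converted CTF trace.
  col = babeltrace.TraceCollection()
  col.add_trace("out.ctf", "ctf")        # placeholder path
  print(sorted({event.name for event in col.events}))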

> Finally I found the syscall which causes the idle. However, I needed to
> write a script to do the statistics: TraceCompass itself lacks a way to
> count different events the way I want.

Could you elaborate on this, please? I agree the "Statistics" view in TC
is severely lacking; we could be gathering and displaying much more
information. The only question is what information would actually be useful.

What exactly would you have liked to be able to see in the tool?

> [...]
>
>
>   5. Ad-hoc visualization and statistics. Currently TraceCompass only
>      supports drawing pre-defined events and processes. When I capture
>      syscalls:*, I get no benefit from TraceCompass because it doesn't
>      know those events. I believe that during system tuning we will
>      eventually reach a point that cannot be pre-defined by the
>      TraceCompass designers. Therefore, giving users the ability to
>      define their own events and models would be very helpful.

As I mentioned earlier, the pre-defined "perf analysis" in Trace Compass 
should be fixed to handle the syscall events.


But it's interesting that you mention wanting to add your own events and
models. I completely agree with you: we will never be able to predict
each and every use case users will want to use the tool for, so
there should be a way for users to add their own.

Well, good news: it *is* possible for the user to define their own
analysis and views! This is still undergoing a lot of development, and
there is no nice UI yet, which is why it is not really advertised. But
starting from any supported trace type, a user today can define a time
graph view (like the Resources view, for example) or an XY chart, using a
data-driven XML syntax.

If you are curious, you can take a look at a full example of doing such 
a thing on this page:
https://github.com/alexmonthy/ust-tc-example
(the example uses an LTTng UST trace as a source, but it could work with 
any supported trace type, even a custom text trace defined in the UI).

>
> Thank you.

Thanks again for taking the time to write about your experience!

Cheers,
Alexandre

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
  2015-01-23 16:30 ` Alexandre Montplaisir
  2015-01-23 16:56   ` [diamon-discuss] [lttng-dev] " Geneviève Bastien
@ 2015-01-23 16:56   ` Geneviève Bastien
  2015-01-31  7:14   ` Wang Nan
  2015-01-31  7:14   ` Wang Nan
  3 siblings, 0 replies; 35+ messages in thread
From: Geneviève Bastien @ 2015-01-23 16:56 UTC (permalink / raw)
  To: Wang Nan; +Cc: diamon-discuss, lttng-dev, tracecompass developer discussions

Hi Wang,

Thanks for sharing your experience. It's always useful to have some
real-life use cases of the tools.

Alex has already given a quite complete answer.

I'll just add some information about your wish for ad-hoc visualization
and statistics. As Alex said, data-driven analysis using XML files is
already present in Trace Compass. Documentation on how to use it is 
available here: 
https://wiki.eclipse.org/Linux_Tools_Project/LTTng2/User_Guide#Data_driven_analysis

You can build your own analysis with any event type. The current support
is rather basic: you need to write the XML yourself, starting from a
template, and it supports only XY charts and time graph views. Current
work by students at Polytechnique on the data-driven analysis involves
defining data-driven custom filters for events and views, developing a
visual UI to build an analysis from a state diagram, and supporting more
analysis use cases from the event data. I'm cc'ing Naser Ezzati, who is
currently working on the XML analysis. Among other things he has been
working on custom statistics; I don't know the status of this
development, but he may point you to his development branch, if it's
ready, so you can see if it fits your current need.

If you want some more details on the data-driven analysis work being 
done at Poly right now, you can look at the presentations by Naser 
Ezzati, Jean-Christian Kouamé and Simon Delisle on this page: 
https://ahls.dorsal.polymtl.ca/dec2014. If you're interested in trying 
out their [still experimental] work, let us know and we'll see if there 
is an experimental working branch you could try.

Cheers,
Geneviève



On 01/23/2015 11:30 AM, Alexandre Montplaisir wrote:
> Hi Wang,
>
> First of all, thank you very much for posting this use case. This is 
> exactly the type of user feedback that will help make the toolchain 
> better and more useful for users!
>
> Some comments and questions below,
>
>
> On 01/23/2015 04:35 AM, Wang Nan wrote:
>> [...]
>>
>> Then I need to convert perf.data to CTF. It took 140.57s to convert
>> 2598513 samples, which were collected during only 1 second of execution.
>> My working server has 64 2.0GHz Intel Xeon cores, but the perf conversion
>> utilizes only one of them. I think this is another thing that can be improved.
>
> Out of curiosity, approximately how big (in bytes) is the generated 
> CTF trace directory?
>
>>
>> The next step is visualization. The output CTF trace can be opened with
>> TraceCompass without problems. The most important views for me are the
>> Resources view (I use it to check CPU usage) and the Control Flow view
>> (I use it to check thread activity).
>>
>> The first uncomfortable thing is TraceCompass's slow response time. For
>> the trace mentioned above, in the Resources view, after I click on a CPU
>> idle area I have to wait more than 10 seconds for the event list to
>> update before I can see the event preceding the idle area.
>
> Interesting. It is expected that opening a very large trace will take
> a long time the first time, as everything gets indexed. But
> once that step is done, seeking within the trace should be relatively
> quick (log n with respect to the trace size). In theory ;)
>
> The perf-to-CTF conversion brings a completely new type of CTF trace
> that has not been seen before. It is possible that the CTF parser in Trace
> Compass has some inefficiencies that were not exposed by other trace
> types. Are you able to share that trace publicly? Or a trace taken in
> the same environment, with no sensitive information in it? It could be
> very helpful in finding such problems.
>
>> Then I found through the Resources view that perf itself takes a lot of
>> CPU time. In my case 33.5% of the samples are generated by perf itself.
>> One core is dedicated to perf and is never idle or taken by others. I
>> think this is another thing that needs to be improved: perf should
>> provide a way to blacklist itself when tracing all CPUs.
>
> I don't want to start a tracer-war here :) but have you investigated 
> using LTTng for recording syscall/sched events ? Compared to perf, 
> LTTng is only about "getting trace events", and is a bit more involved 
> to set up, but it is more focused on performance and minimizing the 
> impact on the traced applications. And it outputs in CTF format too.
>
> I remember when testing the perf-CTF patches and comparing a perf trace
> to an LTTng one, perf would be doing system calls continuously on one
> of the CPUs for the whole duration of the trace, whereas in LTTng
> traces the session daemon would be a bit active at the beginning and
> at the end, but otherwise completely invisible from the trace.
>
>> TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
>> change points. I also have to capture raw_syscalls:*, which doubles
>> the number of samples.
>
> This is a gap in the definition of the analysis it seems. I don't 
> remember implementing two types of "syscall" events in the perf 
> analysis, so it should just be a matter of getting the exact event 
> name and adding it to the list. I will take a look and keep you posted!
>
>> Finally I found the syscall which causes the idle. However, I needed to
>> write a script to do the statistics: TraceCompass itself lacks a way to
>> count different events the way I want.
>
> Could you elaborate on this please? I agree the "Statistics" view in 
> TC is severely lacking, we could be gathering and displaying much more 
> information. The only question is what information would actually be 
> useful.
>
> What exactly would you have liked to be able to see in the tool?
>
>> [...]
>>
>>
>>   5. Ad-hoc visualization and statistics. Currently TraceCompass only
>>      supports drawing pre-defined events and processes. When I capture
>>      syscalls:*, I get no benefit from TraceCompass because it doesn't
>>      know those events. I believe that during system tuning we will
>>      eventually reach a point that cannot be pre-defined by the
>>      TraceCompass designers. Therefore, giving users the ability to
>>      define their own events and models would be very helpful.
>
> As I mentioned earlier, the pre-defined "perf analysis" in Trace 
> Compass should be fixed to handle the syscall events.
>
>
> But it's interesting that you mention wanting to add your own events 
> and model. I completely agree with you, we will never be able to 
> predict every and all use cases the users will want to use the tool 
> for, so there should be a way for the user to add their own.
>
> Well, good news: it *is* possible for the user to define their own
> analysis and views! This is still undergoing a lot of development, and
> there is no nice UI yet, which is why it is not really advertised. But
> starting from any supported trace type, a user today can define a time
> graph view (like the Resources view, for example) or an XY chart, using a
> data-driven XML syntax.
>
> If you are curious, you can take a look at a full example of doing 
> such a thing on this page:
> https://github.com/alexmonthy/ust-tc-example
> (the example uses an LTTng UST trace as a source, but it could work 
> with any supported trace type, even a custom text trace defined in the 
> UI).
>
>>
>> Thank you.
>
> Thanks again for taking the time to write about your experience!
>
> Cheers,
> Alexandre
>
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
  2015-01-23  9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions Wang Nan
  2015-01-23 16:30 ` Alexandre Montplaisir
  2015-01-23 16:30 ` Alexandre Montplaisir
@ 2015-01-23 20:12 ` Alexandre Montplaisir
  2015-01-23 20:12 ` Alexandre Montplaisir
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-01-23 20:12 UTC (permalink / raw)
  To: Wang Nan; +Cc: diamon-discuss, lttng-dev, tracecompass developer discussions


On 01/23/2015 04:35 AM, Wang Nan wrote:
> [...] I would prefer to use:
>
>   # perf record -a -e sched:* -e syscalls:* sleep 1
>
> However, there are some bugs and I had to make some patches. They are
> posted and being discussed at the moment; the bugs still exist
> upstream.

Hmm, I can confirm this problem: when trying to convert a trace taken
with these events enabled, I get the following error:

Failed to add field 'nr
Failed to add event 'syscalls:sys_enter_io_submit'.

Good to know it's being looked at!


But when you say:

> TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
> change points. I also have to capture raw_syscalls:*, which doubles
> the number of samples.

I assume this was with your patches applied, so that you could take
a trace of the syscalls:* tracepoints?

Do you have a link to those patches somewhere? Or just let us know when
they get integrated upstream. I was trying to update the analysis in
Trace Compass to handle syscalls:* too, but we can't really do it without
knowing what it will look like in CTF format ;)


Cheers,
Alexandre

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
  2015-01-23 20:12 ` Alexandre Montplaisir
  2015-01-26  4:08   ` Wang Nan
@ 2015-01-26  4:08   ` Wang Nan
  1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-26  4:08 UTC (permalink / raw)
  To: Alexandre Montplaisir
  Cc: diamon-discuss, lttng-dev, tracecompass developer discussions

On 2015/1/24 4:12, Alexandre Montplaisir wrote:
> 
> On 01/23/2015 04:35 AM, Wang Nan wrote:
>> [...] I would prefer to use:
>>
>>   # perf record -a -e sched:* -e syscalls:* sleep 1
>>
>> However, there are some bugs and I had to make some patches. They are
>> posted and being discussed at the moment; the bugs still exist
>> upstream.
> 
> Hmm, I can confirm this problem: when trying to convert a trace taken with these events enabled, I get the following error:
> 
> Failed to add field 'nr
> Failed to add event 'syscalls:sys_enter_io_submit'.
> 
> Good to know it's being looked at!
> 
> 
> But when you say:
> 
>> TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
>> change points. I also have to capture raw_syscalls:*, which doubles
>> the number of samples.
> 
> I assume this was with your patches applied, so that you could take a trace of the syscalls:* tracepoints?
> 
> Do you have a link to those patches somewhere? Or just let us know when they get integrated upstream. I was trying to update the analysis in Trace Compass to handle syscalls:* too, but we can't really do it without knowing what it will look like in CTF format ;)
> 
> 

Hi,

You can follow this thread in LKML:

https://lkml.org/lkml/2015/1/20/330

In https://lkml.org/lkml/2015/1/22/24 I posted 2 RFC patches which can solve
the problem. However, I'm still waiting for a response from Steven Rostedt on
whether he will allow us to improve traceevent.


> Cheers,
> Alexandre
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
  2015-01-23  9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions Wang Nan
                   ` (4 preceding siblings ...)
  2015-01-26 16:25 ` Jérémie Galarneau
@ 2015-01-26 16:25 ` Jérémie Galarneau
  5 siblings, 0 replies; 35+ messages in thread
From: Jérémie Galarneau @ 2015-01-26 16:25 UTC (permalink / raw)
  To: Wang Nan; +Cc: diamon-discuss, lttng-dev

On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
> Hi folks,
>
> I'd like to share my first tuning experience with perf, CTF and
> TraceCompass here, and I hope it is helpful to diamon.org. Most of this
> mail describes my work; if you are not interested in the details, you
> can jump directly to the conclusion.
>
> *My Task*
>
> What I'm working on is finding out why the CPU idle rate is high when
> we benchmark a database. I thought it would be a very simple task: trace
> scheduling and system calls, find the last syscall issued before each
> idle period, then, based on those statistics, collect some user-space
> call stacks, and I can give an answer. I use perf to collect the trace,
> perf's convert-to-CTF support to get CTF output, and TraceCompass for
> visualization.
>
>
> *My Experience*
>
> First of all I use perf to collect trace:
>
>  # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>
> then
>
>  # perf data convert --to-ctf out.ctf
>
> This is simple. However, the raw_syscalls:* tracepoints export less
> information than the syscalls:* tracepoints: without the latter I have
> to manually map each syscall id to its name. I would prefer to use:
>
>  # perf record -a -e sched:* -e syscalls:* sleep 1
>
> However, there are some bugs and I had to make some patches. They are
> posted and being discussed at the moment; the bugs still exist
> upstream.
>
> Then I need to convert perf.data to CTF. It took 140.57s to convert
> 2598513 samples, which were collected during only 1 second of execution.
> My working server has 64 2.0GHz Intel Xeon cores, but the perf conversion
> utilizes only one of them. I think this is another thing that can be improved.

Thanks for taking the time to do this write-up!

Would it be possible to profile perf-to-ctf so we can spot the bottleneck?

Regards,
Jérémie

>
> The next step is visualization. The output CTF trace can be opened with
> TraceCompass without problems. The most important views for me are the
> Resources view (I use it to check CPU usage) and the Control Flow view
> (I use it to check thread activity).
>
> The first uncomfortable thing is TraceCompass's slow response time. For
> the trace mentioned above, in the Resources view, after I click on a CPU
> idle area I have to wait more than 10 seconds for the event list to
> update before I can see the event preceding the idle area.
>
> Then I found through the Resources view that perf itself takes a lot of
> CPU time. In my case 33.5% of the samples are generated by perf itself.
> One core is dedicated to perf and is never idle or taken by others. I
> think this is another thing that needs to be improved: perf should
> provide a way to blacklist itself when tracing all CPUs.
>
> TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
> change points. I also have to capture raw_syscalls:*, which doubles
> the number of samples.
>
> Finally I found the syscall which causes the idle. However, I needed to
> write a script to do the statistics: TraceCompass itself lacks a way to
> count different events the way I want.
>
> The next thing I need to do is find the call trace which issues the
> syscall. This time TraceCompass won't help, mostly because the perf
> conversion doesn't yet support converting call traces.
>
> *Conclusion*
>
> I suggest that perf and TraceCompass consider the following improvements:
>
>  1. Reduce the cost of perf recording. One third of the events are
>     generated by perf itself in my case. Could perf provide a way to
>     blacklist itself while still collecting all other events?
>
>  2. Improve perf conversion performance. Converting perf.data to CTF is
>     slow, but it is an offline step most of the time. We could utilize the
>     capacity of multi-core servers by making it work in parallel.
>
>  3. Improve TraceCompass response times, especially when
>     synchronizing different views.
>
>  4. Support converting user-space call traces. I think the perf side
>     already has a plan for this.
>
>  5. Ad-hoc visualization and statistics. Currently TraceCompass only
>     supports drawing pre-defined events and processes. When I capture
>     syscalls:*, I get no benefit from TraceCompass because it doesn't
>     know those events. I believe that during system tuning we will
>     eventually reach a point that cannot be pre-defined by the
>     TraceCompass designers. Therefore, giving users the ability to
>     define their own events and models would be very helpful.
>
> Thank you.
>



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
  2015-01-26 16:25 ` Jérémie Galarneau
@ 2015-01-27  0:54   ` Wang Nan
  2015-01-27  0:54   ` Wang Nan
  1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-27  0:54 UTC (permalink / raw)
  To: Jérémie Galarneau; +Cc: Xinwei Hu, diamon-discuss, lttng-dev

On 2015/1/27 0:25, Jérémie Galarneau wrote:
> On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
>> Hi folks,
>>
>> I'd like to share my first tuning experience with perf, CTF and
>> TraceCompass here, and I hope it is helpful to diamon.org. Most of this
>> mail describes my work; if you are not interested in the details, you
>> can jump directly to the conclusion.
>>
>> *My Task*
>>
>> What I'm working on is finding out why the CPU idle rate is high when
>> we benchmark a database. I thought it would be a very simple task: trace
>> scheduling and system calls, find the last syscall issued before each
>> idle period, then, based on those statistics, collect some user-space
>> call stacks, and I can give an answer. I use perf to collect the trace,
>> perf's convert-to-CTF support to get CTF output, and TraceCompass for
>> visualization.
>>
>>
>> *My Experience*
>>
>> First of all I use perf to collect trace:
>>
>>  # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>>
>> then
>>
>>  # perf data convert --to-ctf out.ctf
>>
>> This is simple. However, the raw_syscalls:* tracepoints export less
>> information than the syscalls:* tracepoints: without the latter I have
>> to manually map each syscall id to its name. I would prefer to use:
>>
>>  # perf record -a -e sched:* -e syscalls:* sleep 1
>>
>> However, there are some bugs and I had to make some patches. They are
>> posted and being discussed at the moment; the bugs still exist
>> upstream.
>>
>> Then I need to convert perf.data to CTF. It took 140.57s to convert
>> 2598513 samples, which were collected during only 1 second of execution.
>> My working server has 64 2.0GHz Intel Xeon cores, but the perf conversion
>> utilizes only one of them. I think this is another thing that can be improved.
> 
> Thanks for taking the time to do this write-up!
> 
> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
> 
> Regards,
> Jérémie
> 

We should, but I don't have enough time to do this right now. In addition, I
think the perf-to-CTF conversion can be parallelized without too much work. I
suggest considering code tuning after we have done that on the perf side. What
do you think?
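
(To illustrate the kind of per-CPU fan-out this could take, as a sketch
only: since a CTF trace is naturally split into one stream per CPU, each
worker could in principle emit one CPU's stream independently. The
convert_cpu_samples() helper below is purely hypothetical, not perf's
actual code.)

  from multiprocessing import Pool

  def convert_cpu_samples(cpu):
      # Hypothetical helper: read the samples recorded on one CPU from
      # perf.data and write them out as the CTF stream for that CPU.
      pass

  if __name__ == "__main__":
      with Pool() as pool:
          pool.map(convert_cpu_samples, range(64))
      # The shared CTF metadata would be written once, after all
      # per-CPU streams are complete.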

>>
>> The next step is visualization. The output CTF trace can be opened with
>> TraceCompass without problems. The most important views for me are the
>> Resources view (I use it to check CPU usage) and the Control Flow view
>> (I use it to check thread activity).
>>
>> The first uncomfortable thing is TraceCompass's slow response time. For
>> the trace mentioned above, in the Resources view, after I click on a CPU
>> idle area I have to wait more than 10 seconds for the event list to
>> update before I can see the event preceding the idle area.
>>
>> Then I found through the Resources view that perf itself takes a lot of
>> CPU time. In my case 33.5% of the samples are generated by perf itself.
>> One core is dedicated to perf and is never idle or taken by others. I
>> think this is another thing that needs to be improved: perf should
>> provide a way to blacklist itself when tracing all CPUs.
>>
>> TraceCompass doesn't recognize syscalls:* tracepoints as CPU-state
>> change points. I also have to capture raw_syscalls:*, which doubles
>> the number of samples.
>>
>> Finally I found the syscall which causes the idle. However, I needed to
>> write a script to do the statistics: TraceCompass itself lacks a way to
>> count different events the way I want.
>>
>> The next thing I need to do is find the call trace which issues the
>> syscall. This time TraceCompass won't help, mostly because the perf
>> conversion doesn't yet support converting call traces.
>>
>> *Conclusion*
>>
>> I suggest that perf and TraceCompass consider the following improvements:
>>
>>  1. Reduce the cost of perf recording. One third of the events are
>>     generated by perf itself in my case. Could perf provide a way to
>>     blacklist itself while still collecting all other events?
>>
>>  2. Improve perf conversion performance. Converting perf.data to CTF is
>>     slow, but it is an offline step most of the time. We could utilize the
>>     capacity of multi-core servers by making it work in parallel.
>>
>>  3. Improve TraceCompass response times, especially when
>>     synchronizing different views.
>>
>>  4. Support converting user-space call traces. I think the perf side
>>     already has a plan for this.
>>
>>  5. Ad-hoc visualization and statistics. Currently TraceCompass only
>>     supports drawing pre-defined events and processes. When I capture
>>     syscalls:*, I get no benefit from TraceCompass because it doesn't
>>     know those events. I believe that during system tuning we will
>>     eventually reach a point that cannot be pre-defined by the
>>     TraceCompass designers. Therefore, giving users the ability to
>>     define their own events and models would be very helpful.
>>
>> Thank you.
>>
> 
> 
> 




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-01-27  0:54   ` Wang Nan
@ 2015-01-27  5:31     ` Jérémie Galarneau
  2015-01-27  5:31     ` Jérémie Galarneau
  1 sibling, 0 replies; 35+ messages in thread
From: Jérémie Galarneau @ 2015-01-27  5:31 UTC (permalink / raw)
  To: Wang Nan; +Cc: Xinwei Hu, diamon-discuss, lttng-dev

On Mon, Jan 26, 2015 at 7:54 PM, Wang Nan <wangnan0@huawei.com> wrote:
> On 2015/1/27 0:25, Jérémie Galarneau wrote:
>> On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
>>> Hi folks,
>>>
>>> I'd like to share my first tuning experience with perf, ctf and
>>> TraceCompass here, and I hope my experience helpful to diamon.org. Most
>>> part of this mail is talking about my work. If you don't
>>> interest in it, you can directly jump to conclusion part.
>>>
>>> *My Task*
>>>
>>> What I'm working on is finding the reason why CPU idle rate is high when
>>> we benchmarking a database. I think it should be a very simple task:
>>> tracing scheduling and system calls, finding the previous syscall issued
>>> before idle, then based on statistics, collecting some user spaces call
>>> stack, I can give an answer. I use perf to collect trace,
>>> perf-convert-to-ctf to get ctf output and TraceCompass for
>>> visualization.
>>>
>>>
>>> *My Experience*
>>>
>>> First of all I use perf to collect trace:
>>>
>>>  # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>>>
>>> then
>>>
>>>  # perf data convert --to-ctf out.ctf
>>>
>>> Which is simple. However, raw_syscalls:* tracepoints export less
>>> information than syscalls:* tracepoints. Without them I have to manually
>>> find syscall name from syscall id. I prefer to use:
>>>
>>>  # perf record -a -e sched:* -e syscalls:* sleep 1
>>>
>>> However there are some bugs and I have to make some patches. They are
>>> posted and being disscussed currently, those bugs are still exist
>>> upstream.
>>>
>>> Then I need to convert perf.data to ctf. It tooks 140.57s to convert
>>> 2598513 samples, which are collected during only 1 second execution. My
>>> working server has 64 2.0GHz Intel Xeon cores, but perf conversion
>>> utilizes only 1 of them. I think this is another thing can be improved.
>>
>> Thanks for taking the time to do this write-up!
>>
>> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
>>
>> Regards,
>> Jérémie
>>
>
> We should, but I don't have enough time to do this. In addition, I think perf-to-ctf
> conversion is possible to be parallelized without too much work. I suggest to consider
> code tuning after we done that at perf side. What do you think?

I'm not sure I agree with the "without too much work" statement, since
Babeltrace's CTF Writer API is not thread-safe at the moment. As
Alexandre previously mentioned, the size of the resulting CTF trace
could be a good indication of whether or not IO is the culprit here.

We have never done any optimization work on Babeltrace and CTF Writer.
We could be smarter on a number of fronts, notably with regard to
memory allocation; there is currently no pooling of allocated objects
(events, fields, types, etc.). We'll have to take a look at hard data
before getting our hands dirty. However, I'm sure there are tons of
optimization opportunities.

Jérémie

>
>>>
>>> The next step is visualization. Output ctf trace can be opened with
>>> TraceCompass without problem. The most important views for me should be
>>> resources view (I use them to check CPU usage) and control flow view (I
>>> use them to check thread activities).
>>>
>>> The first uncomfortable thing is TraceCompass' slow response time. For
>>> the trace I mentioned above, on resource view, after I click on CPU
>>> idle area, I have to wait more than 10 seconds for event list updating
>>> to get the previous event before the idle area.
>>>
>>> Then I found through resources view that perf itself tooks lots of CPU
>>> time. In my case 33.5% samples are generated by perf itself. One core is
>>> dedicated to perf and never idle or taken by others. I think this should
>>> be another thing needs to be improved: perf should give a way to
>>> blacklist itself when tracing all CPUs.
>>>
>>> TraceCompass doesn't recognize syscall:* tracepoints as CPU status
>>> changing point. I have to also catch raw_syscall:*, and which doubles
>>> the number of samples.
>>>
>>> Finally I found the syscall which cause idle. However I need to write a
>>> script to do statistics. TraceCompass itself is lack a mean to count
>>> different events in my way.
>>>
>>> The next thing I should do is to find the calltrace which issue the
>>> syscall. This time TraceCompass won't help, mostly because perf
>>> convertion now doesn't support converting calltrace.
>>>
>>> *Conclusion*
>>>
>>> I suggest perf and TraceCompass to think about following improvements:
>>>
>>>  1. Reducing the cost of perf recording. There are one third events are
>>>     generated by perf itself in my case. Is it possible that perf could
>>>     provide an ability that blacklist itself and collect all other
>>>     events?
>>>
>>>  2. Improving perf converting performance. Converting perf.data to CTF is
>>>     slow, but it should be offline most of the time. We can utilize the
>>>     abilities multi-core server to make it working in parallel.
>>>
>>>  3. Improving TraceCompass responding performance, especially when
>>>     synchronizing different views.
>>>
>>>  4. Support converting userspace call trace. I think perf side should already
>>>     have a plan on it.
>>>
>>>  5. Ad-Hoc visualization and statistics. Currently TraceCompass only
>>>     support dwaring pre-defined events and processes. When I try to
>>>     capture syscalls:*, I won't get benefit from TraceCompass because it
>>>     doesn't know them. I believe that during system tuning we will
>>>     finally get somewhere unable to be pre-defined by TraceCompass
>>>     designer. Therefore give users abilities to define their own events
>>>     and model should be much helpful.
>>>
>>> Thank you.
>>>
>>> _______________________________________________
>>> diamon-discuss mailing list
>>> diamon-discuss@lists.linuxfoundation.org
>>> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
>>
>>
>>
>
>



-- 
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-01-27  5:31     ` Jérémie Galarneau
  2015-01-27  6:31       ` Wang Nan
@ 2015-01-27  6:31       ` Wang Nan
  1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-27  6:31 UTC (permalink / raw)
  To: Jérémie Galarneau; +Cc: Xinwei Hu, diamon-discuss, lttng-dev

On 2015/1/27 13:31, Jérémie Galarneau wrote:
> On Mon, Jan 26, 2015 at 7:54 PM, Wang Nan <wangnan0@huawei.com> wrote:
>> On 2015/1/27 0:25, Jérémie Galarneau wrote:
>>> On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
>>>> Hi folks,
>>>>
>>>> I'd like to share my first tuning experience with perf, ctf and
>>>> TraceCompass here, and I hope my experience helpful to diamon.org. Most
>>>> part of this mail is talking about my work. If you don't
>>>> interest in it, you can directly jump to conclusion part.
>>>>
>>>> *My Task*
>>>>
>>>> What I'm working on is finding the reason why CPU idle rate is high when
>>>> we benchmarking a database. I think it should be a very simple task:
>>>> tracing scheduling and system calls, finding the previous syscall issued
>>>> before idle, then based on statistics, collecting some user spaces call
>>>> stack, I can give an answer. I use perf to collect trace,
>>>> perf-convert-to-ctf to get ctf output and TraceCompass for
>>>> visualization.
>>>>
>>>>
>>>> *My Experience*
>>>>
>>>> First of all I use perf to collect trace:
>>>>
>>>>  # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>>>>
>>>> then
>>>>
>>>>  # perf data convert --to-ctf out.ctf
>>>>
>>>> Which is simple. However, raw_syscalls:* tracepoints export less
>>>> information than syscalls:* tracepoints. Without them I have to manually
>>>> find syscall name from syscall id. I prefer to use:
>>>>
>>>>  # perf record -a -e sched:* -e syscalls:* sleep 1
>>>>
>>>> However there are some bugs and I have to make some patches. They are
>>>> posted and being disscussed currently, those bugs are still exist
>>>> upstream.
>>>>
>>>> Then I need to convert perf.data to ctf. It tooks 140.57s to convert
>>>> 2598513 samples, which are collected during only 1 second execution. My
>>>> working server has 64 2.0GHz Intel Xeon cores, but perf conversion
>>>> utilizes only 1 of them. I think this is another thing can be improved.
>>>
>>> Thanks for taking the time to do this write-up!
>>>
>>> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
>>>
>>> Regards,
>>> Jérémie
>>>
>>
>> We should, but I don't have enough time to do this. In addition, I think perf-to-ctf
>> conversion is possible to be parallelized without too much work. I suggest to consider
>> code tuning after we done that at perf side. What do you think?
> 
> I'm not sure I agree with the "without too much work" statement since
> Babeltrace's CTF Writer API is not thread-safe at this moment. As
> Alexandre previously mentionned, the size of the resulting CTF trace
> could be a good indication of whether or not IO is the culprit here.
> 
> We have never done any optimization work on Babeltrace and CTF Writer.
> We could be smarter on a number of front, notably with regards to
> memory allocation; there is currently no pooling of allocated objects
> (events, fields, types, etc.). We'll have to take a look at hard data
> before getting our hands dirty. However, I'm sure there are tons of
> optimization opportunities.
> 
> Jérémie
> 

Agreed. I'll try to tune it when I have time, and then we will have
another tuning experience to share.

Thank you.

>>
>>>>
>>>> The next step is visualization. Output ctf trace can be opened with
>>>> TraceCompass without problem. The most important views for me should be
>>>> resources view (I use them to check CPU usage) and control flow view (I
>>>> use them to check thread activities).
>>>>
>>>> The first uncomfortable thing is TraceCompass' slow response time. For
>>>> the trace I mentioned above, on resource view, after I click on CPU
>>>> idle area, I have to wait more than 10 seconds for event list updating
>>>> to get the previous event before the idle area.
>>>>
>>>> Then I found through resources view that perf itself tooks lots of CPU
>>>> time. In my case 33.5% samples are generated by perf itself. One core is
>>>> dedicated to perf and never idle or taken by others. I think this should
>>>> be another thing needs to be improved: perf should give a way to
>>>> blacklist itself when tracing all CPUs.
>>>>
>>>> TraceCompass doesn't recognize syscall:* tracepoints as CPU status
>>>> changing point. I have to also catch raw_syscall:*, and which doubles
>>>> the number of samples.
>>>>
>>>> Finally I found the syscall which cause idle. However I need to write a
>>>> script to do statistics. TraceCompass itself is lack a mean to count
>>>> different events in my way.
>>>>
>>>> The next thing I should do is to find the calltrace which issue the
>>>> syscall. This time TraceCompass won't help, mostly because perf
>>>> convertion now doesn't support converting calltrace.
>>>>
>>>> *Conclusion*
>>>>
>>>> I suggest perf and TraceCompass to think about following improvements:
>>>>
>>>>  1. Reducing the cost of perf recording. There are one third events are
>>>>     generated by perf itself in my case. Is it possible that perf could
>>>>     provide an ability that blacklist itself and collect all other
>>>>     events?
>>>>
>>>>  2. Improving perf converting performance. Converting perf.data to CTF is
>>>>     slow, but it should be offline most of the time. We can utilize the
>>>>     abilities multi-core server to make it working in parallel.
>>>>
>>>>  3. Improving TraceCompass responding performance, especially when
>>>>     synchronizing different views.
>>>>
>>>>  4. Support converting userspace call trace. I think perf side should already
>>>>     have a plan on it.
>>>>
>>>>  5. Ad-Hoc visualization and statistics. Currently TraceCompass only
>>>>     support dwaring pre-defined events and processes. When I try to
>>>>     capture syscalls:*, I won't get benefit from TraceCompass because it
>>>>     doesn't know them. I believe that during system tuning we will
>>>>     finally get somewhere unable to be pre-defined by TraceCompass
>>>>     designer. Therefore give users abilities to define their own events
>>>>     and model should be much helpful.
>>>>
>>>> Thank you.
>>>>
>>>> _______________________________________________
>>>> diamon-discuss mailing list
>>>> diamon-discuss@lists.linuxfoundation.org
>>>> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
>>>
>>>
>>>
>>
>>
> 
> 
> 



_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-01-23 16:30 ` Alexandre Montplaisir
                     ` (2 preceding siblings ...)
  2015-01-31  7:14   ` Wang Nan
@ 2015-01-31  7:14   ` Wang Nan
  3 siblings, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-31  7:14 UTC (permalink / raw)
  To: Alexandre Montplaisir
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions


> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
> 
>> Finally I found the syscall which cause idle. However I need to write a
>> script to do statistics. TraceCompass itself is lack a mean to count
>> different events in my way.
> 
> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.

I'd like to describe some cases of ad-hoc statistics for which I had to write Python
scripts.

*First case: matching sys_enter and sys_exit*

The first case is finding the reason why most CPUs are idle. In the TraceCompass
resource view, I found some gray gaps of about 300us. During these gaps there is only
1 running CPU; all other CPUs are idle. I can find the reason why a particular CPU
is idle using TraceCompass with the following steps:

  1. In the TraceCompass resource view, click the idle gap of that CPU, find the next event
     with the 'Select Next Event' button, and keep selecting the next event until I find
     a 'raw_syscalls:sys_exit' event; then by checking its 'id' field I can see which syscall
     caused the CPU to idle. (As I mentioned before, in my case each time I click that button
     I have to wait 8 to 10 seconds for the trace table to update before I know
     which event it is. This is painful for me...)

  2. Then I need to find the corresponding "raw_syscalls:sys_enter" event to see when the syscall
     was issued. I switch to the control flow view and use 'Select Previous Event' to find it; then,
     back in the resource view, I can see how long this syscall takes, whether the CPU
     does some work or simply idles after the syscall is issued, and whether the task is scheduled
     across CPUs.

  3. For each CPU, repeat steps 1 and 2.

In some high-end servers the number of cores may exceed 100; even in my case the number of traced
CPUs is 32. Doing such a search by hand is time consuming, so I had to write a Python script to do it
(a rough sketch of the idea is shown below). My result is: half of the cores are waiting on different
futexes, and the other half are waiting in pselect() (caused by sleep()).
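
For illustration, here is a rough sketch of the matching step (not my actual script). It assumes
the events have already been parsed into (timestamp, cpu, tid, event name, fields) tuples sorted by
time, and that the end timestamp of each CPU's idle gap is already known; it reports, per CPU, the
id of the syscall whose sys_exit ends the gap, together with its sys_enter time:

    # Sketch: for each CPU, find the first raw_syscalls:sys_exit after its idle
    # gap and the matching sys_enter from the same thread.  Parsing the events
    # (e.g. from "perf script" output) is left out; the field names are assumed.
    from collections import defaultdict

    def syscall_ending_idle(events, idle_end_per_cpu):
        """events: iterable of (ts, cpu, tid, name, fields) sorted by ts.
           idle_end_per_cpu: {cpu: timestamp where that CPU's idle gap ends}.
           Returns {cpu: (syscall_id, sys_enter_ts_or_None, sys_exit_ts)}."""
        last_enter = defaultdict(dict)   # tid -> {syscall id: latest sys_enter ts}
        found = {}
        for ts, cpu, tid, name, fields in events:
            if name == "raw_syscalls:sys_enter":
                last_enter[tid][fields["id"]] = ts
            elif name == "raw_syscalls:sys_exit":
                end = idle_end_per_cpu.get(cpu)
                if end is not None and ts >= end and cpu not in found:
                    sid = fields["id"]
                    found[cpu] = (sid, last_enter[tid].get(sid), ts)
        return found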

*Second case: matching futex WAKE and WAIT*

Therefore the next case I'd like to share is matching futex wait, futex wake and futex woken
events. This time TraceCompass can't help much. However, the Python script is also not easy
to design. I have to track CPU and process state transitions myself, match all futex sys_enter
and sys_exit events, consider different cases including FUTEX_WAKE before FUTEX_WAIT, failures and
timeouts, and compare the return value of each futex wake with the number of threads woken by it. This
disposable Python script has 115 lines of code (I have to admit that I'm not a very good Python
programmer); I spent several hours writing and debugging it. A much simplified sketch of the matching
logic is shown after the result below.

My final result is: threads wake each other up in a tree-like manner. The first futex WAKE command is
issued by the only running CPU and wakes up only one thread. That thread wakes up others, those wake up
more, and finally nearly all CPUs are woken up. Some threads only get to run a relatively long time
after the corresponding futex WAKE command, even though there are idle CPUs at that time. Therefore we
should look into the scheduler. However, the gap itself seems to be a normal phenomenon.
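
To give an idea of what the script has to do, here is a much simplified sketch of the matching
logic (not the 115-line script itself). It assumes the futex calls were already decoded so that
each futex sys_enter carries its 'op' and 'uaddr' arguments and each sys_exit its return value,
and it deliberately ignores errors, timeouts and WAKE-before-WAIT, which the real script must handle:

    # Simplified futex WAIT/WAKE matching: queue waiters per futex address and,
    # when a FUTEX_WAKE returns, pop as many waiters as its return value says it
    # woke.  Error paths, timeouts and WAKE-before-WAIT are deliberately ignored.
    FUTEX_WAIT, FUTEX_WAKE = 0, 1

    def match_futex(events):
        """events: (ts, tid, name, fields) sorted by ts; futex enters carry
           'op' and 'uaddr', exits carry 'ret' (all assumed pre-decoded)."""
        waiters = {}       # uaddr -> [(waiting tid, sys_enter ts), ...]
        in_flight = {}     # tid -> (op, uaddr) of its current futex call
        wakeups = []       # (waker tid, wake ts, uaddr, [woken tids])
        for ts, tid, name, fields in events:
            if name == "raw_syscalls:sys_enter" and fields.get("syscall") == "futex":
                op = fields["op"] & 0x7f            # strip FUTEX_PRIVATE_FLAG
                in_flight[tid] = (op, fields["uaddr"])
                if op == FUTEX_WAIT:
                    waiters.setdefault(fields["uaddr"], []).append((tid, ts))
            elif name == "raw_syscalls:sys_exit" and tid in in_flight:
                op, uaddr = in_flight.pop(tid)
                if op == FUTEX_WAKE and fields["ret"] > 0:
                    queue = waiters.get(uaddr, [])
                    woken = [w for w, _ in queue[:fields["ret"]]]
                    del queue[:fields["ret"]]
                    wakeups.append((tid, ts, uaddr, woken))
        return wakeups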

*Third case: device utilization*

We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
I deployed several tracepoints in the device driver to track device activity. However, I had to create
some tooling to draw cumulative and throughput curves to see whether there are irregular idle periods.
I use gnuplot for plotting, but I had to write yet another Python script to extract the data (a small
sketch is shown below). I think a language like R would be useful here, but I'm not familiar with it.
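
As an example of the kind of glue code this needs, here is a small sketch that turns
(timestamp, bytes) samples from the driver tracepoints into cumulative and throughput columns
that gnuplot can read directly (the input format and column layout are made up for illustration):

    # Sketch: write "time  cumulative_bytes  throughput" columns for gnuplot from
    # (timestamp_in_seconds, bytes_written) samples taken at the driver tracepoints.
    def dump_curves(samples, out_path, min_interval=0.001):
        if not samples:
            return
        samples = sorted(samples)
        total = 0.0
        prev_t, prev_total = samples[0][0], 0.0
        with open(out_path, "w") as f:
            f.write("# time_s  cumulative_bytes  throughput_Bps\n")
            for t, nbytes in samples:
                total += nbytes
                if t - prev_t >= min_interval:
                    throughput = (total - prev_total) / (t - prev_t)
                    f.write("%f %f %f\n" % (t, total, throughput))
                    prev_t, prev_total = t, total

    # e.g. dump_curves(parsed_samples, "curves.dat"), then in gnuplot:
    #   plot "curves.dat" using 1:2 with lines, "" using 1:3 axes x1y2 with lines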

*Conclusion*

In this email I list 3 use cases related to the ad-hoc statistics I mentioned earlier. Cases 1 and 2 are
in fact not statistics problems; they should be considered query problems. I suspect cases
1 and 2 can be expressed using SQL. If TraceCompass could provide a query language like SQL, we could quickly
find the information we need and so have more time for the actual tuning. I expressed my SQL-like query
idea in one of my earlier emails:

http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html

However, at the time I was not very sure which query problems we would actually meet. A toy example
of what such a query could look like is given below.
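
As a toy illustration of what I have in mind (sqlite3 is used only because it is convenient, and the
event table layout is invented for this example), case 1 could then be asked as a single query, "the
first sys_exit on each CPU after that CPU's idle gap":

    # Toy illustration of the query idea: load events into a table, then case 1
    # ("first sys_exit after each CPU's idle gap") becomes one SQL statement.
    # The schema is invented; a real tool would map trace event fields to columns.
    import sqlite3

    def first_exit_after_idle(events, idle_gaps):
        """events: (ts, cpu, tid, name, syscall_id); idle_gaps: (cpu, gap_end_ts)."""
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE ev (ts REAL, cpu INT, tid INT, name TEXT, sc_id INT)")
        db.execute("CREATE TABLE idle (cpu INT, end_ts REAL)")
        db.executemany("INSERT INTO ev VALUES (?,?,?,?,?)", events)
        db.executemany("INSERT INTO idle VALUES (?,?)", idle_gaps)
        return db.execute("""
            SELECT idle.cpu, ev.sc_id, MIN(ev.ts)
            FROM idle JOIN ev ON ev.cpu = idle.cpu
            WHERE ev.name = 'raw_syscalls:sys_exit' AND ev.ts >= idle.end_ts
            GROUP BY idle.cpu
        """).fetchall()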

Case 3 requires a plotting tool like gnuplot or R. I don't know whether the TraceCompass designers want to
integrate such functionality, but at least TraceCompass should be able to export data for those tools.

In cases 1 and 2, I spent a lot of time analyzing a phenomenon which in the end turned out to be normal.
I think this is common in real performance analysis and tuning tasks: many ideas only
appear after the first trace has been collected. Tools like TraceCompass should therefore consider
ways to reduce the cost of this trial and error.

Thank you for reading this!

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-01-31  7:14   ` Wang Nan
  2015-02-04 20:24     ` Alexandre Montplaisir
@ 2015-02-04 20:24     ` Alexandre Montplaisir
  2015-02-07 13:22     ` Mathieu Desnoyers
  2015-02-07 13:22     ` Mathieu Desnoyers
  3 siblings, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-04 20:24 UTC (permalink / raw)
  To: Wang Nan
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

Hi Wang,

Once again, thanks a lot for this *very* valuable feedback! A good tool 
should be able to help in real-world use cases, like the ones you described.

Right now, Trace Compass is focused on showing information on the time 
scale. Views like the Control Flow and Resource Views are useful when 
you want to look at a particular location of a trace, and want to know 
exactly what the system is doing. However, when you are looking at a 
large trace completely zoomed-out, it's not extremely useful. It should 
be able to point out the "points of interest" in this trace, where the 
user can then zoom in and make use of the time-scale views.

We've been having many discussions internally about adding views on a 
frequency scale, rather than a time scale. For example, a view that 
could aggregate the latency of system calls (or IRQ handlers, or IO 
operations, etc.), and show them either as a histogram or as a point 
cloud, so that it becomes easy to see the "outliers" of the 
distribution. The user could then click on these outliers to be brought 
to the location in the trace where they happened.

These frequency views could provide the missing step of identifying 
points of interest in a trace, after which the user could switch to the 
time-based views to examine the highlighted locations in more details.
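
Just to sketch the concept (in Python pseudo-code rather than actual Trace Compass code): pair each 
sys_enter with its sys_exit per thread, build the latency distribution, and keep the outliers 
together with their timestamps so that a view could jump to them.

    # Concept sketch only: per-thread pairing of sys_enter/sys_exit into latencies,
    # then flag the outliers (here: more than nsigma standard deviations above the
    # mean) with their enter timestamps so a view could navigate to them.
    from statistics import mean, pstdev

    def syscall_latency_outliers(events, nsigma=3.0):
        """events: (ts, tid, name, syscall_id) tuples sorted by ts."""
        open_calls = {}        # (tid, syscall_id) -> sys_enter ts
        latencies = []         # (latency, syscall_id, sys_enter ts)
        for ts, tid, name, sc in events:
            if name == "raw_syscalls:sys_enter":
                open_calls[(tid, sc)] = ts
            elif name == "raw_syscalls:sys_exit" and (tid, sc) in open_calls:
                enter = open_calls.pop((tid, sc))
                latencies.append((ts - enter, sc, enter))
        if len(latencies) < 2:
            return []
        values = [lat for lat, _, _ in latencies]
        mu, sigma = mean(values), pstdev(values)
        return [item for item in latencies if item[0] > mu + nsigma * sigma]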

I think such an approach could help in all 3 use cases you have 
presented. What do you think?

Also, for known domains like the Linux kernel, we can provide pre-made 
frequency statistics, like system call latencies per syscall type and so 
on. However, it would be important to keep it customizable, since we can 
never predict all the use cases that users will need. That would 
probably be added a bit later, though.

We don't have a defined roadmap for the "frequency analysis" right now, 
but we will definitely try to have some working prototypes for our 1.0 
release in June. I will keep you (and these mailing lists) posted!


Some more comments below.


On 01/31/2015 02:14 AM, Wang Nan wrote:
>> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>>
>>> Finally I found the syscall which cause idle. However I need to write a
>>> script to do statistics. TraceCompass itself is lack a mean to count
>>> different events in my way.
>> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
> I'd like to describe some cases of ad-hoc statisics, which I have to write python
> scripts to do.
>
> *First case: matching sys_enter and sys_exit*
>
> The first case is to find the reason why most of CPUs are idle. From TraceCompass
> resource view, I find some gray gaps for about 300us. During these gaps, there is only
> 1 running CPU, all other CPUs are idle. I can find the reason why a particular CPU
> is idle using TraceCompass with following steps:
>
>    1. In TraceCompass resource view, click the idle gap of that CPU, find next event
>       with the 'Select Next Event' button, continous select next event until find
>       a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find what syscall
>       cause the CPU idle. (I have mentioned before, that in my case, each time when I click
>       that button, I have to wait for 8 to 10 seconds for the trace table update so I can kown
>       which event it is. This is painful for me...)

Yes, this was because the perf-CTF traces put all their content in one 
big packet. I think this is being worked on, right?

As for the missing system call names, we have 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=453361 opened about it. It 
should be possible to give knowledge of the IDs to the analysis to make 
it easier to identify the system call type.

>
>    2. Then I need to find corresponding "raw_syscalls:sys_enter" event to see when the syscall
>       is issued. I switch to control flow view then use 'Select Previous Event' to find it, then
>       back to resource view I can understand how long this syscall takes, whether the CPU
>       does some work or simply idle after the syscall is issued, and whether the task is scheduled
>       across CPUs.
>
>    3. For each CPU do step 1 and step 2.
>
> In some high-end servers the number of cores may exceeds 100. Even in my case the number of traced
> CPUs is 32. Doing such searching is time consuming. I have to write a python script to do that.
> My result is: half of cores are waiting on different futexs, half of then are waiting on
> pselect() (caused by sleep()).
>
> *Second case: matching futex WAKE and WAIT*
>
> Therefore the next case I'd like to share is to maching futex wait, futex wakeup and futex waken
> events. This time TraceCompass can't help much. However, the python script is also not very easy
> to design. I have to track CPU and process state transition by myself, match all futex sys_enter
> and sys_exit events, consider different cases including FUTEX_WAKE before FUTEX_WAIT, failures,
> timeout and compare retval of futex wake and the number of threads waken by it. This
> disposable python script has 115 lines of code (I have to admit that I'm not a very good python
> programmer), I create and debug it for serval hours.
>
> My final result is: threads wakeup each other in a tree-like manner. The first futex WAKE command is
> issued by the only running CPU, wakeup only one thread. It wakeups others, other wakeup more, finally
> nearly all CPUs are wakenup. There are some threads get executed after a relative long time
> after the corresponding futex WAKE command, even if there are idle CPUs at that time. Therefore we
> should look into scheduler. However, the gap itself should be a normal phenomenon.
>
> *Third case: device utilization*
>
> We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
> I deploy serval tracepoints at device driver to track device activities. However, I have to create
> some tools to draw cumulative curve and speed curve to find whether there is irregular idle. I use
> gnuplot for ploting, but have to write another python script to extrace data. I think languages like
> R should be useful in this case but I'm not familiary with it.
>
> *Conclusion*
>
> In this email I list 3 use cases related to Ad-Hoc statistics I mentioned earlier. Case 1 and 2 are
> in fact not a statistics problem. They should be considered as query problems. I suspect case
> 1 and 2 can be expressed using SQL. If TraceCompass can provide a query language like SQL, we can quickly
> find the information we need to know so will have more time for tuning. I expressed my SQL-like query
> idea on one of my early email:
>
> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>
> However I was not very sure the query problem we would meet.
>
> Case 3 requires a plotting tool like gnuplot or R. I don't know whether TraceCompass designers want to
> integrate such function, but at lease TraceCompass should export data for those tools.

Right now we use the SWTChart library (http://swtchart.org/) for 
outputting graphs like the CPU Usage graph, the UST memory usage view, a 
prototype of the Histogram, and a couple of others. We had also thought 
that we should provide an easy way to export the graphs, like 
right-click -> export to PNG/SVG, etc.

But internally, what we provide to SWTChart is basically an array of 
doubles. So it should be easy to have an internal abstraction layer 
that can export either to an SWTChart view, or to a text file to be read 
by R or Gnuplot and so on, if such a feature is deemed useful.


Cheers,
Alexandre

>
> In case 1 and 2, I spent a lot of time to analyze a phenomenon which is finally shown to be normal.
> I think this should be common in real performance analysis and tuning tasks. Many ideas may
> appear after the first trace is collected. I think tools like TraceCompass should consider a
> way to reduce the cost of trials and error.
>
> Thank you for reading this!
>
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-01-31  7:14   ` Wang Nan
@ 2015-02-04 20:24     ` Alexandre Montplaisir
  2015-02-07  3:13       ` Wang Nan
  2015-02-07  3:13       ` Wang Nan
  2015-02-04 20:24     ` Alexandre Montplaisir
                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-04 20:24 UTC (permalink / raw)
  To: Wang Nan
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

Hi Wang,

Once again, thanks a lot for this *very* valuable feedback! A good tool 
should be able to help in real-world use cases, like the ones you described.

Right now, Trace Compass is focused on showing information on the time 
scale. Views like the Control Flow and Resource Views are useful when 
you want to look at a particular location of a trace, and want to know 
exactly what the system is doing. However, when you are looking at a 
large trace completely zoomed-out, it's not extremely useful. It should 
be able to point out the "points of interest" in this trace, where the 
user can then zoom in and make use of the time-scale views.

We've been having many discussions internally about adding views on a 
frequency scale, rather than a time scale. For example, a view that 
could aggregate the latency of system calls (or IRQ handlers, or IO 
operations, etc.), and show them either as a histogram or as a point 
cloud, so that it becomes easy to see the "outliers" of the 
distribution. The user could then click on these outliers to be brought 
to the location in the trace where they happened.

These frequency views could provide the missing step of identifying 
points of interest in a trace, after which the user could switch to the 
time-based views to examine the highlighted locations in more details.

I think such an approach could help in all 3 use cases you have 
presented. What do you think?

Also, for known domains like the Linux kernel, we can provide pre-made 
frequency statistics, like system call latencies per syscall type and so 
on. However it would be important to keep it customizable, since we can 
never predict all the use cases that the users will need. That would 
probably be added a bit later though.

We don't have a defined roadmap for the "frequency analysis" right now, 
but we will definitely try to have some working prototypes for our 1.0 
release in June. I will keep you (and these mailing lists) posted!


Some more comments below.


On 01/31/2015 02:14 AM, Wang Nan wrote:
>> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>>
>>> Finally I found the syscall which cause idle. However I need to write a
>>> script to do statistics. TraceCompass itself is lack a mean to count
>>> different events in my way.
>> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
> I'd like to describe some cases of ad-hoc statisics, which I have to write python
> scripts to do.
>
> *First case: matching sys_enter and sys_exit*
>
> The first case is to find the reason why most of CPUs are idle. From TraceCompass
> resource view, I find some gray gaps for about 300us. During these gaps, there is only
> 1 running CPU, all other CPUs are idle. I can find the reason why a particular CPU
> is idle using TraceCompass with following steps:
>
>    1. In TraceCompass resource view, click the idle gap of that CPU, find next event
>       with the 'Select Next Event' button, continous select next event until find
>       a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find what syscall
>       cause the CPU idle. (I have mentioned before, that in my case, each time when I click
>       that button, I have to wait for 8 to 10 seconds for the trace table update so I can kown
>       which event it is. This is painful for me...)

Yes, this was because the perf-CTF traces put all their content in one 
big packet. I think this is being worked on, right?

As for the missing system call names, we have 
https://bugs.eclipse.org/bugs/show_bug.cgi?id=453361 opened about it. It 
should be possible to give knowledge of the id's to the analysis so that 
it makes it easier to identify the system call type.

>
>    2. Then I need to find corresponding "raw_syscalls:sys_enter" event to see when the syscall
>       is issued. I switch to control flow view then use 'Select Previous Event' to find it, then
>       back to resource view I can understand how long this syscall takes, whether the CPU
>       does some work or simply idle after the syscall is issued, and whether the task is scheduled
>       across CPUs.
>
>    3. For each CPU do step 1 and step 2.
>
> In some high-end servers the number of cores may exceeds 100. Even in my case the number of traced
> CPUs is 32. Doing such searching is time consuming. I have to write a python script to do that.
> My result is: half of cores are waiting on different futexs, half of then are waiting on
> pselect() (caused by sleep()).
>
> *Second case: matching futex WAKE and WAIT*
>
> Therefore the next case I'd like to share is to maching futex wait, futex wakeup and futex waken
> events. This time TraceCompass can't help much. However, the python script is also not very easy
> to design. I have to track CPU and process state transition by myself, match all futex sys_enter
> and sys_exit events, consider different cases including FUTEX_WAKE before FUTEX_WAIT, failures,
> timeout and compare retval of futex wake and the number of threads waken by it. This
> disposable python script has 115 lines of code (I have to admit that I'm not a very good python
> programmer), I create and debug it for serval hours.
>
> My final result is: threads wakeup each other in a tree-like manner. The first futex WAKE command is
> issued by the only running CPU, wakeup only one thread. It wakeups others, other wakeup more, finally
> nearly all CPUs are wakenup. There are some threads get executed after a relative long time
> after the corresponding futex WAKE command, even if there are idle CPUs at that time. Therefore we
> should look into scheduler. However, the gap itself should be a normal phenomenon.
>
> *Third case: device utilization*
>
> We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
> I deploy serval tracepoints at device driver to track device activities. However, I have to create
> some tools to draw cumulative curve and speed curve to find whether there is irregular idle. I use
> gnuplot for ploting, but have to write another python script to extrace data. I think languages like
> R should be useful in this case but I'm not familiary with it.
>
> *Conclusion*
>
> In this email I list 3 use cases related to Ad-Hoc statistics I mentioned earlier. Case 1 and 2 are
> in fact not a statistics problem. They should be considered as query problems. I suspect case
> 1 and 2 can be expressed using SQL. If TraceCompass can provide a query language like SQL, we can quickly
> find the information we need to know so will have more time for tuning. I expressed my SQL-like query
> idea on one of my early email:
>
> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>
> However I was not very sure the query problem we would meet.
>
> Case 3 requires a plotting tool like gnuplot or R. I don't know whether TraceCompass designers want to
> integrate such function, but at lease TraceCompass should export data for those tools.

Right now we use the SWTChart library (http://swtchart.org/) for 
outputting graphs like the CPU Usage graph, the UST memory usage view, a 
prototype of the Histogram, and a couple of others. We had also thought 
that we should provide an easy way to export the graphs, like 
right-click -> export to PNG/SVG, etc.

But internally, what we provide to SWTChart is basically an array of 
doubles. So it should be easy to have an internal abstraction layer 
that can export either to an SWTChart view, or to a text file to be read 
by R or Gnuplot and so on, if such a feature is deemed useful.
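
For example, a dumped series could be as simple as one (x, y) pair per
line, which gnuplot or R can read directly. This is purely hypothetical,
nothing like it exists in Trace Compass today:

# Hypothetical export of one chart series as a plain text file.
# gnuplot:  plot "cpu_usage.dat" using 1:2 with lines
# R:        read.table("cpu_usage.dat")
series = [(0.0, 12.5), (1.0, 40.2), (2.0, 38.7), (3.0, 5.1)]  # made-up samples

with open("cpu_usage.dat", "w") as out:
    out.write("# time_s cpu_usage_percent\n")
    for x, y in series:
        out.write("%f %f\n" % (x, y))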


Cheers,
Alexandre

>
> In case 1 and 2, I spent a lot of time to analyze a phenomenon which is finally shown to be normal.
> I think this should be common in real performance analysis and tuning tasks. Many ideas may
> appear after the first trace is collected. I think tools like TraceCompass should consider a
> way to reduce the cost of trials and error.
>
> Thank you for reading this!
>
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-02-04 20:24     ` Alexandre Montplaisir
@ 2015-02-07  3:13       ` Wang Nan
  2015-02-07  5:14         ` Alexandre Montplaisir
  2015-02-07  5:14         ` Alexandre Montplaisir
  2015-02-07  3:13       ` Wang Nan
  1 sibling, 2 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-07  3:13 UTC (permalink / raw)
  To: Alexandre Montplaisir
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

On 2015/2/5 4:24, Alexandre Montplaisir wrote:
> Hi Wang,
> 
> Once again, thanks a lot for this *very* valuable feedback! A good tool should be able to help in real-world use cases, like the ones you described.
> 
> Right now, Trace Compass is focused on showing information on the time scale. Views like the Control Flow and Resource Views are useful when you want to look at a particular location of a trace, and want to know exactly what the system is doing. However, when you are looking at a large trace completely zoomed-out, it's not extremely useful. It should be able to point out the "points of interest" in this trace, where the user can then zoom in and make use of the time-scale views.
> 
> We've been having many discussions internally about adding views on a frequency scale, rather than a time scale. For example, a view that could aggregate the latency of system calls (or IRQ handlers, or IO operations, etc.), and show them either as a histogram or as a point cloud, so that it becomes easy to see the "outliers" of the distribution. The user could then click on these outliers to be brought to the location in the trace where they happened.
> 
> These frequency views could provide the missing step of identifying points of interest in a trace, after which the user could switch to the time-based views to examine the highlighted locations in more details.
> 
> I think such an approach could help in all 3 use cases you have presented. What do you think?
> 

Good to see you are looking into this problem.

The "frequency analysis" you mentioned is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem, "finding the reason why most CPUs are idle by matching syscall events"?

Thank you!

> Also, for known domains like the Linux kernel, we can provide pre-made frequency statistics, like system call latencies per syscall type and so on. However it would be important to keep it customizable, since we can never predict all the use cases that the users will need. That would probably be added a bit later though.
> 
> We don't have a defined roadmap for the "frequency analysis" right now, but we will definitely try to have some working prototypes for our 1.0 release in June. I will keep you (and these mailing lists) posted!
> 
> 
> Some more comments below.
> 
> 
> On 01/31/2015 02:14 AM, Wang Nan wrote:
>>> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>>>
>>>> Finally I found the syscall which cause idle. However I need to write a
>>>> script to do statistics. TraceCompass itself is lack a mean to count
>>>> different events in my way.
>>> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
>> I'd like to describe some cases of ad-hoc statisics, which I have to write python
>> scripts to do.
>>
>> *First case: matching sys_enter and sys_exit*
>>
>> The first case is to find the reason why most of CPUs are idle. From TraceCompass
>> resource view, I find some gray gaps for about 300us. During these gaps, there is only
>> 1 running CPU, all other CPUs are idle. I can find the reason why a particular CPU
>> is idle using TraceCompass with following steps:
>>
>>    1. In TraceCompass resource view, click the idle gap of that CPU, find next event
>>       with the 'Select Next Event' button, continous select next event until find
>>       a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find what syscall
>>       cause the CPU idle. (I have mentioned before, that in my case, each time when I click
>>       that button, I have to wait for 8 to 10 seconds for the trace table update so I can kown
>>       which event it is. This is painful for me...)
> 
> Yes, this was because the perf-CTF traces put all their content in one big packet. I think this is being worked on, right?
> 
> As for the missing system call names, we have https://bugs.eclipse.org/bugs/show_bug.cgi?id=453361 opened about it. It should be possible to give knowledge of the id's to the analysis so that it makes it easier to identify the system call type.
> 
>>
>>    2. Then I need to find corresponding "raw_syscalls:sys_enter" event to see when the syscall
>>       is issued. I switch to control flow view then use 'Select Previous Event' to find it, then
>>       back to resource view I can understand how long this syscall takes, whether the CPU
>>       does some work or simply idle after the syscall is issued, and whether the task is scheduled
>>       across CPUs.
>>
>>    3. For each CPU do step 1 and step 2.
>>
>> In some high-end servers the number of cores may exceeds 100. Even in my case the number of traced
>> CPUs is 32. Doing such searching is time consuming. I have to write a python script to do that.
>> My result is: half of cores are waiting on different futexs, half of then are waiting on
>> pselect() (caused by sleep()).
>>
>> *Second case: matching futex WAKE and WAIT*
>>
>> Therefore the next case I'd like to share is to maching futex wait, futex wakeup and futex waken
>> events. This time TraceCompass can't help much. However, the python script is also not very easy
>> to design. I have to track CPU and process state transition by myself, match all futex sys_enter
>> and sys_exit events, consider different cases including FUTEX_WAKE before FUTEX_WAIT, failures,
>> timeout and compare retval of futex wake and the number of threads waken by it. This
>> disposable python script has 115 lines of code (I have to admit that I'm not a very good python
>> programmer), I create and debug it for serval hours.
>>
>> My final result is: threads wakeup each other in a tree-like manner. The first futex WAKE command is
>> issued by the only running CPU, wakeup only one thread. It wakeups others, other wakeup more, finally
>> nearly all CPUs are wakenup. There are some threads get executed after a relative long time
>> after the corresponding futex WAKE command, even if there are idle CPUs at that time. Therefore we
>> should look into scheduler. However, the gap itself should be a normal phenomenon.
>>
>> *Third case: device utilization*
>>
>> We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
>> I deploy serval tracepoints at device driver to track device activities. However, I have to create
>> some tools to draw cumulative curve and speed curve to find whether there is irregular idle. I use
>> gnuplot for ploting, but have to write another python script to extrace data. I think languages like
>> R should be useful in this case but I'm not familiary with it.
>>
>> *Conclusion*
>>
>> In this email I list 3 use cases related to Ad-Hoc statistics I mentioned earlier. Case 1 and 2 are
>> in fact not a statistics problem. They should be considered as query problems. I suspect case
>> 1 and 2 can be expressed using SQL. If TraceCompass can provide a query language like SQL, we can quickly
>> find the information we need to know so will have more time for tuning. I expressed my SQL-like query
>> idea on one of my early email:
>>
>> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>>
>> However I was not very sure the query problem we would meet.
>>
>> Case 3 requires a plotting tool like gnuplot or R. I don't know whether TraceCompass designers want to
>> integrate such function, but at lease TraceCompass should export data for those tools.
> 
> Right now we use the SWTChart library (http://swtchart.org/) for outputting graphs like the CPU Usage graph, the UST memory usage view, a prototype of the Histogram, and a couple others. We had also thought that we should provide an easy way to export the graphs, like right-click -> export to PNG/SVG, etc.
> 
> But internally, what we provide to SWTChart is basically an array of double's. So it should be easy to have an internal abstraction layer that can export either to an SWTChart view, or to a text file to be read by R or Gnuplot and so on, if such a feature is deemed useful.
> 
> 
> Cheers,
> Alexandre
> 
>>
>> In case 1 and 2, I spent a lot of time to analyze a phenomenon which is finally shown to be normal.
>> I think this should be common in real performance analysis and tuning tasks. Many ideas may
>> appear after the first trace is collected. I think tools like TraceCompass should consider a
>> way to reduce the cost of trials and error.
>>
>> Thank you for reading this!
>>
>>
> 



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-02-07  3:13       ` Wang Nan
@ 2015-02-07  5:14         ` Alexandre Montplaisir
  2015-02-07  7:25           ` Wang Nan
  2015-02-07  7:25           ` Wang Nan
  2015-02-07  5:14         ` Alexandre Montplaisir
  1 sibling, 2 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-07  5:14 UTC (permalink / raw)
  To: Wang Nan
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

On 2015-02-06 10:13 PM, Wang Nan wrote:
>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>> >
> Good to see you are looking at this problem.
>
> "Frequency analysis" you mentioned is a good viewpoint for finding outliner. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem "finding the reason why most of CPUs are idle by matching syscalls events?"
>
> Thank you!
>

I was thinking of the automatic event matching, for example matching 
syscall_entry_* events with their corresponding syscall_exit_*. This is 
a prerequisite for doing frequency analysis, but could be useful on its 
own. (Re-reading myself now I notice I didn't mention event matching at 
all, my bad!)

If I understand your first problem correctly, it boils down to wanting 
to identify the system call that is ongoing when a CPU is idle. And this 
is not straightforward, because the Select Next/Previous Event buttons 
in the time graph views will stop at every state transition, like CPU 
idle or IRQs, which are "in the way" of the system call entry event. 
Correct?


Now, what if we had a view that simply lists all the system calls in the 
trace? Each row would contain the complete information of a system call, 
so its start time (timestamp of the sys_entry event), end time 
(timestamp of the sys_exit event), duration, name/id, arguments, return 
value, etc.

And this view could be synchronized with the other ones, where clicking 
on a row would bring you to the corresponding syscall_entry event. And 
inversely, clicking on any timestamp would bring this view to the row 
corresponding to the system call that was active at that time, if there 
is one. I believe this could speed up your problem of identifying the 
running system call for any arbitrary point in the trace.

Also, that view could be re-used for other types of intervals, like 
IRQs, IO operations, and so on. And if the user sorts by the Duration 
column, bam, they have a sorted list of the worst offenders: the longest 
system calls, IRQs, etc.
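
Internally, such a table boils down to pairing each entry event with its
exit event per thread and keeping the resulting intervals. A minimal
sketch in Python, with made-up events instead of the real perf/CTF field
layout:

events = [
    # (timestamp, tid, kind, name) -- made-up data
    (10, 101, "entry", "open"),
    (15, 101, "exit",  "open"),
    (12, 202, "entry", "futex"),
    (40, 202, "exit",  "futex"),
]

open_calls = {}   # tid -> (start, name) of the syscall currently in flight
intervals = []    # rows of the table: (start, end, duration, tid, name)

for ts, tid, kind, name in sorted(events):
    if kind == "entry":
        open_calls[tid] = (ts, name)
    elif kind == "exit" and tid in open_calls:
        start, name = open_calls.pop(tid)
        intervals.append((start, ts, ts - start, tid, name))

def ongoing_at(t):
    """Rows whose interval covers timestamp t (syscalls ongoing at t)."""
    return [row for row in intervals if row[0] <= t <= row[1]]

print(ongoing_at(30))                                # the futex call
print(sorted(intervals, key=lambda row: -row[2]))    # worst offenders first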


Would this be helpful?

Cheers,
Alexandre

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-02-07  5:14         ` Alexandre Montplaisir
@ 2015-02-07  7:25           ` Wang Nan
  2015-02-09 20:13             ` Alexandre Montplaisir
  2015-02-09 20:13             ` Alexandre Montplaisir
  2015-02-07  7:25           ` Wang Nan
  1 sibling, 2 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-07  7:25 UTC (permalink / raw)
  To: Alexandre Montplaisir
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

On 2015/2/7 13:14, Alexandre Montplaisir wrote:
> On 2015-02-06 10:13 PM, Wang Nan wrote:
>>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>>> >
>> Good to see you are looking at this problem.
>>
>> "Frequency analysis" you mentioned is a good viewpoint for finding outliner. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem "finding the reason why most of CPUs are idle by matching syscalls events?"
>>
>> Thank you!
>>
> 
> I was thinking of the automatic event matching, for example matching syscall_entry_* events with their corresponding syscall_exit_*. This is a pre-requisite of doing frequency analysis, but could be useful on its own. (Re-reading myself now I notice I didn't mention event matching at all, my bad!)
> 
> If I understand your first problem correctly, it boils down to wanting to identify the system call that is ongoing when a CPU is idle. And this is not straightforward, because the Select Next/Previous Event buttons in the time graph views will stop at every state transition, like CPU idle or IRQs, which are "in the way" of the system call entry event. Correct?
> 
> 
> Now, what if we had a view that simply lists all the system calls in the trace? Each row would contain the complete information of a system call, so its start time (timestamp of the sys_entry event), end time (timestamp of the sys_exit event), duration, name/id, arguments, return value, etc.
> 
> And this view could be synchronized with the other ones, where clicking on a row would bring you to the corresponding syscall_entry event. And inversely, clicking on any timestamp would bring this view to the row corresponding to the system call that was active at that time, if there is one. I believe this could speed up your problem of identifying the running system call for any arbitrary point in the trace.
> 
> Also, that view could be re-used for other types of intervals, like IRQs, IO operations, and so on. And if the user sorts by the Duration column, bam, they have a sorted list of the worst offenders for longer system calls, IRQs, etc.
> 
> 
> Would this be helpful?
> 

Correct, it should be helpful. If I understand your description correctly, you are thinking of another time
chart view like the resource view and control flow view, where this time each row is a syscall, like this:

         <-------- time -------->
          /-------------------
sys_write |      ***
          +-------------------
sys_read  |   ***        ****
          +-------------------
sys_poll  |  **************
          +-------------------
.................

And it will be synchronized with the other views. So if I click on the idle area in the resource view, then switch
to this new view, I can see which syscall is ongoing at that time.

However, unlike CPUs, syscall types are not stateful entities. If we are going to draw such a diagram, it should look like:

                    <-------- time -------->
                     /-------------------
sys_write(pid: 1234) |      ***
                     +-------------------
sys_write(pid: 5678) |
                     +-------------------
sys_read(pid: 5678)  |              ****
                     +-------------------
sys_read(pid: 4321)  |    ****
                     +------------------
sys_poll(pid: 1212)  |      *******
                     +-------------------
sys_poll(pid: 1213)  |  **************
                     +-------------------
sys_poll(pid: 1214)  |   *****
                     +-------------------
.................

We must draw one row for each syscall (there are hundreds of them) issued by each pid (also hundreds of them).
This view will be extremely large and sparse, and we need a means of sorting the rows to find the meaningful
ones at a specific time. But correct, it is still useful.

We can also aggregate the same syscalls and use different colors to express how many calls are ongoing at that
time, like this:

         <-------- time -------->
          /-------------------
sys_write |      ~~~
          +-------------------
sys_read  |    ~~~~      ~~~~
          +------------------
sys_poll  |  ~~~~==!!====~~~
          +-------------------
.................

Is your "frequency analysis" something like that?



However, when I was working on the first problem, what I thought about was an SQL query like:

select events.*
from events
inner join
  ( select cpu, max(timestamp) as lasttime
    from events
    where timestamp < xxxxxxx
      and type = 'raw_syscalls:sys_exit'
    group by cpu ) X
on events.timestamp = X.lasttime
   and events.cpu = X.cpu
   and events.type = 'raw_syscalls:sys_exit'

(I'm not sure whether it works as-is or needs some debugging, but I think it expresses my idea.)
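
For comparison, the same lookup written as a throwaway script would look
roughly like this (illustrative Python; the event tuples and fields are
assumptions, not the real perf data layout):

T = 123456  # the timestamp of interest (the "xxxxxxx" above)

events = [
    # (timestamp, cpu, type, id) -- made-up data
    (100000, 0, "raw_syscalls:sys_exit", 202),
    (120000, 0, "raw_syscalls:sys_exit", 1),
    (110000, 1, "raw_syscalls:sys_exit", 98),
    (130000, 1, "raw_syscalls:sys_exit", 98),
]

last_exit = {}  # cpu -> the sys_exit event with the largest timestamp < T
for ev in events:
    ts, cpu, etype, _ = ev
    if etype == "raw_syscalls:sys_exit" and ts < T:
        if cpu not in last_exit or ts > last_exit[cpu][0]:
            last_exit[cpu] = ev

for cpu, ev in sorted(last_exit.items()):
    print("cpu %d: last sys_exit before %d is %r" % (cpu, T, ev))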

I think a query like that should be more expressive than a time chart (but yes, SQL is not well suited
to querying events). Think about my 'matching futex WAIT and WAKE' case: there, I must first define my
futex model in XML, then pass it to TraceCompass and reprocess the trace before I can open the new
view.

> Cheers,
> Alexandre



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-01-31  7:14   ` Wang Nan
  2015-02-04 20:24     ` Alexandre Montplaisir
  2015-02-04 20:24     ` Alexandre Montplaisir
@ 2015-02-07 13:22     ` Mathieu Desnoyers
  2015-02-07 13:22     ` Mathieu Desnoyers
  3 siblings, 0 replies; 35+ messages in thread
From: Mathieu Desnoyers @ 2015-02-07 13:22 UTC (permalink / raw)
  To: Wang Nan
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

----- Original Message -----
> From: "Wang Nan" <wangnan0@huawei.com>
> To: "Alexandre Montplaisir" <alexmonthy@voxpopuli.im>
> Cc: "Xinwei Hu" <huxinwei@huawei.com>, diamon-discuss@lists.linuxfoundation.org, lttng-dev@lists.lttng.org,
> "tracecompass developer discussions" <tracecompass-dev@eclipse.org>
> Sent: Saturday, January 31, 2015 2:14:18 AM
> Subject: Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
> 
> 
> > This is a gap in the definition of the analysis it seems. I don't remember
> > implementing two types of "syscall" events in the perf analysis, so it
> > should just be a matter of getting the exact event name and adding it to
> > the list. I will take a look and keep you posted!
> > 
> >> Finally I found the syscall which cause idle. However I need to write a
> >> script to do statistics. TraceCompass itself is lack a mean to count
> >> different events in my way.
> > 
> > Could you elaborate on this please? I agree the "Statistics" view in TC is
> > severely lacking, we could be gathering and displaying much more
> > information. The only question is what information would actually be
> > useful.
> 
> I'd like to describe some cases of ad-hoc statisics, which I have to write
> python
> scripts to do.
> 
> *First case: matching sys_enter and sys_exit*
> 
> The first case is to find the reason why most of CPUs are idle. From
> TraceCompass
> resource view, I find some gray gaps for about 300us. During these gaps,
> there is only
> 1 running CPU, all other CPUs are idle. I can find the reason why a
> particular CPU
> is idle using TraceCompass with following steps:
> 
>   1. In TraceCompass resource view, click the idle gap of that CPU, find next
>   event
>      with the 'Select Next Event' button, continous select next event until
>      find
>      a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find
>      what syscall
>      cause the CPU idle. (I have mentioned before, that in my case, each time
>      when I click
>      that button, I have to wait for 8 to 10 seconds for the trace table
>      update so I can kown
>      which event it is. This is painful for me...)

For this use case, I think the kind of analysis you need is a "critical path
analysis", which can identify what made each task you care about block during
your time range. See work published by Francis Giraldeau on this topic.

A CPU being idle is a side effect of the tasks being blocked. If you had
a lower-priority task running in the background, that task would run instead
of the CPU idle loop. Therefore, the idle CPU is really just a side effect, not
a cause.

Thanks!

Mathieu

> 
>   2. Then I need to find corresponding "raw_syscalls:sys_enter" event to see
>   when the syscall
>      is issued. I switch to control flow view then use 'Select Previous
>      Event' to find it, then
>      back to resource view I can understand how long this syscall takes,
>      whether the CPU
>      does some work or simply idle after the syscall is issued, and whether
>      the task is scheduled
>      across CPUs.
> 
>   3. For each CPU do step 1 and step 2.
> 
> In some high-end servers the number of cores may exceeds 100. Even in my case
> the number of traced
> CPUs is 32. Doing such searching is time consuming. I have to write a python
> script to do that.
> My result is: half of cores are waiting on different futexs, half of then are
> waiting on
> pselect() (caused by sleep()).
> 
> *Second case: matching futex WAKE and WAIT*
> 
> Therefore the next case I'd like to share is to maching futex wait, futex
> wakeup and futex waken
> events. This time TraceCompass can't help much. However, the python script is
> also not very easy
> to design. I have to track CPU and process state transition by myself, match
> all futex sys_enter
> and sys_exit events, consider different cases including FUTEX_WAKE before
> FUTEX_WAIT, failures,
> timeout and compare retval of futex wake and the number of threads waken by
> it. This
> disposable python script has 115 lines of code (I have to admit that I'm not
> a very good python
> programmer), I create and debug it for serval hours.
> 
> My final result is: threads wakeup each other in a tree-like manner. The
> first futex WAKE command is
> issued by the only running CPU, wakeup only one thread. It wakeups others,
> other wakeup more, finally
> nearly all CPUs are wakenup. There are some threads get executed after a
> relative long time
> after the corresponding futex WAKE command, even if there are idle CPUs at
> that time. Therefore we
> should look into scheduler. However, the gap itself should be a normal
> phenomenon.
> 
> *Third case: device utilization*
> 
> We have a high-speed storage device, but writing to filesystems on it is not
> as fast as we expected.
> I deploy serval tracepoints at device driver to track device activities.
> However, I have to create
> some tools to draw cumulative curve and speed curve to find whether there is
> irregular idle. I use
> gnuplot for ploting, but have to write another python script to extrace data.
> I think languages like
> R should be useful in this case but I'm not familiary with it.
> 
> *Conclusion*
> 
> In this email I list 3 use cases related to Ad-Hoc statistics I mentioned
> earlier. Case 1 and 2 are
> in fact not a statistics problem. They should be considered as query
> problems. I suspect case
> 1 and 2 can be expressed using SQL. If TraceCompass can provide a query
> language like SQL, we can quickly
> find the information we need to know so will have more time for tuning. I
> expressed my SQL-like query
> idea on one of my early email:
> 
> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
> 
> However I was not very sure the query problem we would meet.
> 
> Case 3 requires a plotting tool like gnuplot or R. I don't know whether
> TraceCompass designers want to
> integrate such function, but at lease TraceCompass should export data for
> those tools.
> 
> In case 1 and 2, I spent a lot of time to analyze a phenomenon which is
> finally shown to be normal.
> I think this should be common in real performance analysis and tuning tasks.
> Many ideas may
> appear after the first trace is collected. I think tools like TraceCompass
> should consider a
> way to reduce the cost of trials and error.
> 
> Thank you for reading this!
> 
> 
> _______________________________________________
> diamon-discuss mailing list
> diamon-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
> 

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-02-07  7:25           ` Wang Nan
  2015-02-09 20:13             ` Alexandre Montplaisir
@ 2015-02-09 20:13             ` Alexandre Montplaisir
  2015-02-10  4:55               ` Wang Nan
  2015-02-10  4:55               ` Wang Nan
  1 sibling, 2 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-09 20:13 UTC (permalink / raw)
  To: Wang Nan
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions


On 02/07/2015 02:25 AM, Wang Nan wrote:
> On 2015/2/7 13:14, Alexandre Montplaisir wrote:
>> On 2015-02-06 10:13 PM, Wang Nan wrote:
>>>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>>> Good to see you are looking at this problem.
>>>
>>> "Frequency analysis" you mentioned is a good viewpoint for finding outliner. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem "finding the reason why most of CPUs are idle by matching syscalls events?"
>>>
>>> Thank you!
>>>
>> I was thinking of the automatic event matching, for example matching syscall_entry_* events with their corresponding syscall_exit_*. This is a pre-requisite of doing frequency analysis, but could be useful on its own. (Re-reading myself now I notice I didn't mention event matching at all, my bad!)
>>
>> If I understand your first problem correctly, it boils down to wanting to identify the system call that is ongoing when a CPU is idle. And this is not straightforward, because the Select Next/Previous Event buttons in the time graph views will stop at every state transition, like CPU idle or IRQs, which are "in the way" of the system call entry event. Correct?
>>
>>
>> Now, what if we had a view that simply lists all the system calls in the trace? Each row would contain the complete information of a system call, so its start time (timestamp of the sys_entry event), end time (timestamp of the sys_exit event), duration, name/id, arguments, return value, etc.
>>
>> And this view could be synchronized with the other ones, where clicking on a row would bring you to the corresponding syscall_entry event. And inversely, clicking on any timestamp would bring this view to the row corresponding to the system call that was active at that time, if there is one. I believe this could speed up your problem of identifying the running system call for any arbitrary point in the trace.
>>
>> Also, that view could be re-used for other types of intervals, like IRQs, IO operations, and so on. And if the user sorts by the Duration column, bam, they have a sorted list of the worst offenders for longer system calls, IRQs, etc.
>>
>>
>> Would this be helpful?
>>
> Correct, it should be helpful. If I understand your description correctly, you are thinking another time
> chart view like resource view and control flow view, this time each row is a syscall, like this:

In fact, for the "event matching" what I had in mind was a simple text 
table, where every line represents one (entire) system call, so you 
would have columns like:

start | end | duration | name/id |     arguments       | return
------|-----|----------|---------|---------------------|-------
10    | 15  |    5     | open    | path=abc, flags=... |   4
20    | 23  |    3     | close   | fd=4                |   0
....


It would then be easy to synchronize this view with the others, so that 
it would point to the ongoing system call(s) whenever the user selects a 
timestamp in another view. And vice versa, clicking on an entry in this 
table would move to the corresponding syscall_entry event.

The view could also be generalized for other things (IRQs?), either in 
different views, or the same view with a drop-down showing which type of 
information to display.
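
To make the idea concrete, here is a rough, untested Python sketch of
the entry/exit pairing behind such a table (not TraceCompass code, just
the concept; it assumes events arrive as plain dicts sorted by
timestamp, with the "id"/"args"/"ret" fields of the raw_syscalls
tracepoints):

# Pair syscall entry/exit events into "one row per system call".
# Assumed event shape (illustrative, not perf's exact schema):
#   {"name": "raw_syscalls:sys_enter", "ts": 10, "tid": 42, "id": 2, "args": "..."}
#   {"name": "raw_syscalls:sys_exit",  "ts": 15, "tid": 42, "id": 2, "ret": 4}
def match_syscalls(events):
    pending = {}        # tid -> entry event still waiting for its exit
    rows = []
    for ev in events:   # events must be sorted by timestamp
        if ev["name"] == "raw_syscalls:sys_enter":
            pending[ev["tid"]] = ev
        elif ev["name"] == "raw_syscalls:sys_exit":
            entry = pending.pop(ev["tid"], None)
            if entry is not None and entry["id"] == ev["id"]:
                rows.append({
                    "start": entry["ts"],
                    "end": ev["ts"],
                    "duration": ev["ts"] - entry["ts"],
                    "id": ev["id"],
                    "args": entry.get("args"),
                    "ret": ev.get("ret"),
                })
    return rows

Sorting those rows by "duration" then directly gives the "worst
offenders" list mentioned above.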


And for the so-called frequency analysis, you would use the data from the
"duration" column to populate a histogram (or point cloud?) showing the
distribution of the latency of every system call (the Y axis represents the
absolute number of occurrences):


|
|           ___
|           ___
|     ___   ___         ___
|__   ___   ___   ___   ___
+-----------------------------
  0-5  5-10  10-15 15-20 20+
    duration of syscall (ms)


So in this example we could see right away that the average duration is 
around 10-15 ms, but we have some outliers that took more than 20 ms, so 
it would be interesting to zoom in there to get more details. And 
eventually clicking on one particular occurrence could bring you to the 
exact point in the trace where it happened (or create bookmarks for the 
outlier entries, etc.)
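
As an aside, the bucketing behind such a histogram is trivial to
prototype offline; a minimal Python sketch, assuming the matched rows
from above with durations in milliseconds:

from collections import Counter

def duration_histogram(durations_ms, bucket_ms=5, max_bucket=20):
    # Count how many system calls fall into each 5 ms duration bucket.
    buckets = Counter()
    for d in durations_ms:
        if d >= max_bucket:
            buckets["20+"] += 1
        else:
            low = int(d // bucket_ms) * bucket_ms
            buckets["%d-%d" % (low, low + bucket_ms)] += 1
    return buckets

# duration_histogram([1, 7, 12, 13, 14, 17, 23])
# -> Counter({'10-15': 3, '0-5': 1, '5-10': 1, '15-20': 1, '20+': 1})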


> [...]
>
>
> However, when I working on the first problem, what I think about is a SQL query like:
>
> select * from events
> inner join
>    ( select max(timestamp) as lasttime from events groupby cpu sortby timestamp where
>      events.timestamp < xxxxxxx and
>      events.type == "raw_syscalls:sys_exit"
>      group by cpu ) X
> on
> events.timestamp = X.lasttime and
> events.cpu = X.cpu and
> events.type = "raw_syscalls:sys_exit"
>
> (I'm not sure whether it works or need some debugging, but I think my idea is expressed.)

Interesting, and what would the output of such a query look like? Would
it be a flat list of events, or, with the "group by", something more like
a tree? Like this:

CPU0
   |__event 1
   |__event 2
   |__...
CPU1
   |__event 4
   |__event 5
   |__...
...

?

We already have the Event table, on which we can apply filters of this 
type (timestamp == something && cpu == otherthing...). But we only show 
the results in the same flat list. I'm not sure what would be best here. 
Showing results in a separate view, or supporting sub-trees in the base 
Event table?
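
(For what it's worth, grouping an already-filtered flat list into such
per-CPU sub-trees is cheap to do as a post-processing step; a toy
Python sketch, assuming each event dict carries a "cpu" field:)

from collections import defaultdict

def group_by_cpu(filtered_events):
    # Turn a flat, filtered event list into {cpu: [events...]}.
    tree = defaultdict(list)
    for ev in filtered_events:
        tree[ev["cpu"]].append(ev)
    return dict(tree)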


Cheers,
Alexandre

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-02-09 20:13             ` Alexandre Montplaisir
  2015-02-10  4:55               ` Wang Nan
@ 2015-02-10  4:55               ` Wang Nan
  2015-02-10  8:02                 ` Alexandre Montplaisir
  2015-02-10  8:02                 ` Alexandre Montplaisir
  1 sibling, 2 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-10  4:55 UTC (permalink / raw)
  To: Alexandre Montplaisir
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions

On 2015/2/10 4:13, Alexandre Montplaisir wrote:
> 
> On 02/07/2015 02:25 AM, Wang Nan wrote:
>> On 2015/2/7 13:14, Alexandre Montplaisir wrote:
>>> On 2015-02-06 10:13 PM, Wang Nan wrote:
>>>>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>>>> Good to see you are looking at this problem.
>>>>
>>>> "Frequency analysis" you mentioned is a good viewpoint for finding outliner. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem "finding the reason why most of CPUs are idle by matching syscalls events?"
>>>>
>>>> Thank you!
>>>>
>>> I was thinking of the automatic event matching, for example matching syscall_entry_* events with their corresponding syscall_exit_*. This is a pre-requisite of doing frequency analysis, but could be useful on its own. (Re-reading myself now I notice I didn't mention event matching at all, my bad!)
>>>
>>> If I understand your first problem correctly, it boils down to wanting to identify the system call that is ongoing when a CPU is idle. And this is not straightforward, because the Select Next/Previous Event buttons in the time graph views will stop at every state transition, like CPU idle or IRQs, which are "in the way" of the system call entry event. Correct?
>>>
>>>
>>> Now, what if we had a view that simply lists all the system calls in the trace? Each row would contain the complete information of a system call, so its start time (timestamp of the sys_entry event), end time (timestamp of the sys_exit event), duration, name/id, arguments, return value, etc.
>>>
>>> And this view could be synchronized with the other ones, where clicking on a row would bring you to the corresponding syscall_entry event. And inversely, clicking on any timestamp would bring this view to the row corresponding to the system call that was active at that time, if there is one. I believe this could speed up your problem of identifying the running system call for any arbitrary point in the trace.
>>>
>>> Also, that view could be re-used for other types of intervals, like IRQs, IO operations, and so on. And if the user sorts by the Duration column, bam, they have a sorted list of the worst offenders for longer system calls, IRQs, etc.
>>>
>>>
>>> Would this be helpful?
>>>
>> Correct, it should be helpful. If I understand your description correctly, you are thinking another time
>> chart view like resource view and control flow view, this time each row is a syscall, like this:
> 
> In fact, for the "event matching" what I had in mind was a simple text table, where every line represents one (entire) system call, so you would have columns like:
> 
> start | end | duration | name/id |     arguments       | return
> 
> ------|-----|----------|---------|---------------------|-------
> 10    | 15  |    5     | open    | path=abc, flags=... |   4
> 20    | 23  |    3     | close   | fd=4                |   0
> ....
> 
> 
> It would then be easy to synchronize this view with the others, so that it would point to the ongoing system call(s) whenever the user selects a timestamp in another view. And vice versa, clicking on an entry in this table would move to the corresponding syscall_entry event.
> 
> The view could also be generalized for other things (IRQs?), either in different views, or the same view with a drop-down showing which type of information to display.
> 
> 
> And for the so-called frequency analysis, you use the data from the "duration" column to populate a histogram (or point cloud?) showing how the distribution of latency of every system call (Y axis represents the absolute number of occurrences):
> 
> 
> |
> 
> |           ___
> 
> |           ___
> 
> |     ___   ___         ___
> 
> |__   ___   ___   ___   ___
> 
> +-----------------------------
> 
>  0-5  5-10  10-15 15-20 20+
>    duration of syscall (ms)
> 
> 
> So in this example we could see right away that the average duration is around 10-15 ms, but we have some outliers that took more than 20 ms, so it would be interesting to zoom in there to get more details. And eventually clicking on one particular occurrence could bring you to the exact point in the trace where it happened (or create bookmarks for the outlier entries, etc.)
> 

Looks good. It would be even better if users were able to define their own matching rules.

In addition, have you thought about chaining more than 2 events together? Some procedures, such as IO and network, can be traced at different layers, and we are able to chain events caused by the same source together with some rules. Please refer to:

http://comments.gmane.org/gmane.linux.kernel/1883569

which we are working on.

> 
>> [...]
>>
>>
>> However, when I working on the first problem, what I think about is a SQL query like:
>>
>> select * from events
>> inner join
>>    ( select max(timestamp) as lasttime from events groupby cpu sortby timestamp where
>>      events.timestamp < xxxxxxx and
>>      events.type == "raw_syscalls:sys_exit"
>>      group by cpu ) X
>> on
>> events.timestamp = X.lasttime and
>> events.cpu = X.cpu and
>> events.type = "raw_syscalls:sys_exit"
>>
>> (I'm not sure whether it works or need some debugging, but I think my idea is expressed.)
> 
> Interesting, and what would the output of such a query look like? Would it be a list of events, but with the "groupby" that would be more like a tree? Like this:
> 
> CPU0
>   |__event 1
>   |__event 2
>   |__...
> CPU1
>   |__event 4
>   |__event 5
>   |__...
> ...
> 
> ?
> 

Correct. Even a flat table with sorting and filtering enabled would be helpful.
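
For the record, a cleaned-up (still untested) version of the query I
sketched above, written against a hypothetical sqlite table of events
with timestamp/cpu/type columns, could look like this in Python:

import sqlite3

# Hypothetical schema: events(timestamp INTEGER, cpu INTEGER, type TEXT, ...)
QUERY = """
SELECT e.*
FROM events AS e
JOIN (
    SELECT cpu, MAX(timestamp) AS lasttime
    FROM events
    WHERE timestamp < :cursor
      AND type = 'raw_syscalls:sys_exit'
    GROUP BY cpu
) AS x
  ON e.cpu = x.cpu AND e.timestamp = x.lasttime
WHERE e.type = 'raw_syscalls:sys_exit'
"""

def last_exit_per_cpu(db_path, cursor_ts):
    # Last raw_syscalls:sys_exit event on each CPU before the selected time.
    with sqlite3.connect(db_path) as conn:
        return conn.execute(QUERY, {"cursor": cursor_ts}).fetchall()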

> We already have the Event table, on which we can apply filters of this type (timestamp == something && cpu == otherthing...). But we only show the results in the same flat list. I'm not sure what would be best here. Showing results in a separate view, or supporting sub-trees in the base Event table?
> 
> 
> Cheers,
> Alexandre



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
  2015-02-10  4:55               ` Wang Nan
  2015-02-10  8:02                 ` Alexandre Montplaisir
@ 2015-02-10  8:02                 ` Alexandre Montplaisir
  1 sibling, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-10  8:02 UTC (permalink / raw)
  To: Wang Nan
  Cc: Xinwei Hu, diamon-discuss, lttng-dev, Geneviève Bastien,
	tracecompass developer discussions

On 2015-02-09 11:55 PM, Wang Nan wrote:
> On 2015/2/10 4:13, Alexandre Montplaisir wrote:
>> [...]
>>
>> So in this example we could see right away that the average duration is around 10-15 ms, but we have some outliers that took more than 20 ms, so it would be interesting to zoom in there to get more details. And eventually clicking on one particular occurrence could bring you to the exact point in the trace where it happened (or create bookmarks for the outlier entries, etc.)
>>
> Looks good. It should be better if users are able to define their own matching rules.

Absolutely! We can have pre-defined rules for the known domain (system 
calls, IRQs, and so on), but then it'd be easy to let the user enter an 
event name for the "start" and an event name for the "end".

I was also thinking, we could even use general filters to define what is 
a start and an end event. It would require some refactoring due to the 
way filters are organized at the moment, so it wouldn't be trivial, but 
eventually we may want to get there.
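
(Conceptually the user-defined rule is tiny; a toy Python sketch, with
events assumed to be dicts sorted by timestamp and a user-supplied key
to pair each "start" with its "end":)

def match_intervals(events, start_name, end_name, key):
    # Generic pairing: a user-chosen "start" event with the next "end"
    # event sharing the same key (tid, cpu, irq number, ...).
    open_starts, intervals = {}, []
    for ev in events:
        if ev["name"] == start_name:
            open_starts[key(ev)] = ev
        elif ev["name"] == end_name:
            start = open_starts.pop(key(ev), None)
            if start is not None:
                intervals.append((start["ts"], ev["ts"], ev["ts"] - start["ts"]))
    return intervals

# e.g. match_intervals(trace, "irq_handler_entry", "irq_handler_exit",
#                      key=lambda ev: (ev["cpu"], ev["irq"]))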

I'm currently working on the side on a prototype of the "event matching" 
table view, I will let you know once I have an initial version working, 
so we can see if/how useful it is.

>
> In addition, do you think about chaining more than 2 events together? Some procedures, such as IO and network, can be traced at different layers, and we are able to chain events caused by same sources together with some cules. Please refer to:
>
> http://comments.gmane.org/gmane.linux.kernel/1883569
>
> Which we are working on.
>

Geneviève (added in CC) is currently working on virtual machine 
analyses, where we can match events from a kernel trace on a host to a 
kernel trace in a guest, and so on. The "layering" and general matching 
of events should be similar. Maybe she has some ideas on whether the
event matching part could be reused somehow?


Cheers,
Alexandre

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-02-10  8:02 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-23  9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 16:56   ` [diamon-discuss] [lttng-dev] " Geneviève Bastien
2015-01-23 16:56   ` [diamon-discuss] " Geneviève Bastien
2015-01-31  7:14   ` Wang Nan
2015-02-04 20:24     ` Alexandre Montplaisir
2015-02-07  3:13       ` Wang Nan
2015-02-07  5:14         ` Alexandre Montplaisir
2015-02-07  7:25           ` Wang Nan
2015-02-09 20:13             ` Alexandre Montplaisir
2015-02-09 20:13             ` Alexandre Montplaisir
2015-02-10  4:55               ` Wang Nan
2015-02-10  4:55               ` Wang Nan
2015-02-10  8:02                 ` Alexandre Montplaisir
2015-02-10  8:02                 ` Alexandre Montplaisir
2015-02-07  7:25           ` Wang Nan
2015-02-07  5:14         ` Alexandre Montplaisir
2015-02-07  3:13       ` Wang Nan
2015-02-04 20:24     ` Alexandre Montplaisir
2015-02-07 13:22     ` Mathieu Desnoyers
2015-02-07 13:22     ` Mathieu Desnoyers
2015-01-31  7:14   ` Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 20:12 ` Alexandre Montplaisir
2015-01-23 20:12 ` Alexandre Montplaisir
2015-01-26  4:08   ` Wang Nan
2015-01-26  4:08   ` Wang Nan
2015-01-26 16:25 ` Jérémie Galarneau
2015-01-27  0:54   ` Wang Nan
2015-01-27  0:54   ` Wang Nan
2015-01-27  5:31     ` Jérémie Galarneau
2015-01-27  5:31     ` Jérémie Galarneau
2015-01-27  6:31       ` Wang Nan
2015-01-27  6:31       ` Wang Nan
2015-01-26 16:25 ` Jérémie Galarneau

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.