* [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
@ 2015-01-23 9:35 Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
` (5 more replies)
0 siblings, 6 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-23 9:35 UTC (permalink / raw)
To: diamon-discuss, lttng-dev
Hi folks,
I'd like to share my first tuning experience with perf, CTF and
TraceCompass here, and I hope it will be helpful to diamon.org. Most of
this mail describes my work; if you are not interested in the details,
you can jump directly to the conclusion.
*My Task*
What I'm working on is finding out why the CPU idle rate is high while
we benchmark a database. I thought it would be a very simple task:
trace scheduling and system calls, find the last syscall issued before
each idle period, and then, based on statistics and some userspace call
stacks, give an answer. I use perf to collect the trace, the perf CTF
conversion to get CTF output, and TraceCompass for visualization.
*My Experience*
First of all I use perf to collect trace:
# perf record -a -e sched:* -e raw_syscalls:* sleep 1
then
# perf data convert --to-ctf out.ctf
which is simple. However, the raw_syscalls:* tracepoints export less
information than the syscalls:* tracepoints: without the latter I have
to map syscall IDs to syscall names manually. I would prefer to use:
# perf record -a -e sched:* -e syscalls:* sleep 1
However, there are some bugs, and I had to write some patches. They
have been posted and are currently under discussion; the bugs still
exist upstream.
Then I need to convert perf.data to CTF. It took 140.57s to convert
2,598,513 samples, which were collected during only 1 second of
execution. My working server has 64 2.0GHz Intel Xeon cores, but the
perf conversion utilizes only one of them. I think this is another
thing that can be improved.
The next step is visualization. The output CTF trace can be opened in
TraceCompass without problems. The most important views for me are the
resources view (which I use to check CPU usage) and the control flow
view (which I use to check thread activity).
The first uncomfortable thing is TraceCompass's slow response time. For
the trace mentioned above, in the resources view, after I click on a
CPU idle area, I have to wait more than 10 seconds for the event list
to update before I can see the event preceding the idle area.
Then I found, through the resources view, that perf itself takes a lot
of CPU time. In my case, 33.5% of the samples were generated by perf
itself; one core is dedicated to perf and is never idle or used by
other tasks. I think this is another thing that needs improvement: perf
should provide a way to blacklist itself when tracing all CPUs.
TraceCompass doesn't recognize syscalls:* tracepoints as CPU state
change points, so I also have to capture raw_syscalls:*, which doubles
the number of samples.
Finally, I found the syscall that causes the idle. However, I needed to
write a script to do the statistics; TraceCompass itself lacks a means
of counting different events in my own way.
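That ad-hoc statistics script essentially counts, per CPU, which syscall entry immediately precedes each idle transition. A rough sketch over a flat event list; the tuples and event names below are simplified stand-ins for the real trace records:

```python
from collections import Counter

# (timestamp, cpu, event_name) -- simplified stand-ins for trace events.
events = [
    (100, 0, "sys_enter_futex"),
    (110, 0, "sched_switch_to_idle"),
    (120, 1, "sys_enter_read"),
    (125, 1, "sched_switch_to_idle"),
    (200, 0, "sys_enter_futex"),
    (210, 0, "sched_switch_to_idle"),
]

def syscalls_before_idle(events):
    """Count which syscall was last entered on each CPU before it went idle."""
    last_syscall = {}  # cpu -> most recently entered syscall name
    counts = Counter()
    for ts, cpu, name in sorted(events):
        if name.startswith("sys_enter_"):
            last_syscall[cpu] = name
        elif name == "sched_switch_to_idle" and cpu in last_syscall:
            counts[last_syscall[cpu]] += 1
    return counts

print(syscalls_before_idle(events).most_common())
# [('sys_enter_futex', 2), ('sys_enter_read', 1)]
```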
The next thing to do is to find the call trace that issues the syscall.
This time TraceCompass can't help, mostly because the perf conversion
does not yet support converting call traces.
*Conclusion*
I suggest that perf and TraceCompass consider the following improvements:
1. Reduce the cost of perf recording. In my case, one third of the
   events were generated by perf itself. Could perf provide the ability
   to blacklist itself while collecting all other events?
2. Improve perf conversion performance. Converting perf.data to CTF is
   slow, but it is usually done offline. We could utilize the many
   cores of a multi-core server to make it work in parallel.
3. Improve TraceCompass's response time, especially when synchronizing
   different views.
4. Support converting userspace call traces. I think the perf side
   already has a plan for this.
5. Ad-hoc visualization and statistics. Currently TraceCompass only
   supports drawing pre-defined events and processes. When I capture
   syscalls:*, I get no benefit from TraceCompass because it doesn't
   know about them. I believe that during system tuning we will
   eventually reach places that could not be pre-defined by the
   TraceCompass designers. Giving users the ability to define their own
   events and models would therefore be very helpful.
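Until perf can exclude itself at record time, suggestion 1 can be approximated after the fact by dropping samples attributed to the tracer's own process. A minimal sketch; the dict fields (`comm`, `event`) are assumptions about the decoded record layout, not the real perf format:

```python
def drop_tracer_samples(samples, tracer_comm="perf"):
    """Filter out events generated by the tracer's own process."""
    return [s for s in samples if s.get("comm") != tracer_comm]

# Hypothetical decoded samples; one third come from perf itself.
samples = [
    {"comm": "mysqld", "event": "sys_enter_read"},
    {"comm": "perf",   "event": "sys_enter_write"},
    {"comm": "mysqld", "event": "sched_switch"},
]
kept = drop_tracer_samples(samples)
assert len(kept) == 2 and all(s["comm"] == "mysqld" for s in kept)
```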
Thank you.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-23 9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
@ 2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 20:12 ` Alexandre Montplaisir
` (3 subsequent siblings)
5 siblings, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-01-23 16:30 UTC (permalink / raw)
To: Wang Nan; +Cc: diamon-discuss, lttng-dev, tracecompass developer discussions
Hi Wang,
First of all, thank you very much for posting this use case. This is
exactly the type of user feedback that will help make the toolchain
better and more useful for users!
Some comments and questions below,
On 01/23/2015 04:35 AM, Wang Nan wrote:
> [...]
>
> Then I need to convert perf.data to ctf. It tooks 140.57s to convert
> 2598513 samples, which are collected during only 1 second execution. My
> working server has 64 2.0GHz Intel Xeon cores, but perf conversion
> utilizes only 1 of them. I think this is another thing can be improved.
Out of curiosity, approximately how big (in bytes) is the generated CTF
trace directory?
>
> The next step is visualization. Output ctf trace can be opened with
> TraceCompass without problem. The most important views for me should be
> resources view (I use them to check CPU usage) and control flow view (I
> use them to check thread activities).
>
> The first uncomfortable thing is TraceCompass' slow response time. For
> the trace I mentioned above, on resource view, after I click on CPU
> idle area, I have to wait more than 10 seconds for event list updating
> to get the previous event before the idle area.
Interesting. It is expected that opening a very large trace would take a
long time to load the first time, as everything gets indexed. But once
that step is done, seeking within the trace should be relatively quick
(O(log n) with respect to the trace size). In theory ;)
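The log-time seek described here is essentially a binary search over a checkpoint index built during the initial indexing pass. A toy illustration (the checkpoint values are made up):

```python
import bisect

# Checkpoint timestamps built once while indexing the trace
# (one checkpoint every N events in a real tool; values are made up).
checkpoints = [0, 1000, 2000, 3000, 4000]

def seek(ts):
    """Return the index of the last checkpoint at or before ts."""
    return bisect.bisect_right(checkpoints, ts) - 1

assert seek(2500) == 2   # resume reading events from checkpoint 2000
assert seek(0) == 0
```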
The perf-to-CTF conversion brings a completely new type of CTF traces
that was not seen before. It is possible that the CTF parser in Trace
Compass has some inefficiencies that were not exposed by other trace
types. Are you able to share that trace publicly? Or a trace taken in
the same environment, with no sensitive information in it? It could be
very helpful in finding such problems.
> Then I found through resources view that perf itself tooks lots of CPU
> time. In my case 33.5% samples are generated by perf itself. One core is
> dedicated to perf and never idle or taken by others. I think this should
> be another thing needs to be improved: perf should give a way to
> blacklist itself when tracing all CPUs.
I don't want to start a tracer-war here :) but have you investigated
using LTTng for recording syscall/sched events ? Compared to perf, LTTng
is only about "getting trace events", and is a bit more involved to set
up, but it is more focused on performance and minimizing the impact on
the traced applications. And it outputs in CTF format too.
I remember when testing the perf-CTF patches, comparing a perf trace to
an LTTng one, perf would be doing system calls continuously on one of
the CPUs for the whole duration of the trace. Whereas in LTTng traces,
the session daemon would be a bit active at the beginning and at the
end, but otherwise completely invisible in the trace.
> TraceCompass doesn't recognize syscall:* tracepoints as CPU status
> changing point. I have to also catch raw_syscall:*, and which doubles
> the number of samples.
This seems to be a gap in the definition of the analysis. I don't
remember implementing two types of "syscall" events in the perf
analysis, so it should just be a matter of getting the exact event name
and adding it to the list. I will take a look and keep you posted!
> Finally I found the syscall which cause idle. However I need to write a
> script to do statistics. TraceCompass itself is lack a mean to count
> different events in my way.
Could you elaborate on this please? I agree the "Statistics" view in TC
is severely lacking, we could be gathering and displaying much more
information. The only question is what information would actually be useful.
What exactly would you have liked to be able to see in the tool?
> [...]
>
>
> 5. Ad-Hoc visualization and statistics. Currently TraceCompass only
> support dwaring pre-defined events and processes. When I try to
> capture syscalls:*, I won't get benefit from TraceCompass because it
> doesn't know them. I believe that during system tuning we will
> finally get somewhere unable to be pre-defined by TraceCompass
> designer. Therefore give users abilities to define their own events
> and model should be much helpful.
As I mentioned earlier, the pre-defined "perf analysis" in Trace Compass
should be fixed to handle the syscall events.
But it's interesting that you mention wanting to add your own events and
model. I completely agree with you, we will never be able to predict
every and all use cases the users will want to use the tool for, so
there should be a way for the user to add their own.
Well good news, it *is* possible for the user to define their own
analysis and views! This is still undergoing a lot of development, and
there is no nice UI yet, which is why it is not really advertised. But
starting from any supported trace type, a user today can define a time
graph view (like the Resource View for example) or a XY chart, using a
data-driven XML syntax.
If you are curious, you can take a look at a full example of doing such
a thing on this page:
https://github.com/alexmonthy/ust-tc-example
(the example uses an LTTng UST trace as a source, but it could work with
any supported trace type, even a custom text trace defined in the UI).
>
> Thank you.
Thanks again for taking the time to write about your experience!
Cheers,
Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 16:56 ` [diamon-discuss] [lttng-dev] " Geneviève Bastien
@ 2015-01-23 16:56 ` Geneviève Bastien
2015-01-31 7:14 ` Wang Nan
2015-01-31 7:14 ` Wang Nan
3 siblings, 0 replies; 35+ messages in thread
From: Geneviève Bastien @ 2015-01-23 16:56 UTC (permalink / raw)
To: Wang Nan; +Cc: diamon-discuss, lttng-dev, tracecompass developer discussions
Hi Wang,
Thanks for sharing your experience. It's always useful to have
real-life use cases of the tools.
Alex has already given a quite complete answer.
I'll just add some information about your wish for ad-hoc visualization
and statistics. As Alex said, data-driven analysis using XML files is
already present in Trace Compass. Documentation on how to use it is
available here:
https://wiki.eclipse.org/Linux_Tools_Project/LTTng2/User_Guide#Data_driven_analysis
You can build your own analysis with any event type. The current
support is rather basic: you need to write the XML yourself, starting
from a template, and it supports only XY charts and time graph views.
Current work by students at Polytechnique on data-driven analysis
involves defining data-driven custom filters for events and views,
developing a visual UI to build an analysis from a state diagram, and
supporting more analysis use cases from the event data. I'm cc'ing
Naser Ezzati, who is currently working on the XML analysis. Among other
things, he has been working on custom statistics; I don't know the
status of this development, but he may point you to his development
branch, if it's ready, so you can see if it fits your current needs.
If you want some more details on the data-driven analysis work being
done at Poly right now, you can look at the presentations by Naser
Ezzati, Jean-Christian Kouamé and Simon Delisle on this page:
https://ahls.dorsal.polymtl.ca/dec2014. If you're interested in trying
out their [still experimental] work, let us know and we'll see if there
is an experimental working branch you could try.
Cheers,
Geneviève
On 01/23/2015 11:30 AM, Alexandre Montplaisir wrote:
> Hi Wang,
>
> First of all, thank you very much for posting this use case. This is
> exactly the type of user feedback that will help make the toolchain
> better and more useful for users!
>
> Some comments and questions below,
>
>
> On 01/23/2015 04:35 AM, Wang Nan wrote:
>> [...]
>>
>> Then I need to convert perf.data to ctf. It tooks 140.57s to convert
>> 2598513 samples, which are collected during only 1 second execution. My
>> working server has 64 2.0GHz Intel Xeon cores, but perf conversion
>> utilizes only 1 of them. I think this is another thing can be improved.
>
> Out of curiosity, approximately how big (in bytes) is the generated
> CTF trace directory?
>
>>
>> The next step is visualization. Output ctf trace can be opened with
>> TraceCompass without problem. The most important views for me should be
>> resources view (I use them to check CPU usage) and control flow view (I
>> use them to check thread activities).
>>
>> The first uncomfortable thing is TraceCompass' slow response time. For
>> the trace I mentioned above, on resource view, after I click on CPU
>> idle area, I have to wait more than 10 seconds for event list updating
>> to get the previous event before the idle area.
>
> Interesting. It is expected that opening a very large trace would take
> a long time to load the first time, as everything gets indexed. But
> once that step is done, seeking within the trace should be relatively
> quick ((log n) wrt to the trace size). In theory ;)
>
> The perf-to-CTF conversion brings a completely new type of CTF traces
> that was not seen before. It is possible that the CTF parser in Trace
> Compass has some inefficiencies that were not exposed by other trace
> types. Are you able to share that trace publicly? Or a trace taken in
> the same environment, with no sensible information in it? It could be
> very helpful in finding such problem.
>
>> Then I found through resources view that perf itself tooks lots of CPU
>> time. In my case 33.5% samples are generated by perf itself. One core is
>> dedicated to perf and never idle or taken by others. I think this should
>> be another thing needs to be improved: perf should give a way to
>> blacklist itself when tracing all CPUs.
>
> I don't want to start a tracer-war here :) but have you investigated
> using LTTng for recording syscall/sched events ? Compared to perf,
> LTTng is only about "getting trace events", and is a bit more involved
> to set up, but it is more focused on performance and minimizing the
> impact on the traced applications. And it outputs in CTF format too.
>
> I remember when testing the perf-CTF patches, comparing a perf trace
> to an LTTng one, perf would be doing system calls continuously on one
> of the CPUs for the whole duration of the trace. Whereas in LTTng
> traces, the session daemon would be a bit active at the beginning and
> at then end, but otherwise completely invisible from the trace.
>
>> TraceCompass doesn't recognize syscall:* tracepoints as CPU status
>> changing point. I have to also catch raw_syscall:*, and which doubles
>> the number of samples.
>
> This is a gap in the definition of the analysis it seems. I don't
> remember implementing two types of "syscall" events in the perf
> analysis, so it should just be a matter of getting the exact event
> name and adding it to the list. I will take a look and keep you posted!
>
>> Finally I found the syscall which cause idle. However I need to write a
>> script to do statistics. TraceCompass itself is lack a mean to count
>> different events in my way.
>
> Could you elaborate on this please? I agree the "Statistics" view in
> TC is severely lacking, we could be gathering and displaying much more
> information. The only question is what information would actually be
> useful.
>
> What exactly would you have liked to be able to see in the tool?
>
>> [...]
>>
>>
>> 5. Ad-Hoc visualization and statistics. Currently TraceCompass only
>> support dwaring pre-defined events and processes. When I try to
>> capture syscalls:*, I won't get benefit from TraceCompass
>> because it
>> doesn't know them. I believe that during system tuning we will
>> finally get somewhere unable to be pre-defined by TraceCompass
>> designer. Therefore give users abilities to define their own events
>> and model should be much helpful.
>
> As I mentioned earlier, the pre-defined "perf analysis" in Trace
> Compass should be fixed to handle the syscall events.
>
>
> But it's interesting that you mention wanting to add your own events
> and model. I completely agree with you, we will never be able to
> predict every and all use cases the users will want to use the tool
> for, so there should be a way for the user to add their own.
>
> Well good news, it *is* possible for the user to define their own
> analysis and views! This is still undergoing a lot of development, and
> there is no nice UI yet, which is why it is not really advertized. But
> starting from any supported trace type, a user today can define a time
> graph view (like the Resource View for example) or a XY chart, using a
> data-driven XML syntax.
>
> If you are curious, you can take a look at a full example of doing
> such a thing on this page:
> https://github.com/alexmonthy/ust-tc-example
> (the example uses an LTTng UST trace as a source, but it could work
> with any supported trace type, even a custom text trace defined in the
> UI).
>
>>
>> Thank you.
>
> Thanks again for taking the time to write about your experience!
>
> Cheers,
> Alexandre
>
>
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-23 9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 16:30 ` Alexandre Montplaisir
@ 2015-01-23 20:12 ` Alexandre Montplaisir
2015-01-23 20:12 ` Alexandre Montplaisir
` (2 subsequent siblings)
5 siblings, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-01-23 20:12 UTC (permalink / raw)
To: Wang Nan; +Cc: diamon-discuss, lttng-dev, tracecompass developer discussions
On 01/23/2015 04:35 AM, Wang Nan wrote:
> [...] I prefer to use:
>
> # perf record -a -e sched:* -e syscalls:* sleep 1
>
> However there are some bugs and I had to make some patches. They are
> posted and currently being discussed; those bugs still exist
> upstream.
Hmm, I can confirm this problem: when trying to convert a trace taken
with these events enabled, I get the following error:
Failed to add field 'nr
Failed to add event 'syscalls:sys_enter_io_submit'.
Good to know it's being looked at!
But when you say:
> TraceCompass doesn't recognize syscalls:* tracepoints as CPU status
> change points. I also have to capture raw_syscalls:*, which doubles
> the number of samples.
I assume this was with your patches applied, so that you could take a
trace of the syscalls:* tracepoints?
Do you have a link to those patches somewhere? Or just let us know when
they get integrated upstream. I was trying to update the analysis in
Trace Compass to handle syscalls:* too, but we can't really do it
without knowing what it will look like in CTF format ;)
Cheers,
Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-23 20:12 ` Alexandre Montplaisir
2015-01-26 4:08 ` Wang Nan
@ 2015-01-26 4:08 ` Wang Nan
1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-26 4:08 UTC (permalink / raw)
To: Alexandre Montplaisir
Cc: diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015/1/24 4:12, Alexandre Montplaisir wrote:
>
> On 01/23/2015 04:35 AM, Wang Nan wrote:
>> [...] I prefer to use:
>>
>> # perf record -a -e sched:* -e syscalls:* sleep 1
>>
>> However there are some bugs and I had to make some patches. They are
>> posted and currently being discussed; those bugs still exist
>> upstream.
>
> Hmm, I can confirm this problem: when trying to convert a trace taken with these events enabled, I get the following error:
>
> Failed to add field 'nr
> Failed to add event 'syscalls:sys_enter_io_submit'.
>
> Good to know it's being looked at!
>
>
> But when you say:
>
>> TraceCompass doesn't recognize syscalls:* tracepoints as CPU status
>> change points. I also have to capture raw_syscalls:*, which doubles
>> the number of samples.
>
> I assume this was with your patches applied, so that you could take a trace of the syscalls:* tracepoints?
>
> Do you have a link to those patches somewhere? Or just let us know when they get integrated upstream. I was trying to update the analysis in Trace Compass to handle syscalls:* too, but we can't really do it without knowing what it will look like in CTF format ;)
>
>
Hi,
You can follow this thread on LKML:
https://lkml.org/lkml/2015/1/20/330
In https://lkml.org/lkml/2015/1/22/24 I posted 2 RFC patches which can solve the problem.
However, I'm still waiting for a response from Steven Rostedt on whether he will allow
us to improve traceevent.
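Until such patches land, one workaround for the missing names in raw_syscalls:* data is to resolve syscall ids to names in post-processing. A minimal sketch: the lookup table here is an illustrative x86_64 subset (a real script would generate it from the kernel's unistd headers), and `syscall_name` is a hypothetical helper, not part of perf:

```python
# Illustrative subset of x86_64 syscall numbers; a complete table would be
# generated from the kernel's asm/unistd_64.h header.
X86_64_SYSCALLS = {
    0: "read",
    1: "write",
    3: "close",
    59: "execve",
    60: "exit",
}

def syscall_name(nr, table=X86_64_SYSCALLS):
    """Map a raw syscall number to a readable name, keeping the id
    visible when the table does not cover it."""
    return table.get(nr, "unknown_syscall_%d" % nr)
```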
> Cheers,
> Alexandre
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-23 9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions Wang Nan
` (4 preceding siblings ...)
2015-01-26 16:25 ` Jérémie Galarneau
@ 2015-01-26 16:25 ` Jérémie Galarneau
5 siblings, 0 replies; 35+ messages in thread
From: Jérémie Galarneau @ 2015-01-26 16:25 UTC (permalink / raw)
To: Wang Nan; +Cc: diamon-discuss, lttng-dev
On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
> Hi folks,
>
> I'd like to share my first tuning experience with perf, CTF and
> TraceCompass here, and I hope my experience is helpful to diamon.org.
> Most of this mail talks about my work. If you are not interested in
> it, you can jump directly to the conclusion.
>
> *My Task*
>
> What I'm working on is finding out why the CPU idle rate is high when
> we benchmark a database. I thought it would be a very simple task:
> trace scheduling and system calls, find the last syscall issued before
> each idle period, then, based on statistics, collect some user-space
> call stacks, and I can give an answer. I use perf to collect the
> trace, perf-convert-to-ctf to get CTF output, and TraceCompass for
> visualization.
>
>
> *My Experience*
>
> First of all, I use perf to collect a trace:
>
> # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>
> then
>
> # perf data convert --to-ctf out.ctf
>
> This is simple. However, the raw_syscalls:* tracepoints export less
> information than the syscalls:* tracepoints; without the latter I have
> to map syscall ids to syscall names manually. I prefer to use:
>
> # perf record -a -e sched:* -e syscalls:* sleep 1
>
> However there are some bugs and I had to make some patches. They are
> posted and currently being discussed; those bugs still exist
> upstream.
>
> Then I need to convert perf.data to CTF. It took 140.57s to convert
> 2598513 samples, which were collected during only 1 second of
> execution. My working server has 64 2.0GHz Intel Xeon cores, but perf
> conversion utilizes only 1 of them. I think this is another thing
> that can be improved.
Thanks for taking the time to do this write-up!
Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
Regards,
Jérémie
>
> The next step is visualization. The output CTF trace can be opened in
> TraceCompass without problems. The most important views for me are the
> Resources view (I use it to check CPU usage) and the Control Flow view
> (I use it to check thread activity).
>
> The first uncomfortable thing is TraceCompass's slow response time.
> For the trace mentioned above, in the Resources view, after I click on
> a CPU idle area I have to wait more than 10 seconds for the event list
> to update before I can see the event preceding the idle area.
>
> Then I found through the Resources view that perf itself takes a lot
> of CPU time. In my case 33.5% of the samples are generated by perf
> itself. One core is dedicated to perf and is never idle or taken by
> others. I think this is another thing that needs to be improved: perf
> should provide a way to blacklist itself when tracing all CPUs.
>
> TraceCompass doesn't recognize syscalls:* tracepoints as CPU status
> change points. I have to also capture raw_syscalls:*, which doubles
> the number of samples.
>
> Finally I found the syscall which causes the idle. However, I needed
> to write a script to do the statistics; TraceCompass itself lacks a
> means to count different events the way I need.
>
> The next thing I should do is find the call trace which issues the
> syscall. This time TraceCompass can't help, mostly because the perf
> conversion doesn't yet support converting call traces.
>
> *Conclusion*
>
> I suggest that perf and TraceCompass consider the following improvements:
>
> 1. Reducing the cost of perf recording. One third of the events are
> generated by perf itself in my case. Could perf provide a way to
> blacklist itself while collecting all other events?
>
> 2. Improving perf conversion performance. Converting perf.data to CTF
> is slow, but it can run offline most of the time. We could utilize
> the capabilities of multi-core servers to make it work in parallel.
>
> 3. Improving TraceCompass's response time, especially when
> synchronizing different views.
>
> 4. Supporting conversion of userspace call traces. I think the perf
> side already has a plan for this.
>
> 5. Ad-hoc visualization and statistics. Currently TraceCompass only
> supports drawing pre-defined events and processes. When I try to
> capture syscalls:*, I get no benefit from TraceCompass because it
> doesn't know those events. I believe that during system tuning we
> will eventually reach situations that the TraceCompass designers
> could not have pre-defined. Therefore, giving users the ability to
> define their own events and models would be very helpful.
>
> Thank you.
>
> _______________________________________________
> diamon-discuss mailing list
> diamon-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
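The ad-hoc statistics the quoted mail describes — finding which syscall was last issued on a CPU before it went idle, and counting occurrences — can be sketched in a short script. The `(cpu, name)` tuples and the "idle" event below are simplified stand-ins for the real trace fields (in practice, idle would be a sched_switch to the idle task):

```python
from collections import Counter

def count_syscalls_before_idle(events):
    """Given per-CPU (cpu, event_name) tuples in time order, count which
    syscall entry was the last one seen on each CPU before it went idle."""
    last_syscall = {}        # cpu -> most recent syscall entry event
    before_idle = Counter()  # syscall name -> times it preceded idle
    for cpu, name in events:
        if name.startswith("sys_enter_"):
            last_syscall[cpu] = name
        elif name == "idle" and cpu in last_syscall:
            before_idle[last_syscall[cpu]] += 1
    return before_idle
```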
--
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-26 16:25 ` Jérémie Galarneau
@ 2015-01-27 0:54 ` Wang Nan
2015-01-27 0:54 ` Wang Nan
1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-27 0:54 UTC (permalink / raw)
To: Jérémie Galarneau; +Cc: Xinwei Hu, diamon-discuss, lttng-dev
On 2015/1/27 0:25, Jérémie Galarneau wrote:
> On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
>> Hi folks,
>>
>> I'd like to share my first tuning experience with perf, CTF and
>> TraceCompass here, and I hope my experience is helpful to diamon.org.
>> Most of this mail talks about my work. If you are not interested in
>> it, you can jump directly to the conclusion.
>>
>> *My Task*
>>
>> What I'm working on is finding out why the CPU idle rate is high when
>> we benchmark a database. I thought it would be a very simple task:
>> trace scheduling and system calls, find the last syscall issued before
>> each idle period, then, based on statistics, collect some user-space
>> call stacks, and I can give an answer. I use perf to collect the
>> trace, perf-convert-to-ctf to get CTF output, and TraceCompass for
>> visualization.
>>
>>
>> *My Experience*
>>
>> First of all, I use perf to collect a trace:
>>
>> # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>>
>> then
>>
>> # perf data convert --to-ctf out.ctf
>>
>> This is simple. However, the raw_syscalls:* tracepoints export less
>> information than the syscalls:* tracepoints; without the latter I have
>> to map syscall ids to syscall names manually. I prefer to use:
>>
>> # perf record -a -e sched:* -e syscalls:* sleep 1
>>
>> However there are some bugs and I had to make some patches. They are
>> posted and currently being discussed; those bugs still exist
>> upstream.
>>
>> Then I need to convert perf.data to CTF. It took 140.57s to convert
>> 2598513 samples, which were collected during only 1 second of
>> execution. My working server has 64 2.0GHz Intel Xeon cores, but perf
>> conversion utilizes only 1 of them. I think this is another thing
>> that can be improved.
>
> Thanks for taking the time to do this write-up!
>
> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
>
> Regards,
> Jérémie
>
We should, but I don't have enough time to do this. In addition, I think the perf-to-CTF
conversion can be parallelized without too much work. I suggest considering code tuning
after we have done that on the perf side. What do you think?
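A rough sketch of that parallelization idea: split the samples into chunks, convert the chunks on a worker pool, and merge the results back into timestamp order. `convert_sample` is a hypothetical stand-in for the real per-sample conversion work, and threads keep the sketch simple; an actual converter (C code) would use worker threads or processes, and the writer side would also need to be thread-safe:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain

def convert_sample(sample):
    # Hypothetical stand-in for the real perf-sample-to-CTF-event work.
    ts, name = sample
    return {"ts": ts, "name": name}

def convert_chunk(chunk):
    # Each worker converts one independent chunk of samples.
    return [convert_sample(s) for s in chunk]

def parallel_convert(samples, workers=4, chunk_size=1000):
    """Convert samples chunk-by-chunk on a worker pool, then merge the
    per-chunk results back into global timestamp order."""
    chunks = [samples[i:i + chunk_size]
              for i in range(0, len(samples), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        converted = pool.map(convert_chunk, chunks)
    return sorted(chain.from_iterable(converted), key=lambda e: e["ts"])
```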
>>
>> The next step is visualization. The output CTF trace can be opened in
>> TraceCompass without problems. The most important views for me are the
>> Resources view (I use it to check CPU usage) and the Control Flow view
>> (I use it to check thread activity).
>>
>> The first uncomfortable thing is TraceCompass's slow response time.
>> For the trace mentioned above, in the Resources view, after I click on
>> a CPU idle area I have to wait more than 10 seconds for the event list
>> to update before I can see the event preceding the idle area.
>>
>> Then I found through the Resources view that perf itself takes a lot
>> of CPU time. In my case 33.5% of the samples are generated by perf
>> itself. One core is dedicated to perf and is never idle or taken by
>> others. I think this is another thing that needs to be improved: perf
>> should provide a way to blacklist itself when tracing all CPUs.
>>
>> TraceCompass doesn't recognize syscalls:* tracepoints as CPU status
>> change points. I have to also capture raw_syscalls:*, which doubles
>> the number of samples.
>>
>> Finally I found the syscall which causes the idle. However, I needed
>> to write a script to do the statistics; TraceCompass itself lacks a
>> means to count different events the way I need.
>>
>> The next thing I should do is find the call trace which issues the
>> syscall. This time TraceCompass can't help, mostly because the perf
>> conversion doesn't yet support converting call traces.
>>
>> *Conclusion*
>>
>> I suggest that perf and TraceCompass consider the following improvements:
>>
>> 1. Reducing the cost of perf recording. One third of the events are
>> generated by perf itself in my case. Could perf provide a way to
>> blacklist itself while collecting all other events?
>>
>> 2. Improving perf conversion performance. Converting perf.data to CTF
>> is slow, but it can run offline most of the time. We could utilize
>> the capabilities of multi-core servers to make it work in parallel.
>>
>> 3. Improving TraceCompass's response time, especially when
>> synchronizing different views.
>>
>> 4. Supporting conversion of userspace call traces. I think the perf
>> side already has a plan for this.
>>
>> 5. Ad-hoc visualization and statistics. Currently TraceCompass only
>> supports drawing pre-defined events and processes. When I try to
>> capture syscalls:*, I get no benefit from TraceCompass because it
>> doesn't know those events. I believe that during system tuning we
>> will eventually reach situations that the TraceCompass designers
>> could not have pre-defined. Therefore, giving users the ability to
>> define their own events and models would be very helpful.
>>
>> Thank you.
>>
>> _______________________________________________
>> diamon-discuss mailing list
>> diamon-discuss@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
>
>
>
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggestions.
2015-01-27 0:54 ` Wang Nan
@ 2015-01-27 5:31 ` Jérémie Galarneau
2015-01-27 5:31 ` Jérémie Galarneau
1 sibling, 0 replies; 35+ messages in thread
From: Jérémie Galarneau @ 2015-01-27 5:31 UTC (permalink / raw)
To: Wang Nan; +Cc: Xinwei Hu, diamon-discuss, lttng-dev
On Mon, Jan 26, 2015 at 7:54 PM, Wang Nan <wangnan0@huawei.com> wrote:
> On 2015/1/27 0:25, Jérémie Galarneau wrote:
>> On Fri, Jan 23, 2015 at 4:35 AM, Wang Nan <wangnan0@huawei.com> wrote:
>>> Hi folks,
>>>
>>> I'd like to share my first tuning experience with perf, CTF and
>>> TraceCompass here, and I hope my experience is helpful to diamon.org.
>>> Most of this mail talks about my work. If you are not interested in
>>> it, you can jump directly to the conclusion.
>>>
>>> *My Task*
>>>
>>> What I'm working on is finding out why the CPU idle rate is high when
>>> we benchmark a database. I thought it would be a very simple task:
>>> trace scheduling and system calls, find the last syscall issued before
>>> each idle period, then, based on statistics, collect some user-space
>>> call stacks, and I can give an answer. I use perf to collect the
>>> trace, perf-convert-to-ctf to get CTF output, and TraceCompass for
>>> visualization.
>>>
>>>
>>> *My Experience*
>>>
>>> First of all, I use perf to collect a trace:
>>>
>>> # perf record -a -e sched:* -e raw_syscalls:* sleep 1
>>>
>>> then
>>>
>>> # perf data convert --to-ctf out.ctf
>>>
>>> This is simple. However, the raw_syscalls:* tracepoints export less
>>> information than the syscalls:* tracepoints; without the latter I have
>>> to map syscall ids to syscall names manually. I prefer to use:
>>>
>>> # perf record -a -e sched:* -e syscalls:* sleep 1
>>>
>>> However there are some bugs and I had to make some patches. They are
>>> posted and currently being discussed; those bugs still exist
>>> upstream.
>>>
>>> Then I need to convert perf.data to CTF. It took 140.57s to convert
>>> 2598513 samples, which were collected during only 1 second of
>>> execution. My working server has 64 2.0GHz Intel Xeon cores, but perf
>>> conversion utilizes only 1 of them. I think this is another thing
>>> that can be improved.
>>
>> Thanks for taking the time to do this write-up!
>>
>> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
>>
>> Regards,
>> Jérémie
>>
>
> We should, but I don't have enough time to do this. In addition, I think perf-to-ctf
> conversion is possible to be parallelized without too much work. I suggest to consider
> code tuning after we done that at perf side. What do you think?
I'm not sure I agree with the "without too much work" statement, since
Babeltrace's CTF Writer API is not thread-safe at the moment. As
Alexandre previously mentioned, the size of the resulting CTF trace
could be a good indication of whether or not IO is the culprit here.
We have never done any optimization work on Babeltrace and CTF Writer.
We could be smarter on a number of fronts, notably with regards to
memory allocation; there is currently no pooling of allocated objects
(events, fields, types, etc.). We'll have to look at hard data
before getting our hands dirty. However, I'm sure there are plenty of
optimization opportunities.
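The parallelization idea raised above could be prototyped by sharding samples per CPU into worker processes, each producing its own output stream, which also sidesteps the thread-safety concern. A rough sketch; `convert_sample` is a stand-in for the real per-event serialization, not the actual perf or CTF Writer API:

```python
# Sketch of per-CPU parallel conversion: shard samples by CPU and let each
# worker "convert" its shard independently, as one stream per CPU.
from collections import defaultdict
from multiprocessing import Pool

def convert_sample(sample):
    # Placeholder for the real per-event serialization work.
    return f"{sample['time']} cpu{sample['cpu']} {sample['event']}"

def convert_shard(shard):
    # One worker converts one CPU's samples, preserving their order.
    return [convert_sample(s) for s in shard]

def parallel_convert(samples, workers=4):
    shards = defaultdict(list)
    for s in samples:  # shard by CPU, keeping per-CPU event order
        shards[s["cpu"]].append(s)
    with Pool(workers) as pool:
        return dict(zip(shards.keys(), pool.map(convert_shard, shards.values())))
```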
Jérémie
--
Jérémie Galarneau
EfficiOS Inc.
http://www.efficios.com
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-01-27 5:31 ` Jérémie Galarneau
2015-01-27 6:31 ` Wang Nan
@ 2015-01-27 6:31 ` Wang Nan
1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-27 6:31 UTC (permalink / raw)
To: Jérémie Galarneau; +Cc: Xinwei Hu, diamon-discuss, lttng-dev
On 2015/1/27 13:31, Jérémie Galarneau wrote:
> On Mon, Jan 26, 2015 at 7:54 PM, Wang Nan <wangnan0@huawei.com> wrote:
>> On 2015/1/27 0:25, Jérémie Galarneau wrote:
>>>
>>> Thanks for taking the time to do this write-up!
>>>
>>> Would it be possible to profile perf-to-ctf so we can spot the bottleneck?
>>>
>>> Regards,
>>> Jérémie
>>>
>>
>> We should, but I don't have enough time to do this. In addition, I think the perf-to-ctf
>> conversion can be parallelized without too much work. I suggest considering
>> code tuning after we have done that on the perf side. What do you think?
>
> I'm not sure I agree with the "without too much work" statement, since
> Babeltrace's CTF Writer API is not thread-safe at the moment. As
> Alexandre previously mentioned, the size of the resulting CTF trace
> could be a good indication of whether or not IO is the culprit here.
>
> We have never done any optimization work on Babeltrace and CTF Writer.
> We could be smarter on a number of fronts, notably with regards to
> memory allocation; there is currently no pooling of allocated objects
> (events, fields, types, etc.). We'll have to look at hard data
> before getting our hands dirty. However, I'm sure there are plenty of
> optimization opportunities.
>
> Jérémie
>
Agreed. I'll try to tune it when I have time, and then we can share another
tuning experience.
Thank you.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 16:56 ` [diamon-discuss] [lttng-dev] " Geneviève Bastien
2015-01-23 16:56 ` [diamon-discuss] " Geneviève Bastien
@ 2015-01-31 7:14 ` Wang Nan
2015-02-04 20:24 ` Alexandre Montplaisir
` (3 more replies)
2015-01-31 7:14 ` Wang Nan
3 siblings, 4 replies; 35+ messages in thread
From: Wang Nan @ 2015-01-31 7:14 UTC (permalink / raw)
To: Alexandre Montplaisir
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>
>> Finally I found the syscall which causes the idle. However, I needed to write a
>> script to do statistics: TraceCompass itself lacks a means to count
>> events the way I need.
>
> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
I'd like to describe some cases of ad-hoc statistics for which I had to write
Python scripts.
*First case: matching sys_enter and sys_exit*
The first case is finding out why most of the CPUs are idle. In the TraceCompass
resources view, I see gray gaps of about 300us. During these gaps there is only
one running CPU; all the others are idle. I can find the reason a particular CPU
is idle using TraceCompass with the following steps:
1. In the TraceCompass resources view, click the idle gap of that CPU, then press
the 'Select Next Event' button repeatedly until a 'raw_syscalls:sys_exit' event
appears; the 'id' field then tells me which syscall left the CPU idle. (As I
mentioned before, in my case each click of that button means waiting 8 to 10
seconds for the event table to update before I know which event it is. This is
painful...)
2. Then I need to find the corresponding "raw_syscalls:sys_enter" event to see when
the syscall was issued. I switch to the control flow view, use 'Select Previous
Event' to find it, then go back to the resources view. From there I can see how long
the syscall takes, whether the CPU does some work or simply idles after the syscall
is issued, and whether the task is migrated across CPUs.
3. Repeat steps 1 and 2 for each CPU.
In some high-end servers the number of cores exceeds 100; even in my case 32 CPUs
were traced, so doing this search by hand is time-consuming, and I had to write a
Python script for it. My result: half of the cores were waiting on different
futexes, and the other half were waiting in pselect() (caused by sleep()).
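The manual walk in steps 1 and 2 can be sketched as a single pass over one CPU's event stream (a simplified model: real traces also need per-thread matching and cross-CPU migration handled):

```python
# For one CPU/thread's event stream, pair each raw_syscalls:sys_enter with
# its sys_exit and report which syscall spans a given idle window -- the
# manual "Select Next/Previous Event" walk, done in one pass.
def blocking_syscall(events, idle_start, idle_end):
    """events: list of (time, kind, syscall_id), kind in {'enter','exit'},
    sorted by time. Returns the id of the syscall whose enter/exit pair
    covers [idle_start, idle_end], or None."""
    pending = None  # (enter_time, id) of the syscall currently in flight
    for time, kind, sid in events:
        if kind == "enter":
            pending = (time, sid)
        elif kind == "exit" and pending is not None:
            enter_time, pending_id = pending
            if enter_time <= idle_start and time >= idle_end and pending_id == sid:
                return sid
            pending = None
    return None
```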
*Second case: matching futex WAKE and WAIT*
The next case I'd like to share is matching futex wait, wake, and woken events.
This time TraceCompass can't help much, but the Python script is also not easy to
design: I have to track CPU and process state transitions myself, match all futex
sys_enter and sys_exit events, handle cases such as FUTEX_WAKE before FUTEX_WAIT,
failures and timeouts, and compare the return value of a futex wake against the
number of threads it actually woke. This throwaway Python script is 115 lines of
code (I admit I'm not a very good Python programmer), and creating and debugging
it took several hours.
My final result: threads wake each other up in a tree-like manner. The first futex WAKE command is
issued by the only running CPU and wakes only one thread. That thread wakes others, those wake more, and finally
nearly all CPUs are woken up. Some threads only get to run a relatively long time
after the corresponding futex WAKE command, even though there are idle CPUs at that time. Therefore we
should look into the scheduler. However, the gap itself should be a normal phenomenon.
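The core of the matching the script does can be sketched roughly as follows (a much-simplified illustration with invented event tuples and helper names, not the actual 115-line script, which must also track process state and the WAKE-before-WAIT, failure and timeout cases):

```python
# Sketch of pairing FUTEX_WAKE calls with the waiters they release,
# over made-up (timestamp, tid, op, uaddr, retval) tuples taken from
# the futex sys_exit events.
FUTEX_WAIT, FUTEX_WAKE = 0, 1

def match_wakeups(events):
    """Pair each FUTEX_WAKE with the waiters it released on the same futex word."""
    waiters = {}     # uaddr -> tids currently blocked in FUTEX_WAIT
    wake_edges = []  # (waker_tid, woken_tid) edges of the wake-up tree
    for ts, tid, op, uaddr, ret in sorted(events):
        if op == FUTEX_WAIT:
            waiters.setdefault(uaddr, []).append(tid)
        elif op == FUTEX_WAKE:
            queue = waiters.get(uaddr, [])
            # the WAKE's return value says how many threads it actually woke
            woken, waiters[uaddr] = queue[:ret], queue[ret:]
            wake_edges += [(tid, w) for w in woken]
    return wake_edges

# Thread 1 wakes thread 2; thread 2 then wakes 3 and 4 -- the tree-like
# pattern observed in the trace.
evs = [(1, 2, FUTEX_WAIT, 0xA, 0), (2, 3, FUTEX_WAIT, 0xB, 0),
       (3, 4, FUTEX_WAIT, 0xB, 0), (4, 1, FUTEX_WAKE, 0xA, 1),
       (5, 2, FUTEX_WAKE, 0xB, 2)]
print(match_wakeups(evs))  # [(1, 2), (2, 3), (2, 4)]
```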
*Third case: device utilization*
We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
I deployed several tracepoints in the device driver to track device activity. However, I had to create
some tools to draw cumulative and throughput curves to find out whether there are irregular idle periods. I use
gnuplot for plotting, but had to write another python script to extract the data. I think languages like
R should be useful in this case, but I'm not familiar with it.
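The data extraction behind such curves can be sketched like this (an illustrative script with made-up samples, not the actual tool; the column layout is chosen to suit gnuplot's `using` syntax):

```python
# Turn (timestamp_sec, bytes_written) records -- hypothetically parsed
# from the driver tracepoints -- into gnuplot columns:
#   time  cumulative_bytes  throughput_bytes_per_sec

def to_gnuplot_columns(samples, window=1.0):
    """Emit one output line per `window` seconds of trace time."""
    lines, total, win_start, win_bytes = [], 0, None, 0
    for ts, nbytes in sorted(samples):
        total += nbytes
        if win_start is None:
            win_start = ts
        win_bytes += nbytes
        if ts - win_start >= window:
            lines.append(f"{ts:.3f} {total} {win_bytes / (ts - win_start):.1f}")
            win_start, win_bytes = ts, 0
    return lines

data = [(0.0, 0), (0.5, 4096), (1.0, 4096), (1.5, 0), (2.0, 8192)]
for line in to_gnuplot_columns(data):
    print(line)
```

The resulting file can then be plotted with something like `plot 'out.dat' using 1:2 with lines, '' using 1:3 with lines`; a flat stretch in column 2 is exactly the irregular idle period we were looking for.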
*Conclusion*
In this email I list 3 use cases related to the ad-hoc statistics I mentioned earlier. Cases 1 and 2 are
in fact not statistics problems; they should be considered query problems. I suspect cases
1 and 2 can be expressed in SQL. If TraceCompass provided a query language like SQL, we could quickly
find the information we need and so would have more time for tuning. I described my SQL-like query
idea in one of my earlier emails:
http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
However, at the time I was not sure what query problems we would actually meet.
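For what it's worth, case 1 really does map onto a single query once the events sit in a table. A toy sketch of the idea using Python's sqlite3 with a handful of invented rows (this is not an existing TraceCompass feature):

```python
import sqlite3

# Hypothetical schema: one row per raw_syscalls event.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (ts INT, cpu INT, name TEXT, id INT)")
db.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", [
    (100, 0, "sys_enter", 202), (120, 0, "sys_exit", 202),
    (130, 0, "sys_enter", 202), (110, 1, "sys_enter", 270),
])

# "Last sys_enter per CPU with no later sys_exit" -- the syscall each CPU
# is blocked in; this one query replaces the click-by-click search.
rows = db.execute("""
    SELECT e.cpu, e.id, MAX(e.ts)
    FROM events e
    WHERE e.name = 'sys_enter'
      AND NOT EXISTS (SELECT 1 FROM events x
                      WHERE x.cpu = e.cpu AND x.name = 'sys_exit'
                        AND x.ts > e.ts)
    GROUP BY e.cpu
""").fetchall()
print(sorted(rows))  # [(0, 202, 130), (1, 270, 110)]
```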
Case 3 requires a plotting tool like gnuplot or R. I don't know whether the TraceCompass designers want to
integrate such functionality, but at least TraceCompass should export data for those tools.
In cases 1 and 2, I spent a lot of time analyzing a phenomenon which finally turned out to be normal.
I think this is common in real performance analysis and tuning tasks. Many ideas only
appear after the first trace is collected. I think tools like TraceCompass should consider a
way to reduce the cost of trial and error.
Thank you for reading this!
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-01-31 7:14 ` Wang Nan
2015-02-04 20:24 ` Alexandre Montplaisir
@ 2015-02-04 20:24 ` Alexandre Montplaisir
2015-02-07 13:22 ` Mathieu Desnoyers
2015-02-07 13:22 ` Mathieu Desnoyers
3 siblings, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-04 20:24 UTC (permalink / raw)
To: Wang Nan
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
Hi Wang,
Once again, thanks a lot for this *very* valuable feedback! A good tool
should be able to help in real-world use cases, like the ones you described.
Right now, Trace Compass is focused on showing information on the time
scale. Views like the Control Flow and Resource Views are useful when
you want to look at a particular location of a trace, and want to know
exactly what the system is doing. However, when you are looking at a
large trace completely zoomed-out, it's not extremely useful. It should
be able to point out the "points of interest" in this trace, where the
user can then zoom in and make use of the time-scale views.
We've been having many discussions internally about adding views on a
frequency scale, rather than a time scale. For example, a view that
could aggregate the latency of system calls (or IRQ handlers, or IO
operations, etc.), and show them either as a histogram or as a point
cloud, so that it becomes easy to see the "outliers" of the
distribution. The user could then click on these outliers to be brought
to the location in the trace where they happened.
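For illustration, the outlier detection behind such a frequency view can be as simple as a cutoff on the latency distribution (a sketch in Python rather than Trace Compass's Java, with invented names and data):

```python
from statistics import mean, stdev

def latency_outliers(durations, nsigma=3.0):
    """Return (index, duration) pairs more than nsigma standard deviations
    above the mean -- the points such a view would let the user click on
    to jump to the corresponding location in the trace."""
    m, s = mean(durations), stdev(durations)
    return [(i, d) for i, d in enumerate(durations) if d > m + nsigma * s]

# 99 ordinary 10us syscalls and one 5ms straggler at index 42.
lat = [10.0] * 100
lat[42] = 5000.0
print(latency_outliers(lat))  # [(42, 5000.0)]
```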
These frequency views could provide the missing step of identifying
points of interest in a trace, after which the user could switch to the
time-based views to examine the highlighted locations in more detail.
I think such an approach could help in all 3 use cases you have
presented. What do you think?
Also, for known domains like the Linux kernel, we can provide pre-made
frequency statistics, like system call latencies per syscall type and so
on. However it would be important to keep it customizable, since we can
never predict all the use cases that the users will need. That would
probably be added a bit later though.
We don't have a defined roadmap for the "frequency analysis" right now,
but we will definitely try to have some working prototypes for our 1.0
release in June. I will keep you (and these mailing lists) posted!
Some more comments below.
On 01/31/2015 02:14 AM, Wang Nan wrote:
>> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>>
>>> Finally I found the syscall which cause idle. However I need to write a
>>> script to do statistics. TraceCompass itself is lack a mean to count
>>> different events in my way.
>> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
> I'd like to describe some cases of ad-hoc statisics, which I have to write python
> scripts to do.
>
> *First case: matching sys_enter and sys_exit*
>
> The first case is to find the reason why most of CPUs are idle. From TraceCompass
> resource view, I find some gray gaps for about 300us. During these gaps, there is only
> 1 running CPU, all other CPUs are idle. I can find the reason why a particular CPU
> is idle using TraceCompass with following steps:
>
> 1. In TraceCompass resource view, click the idle gap of that CPU, find next event
> with the 'Select Next Event' button, continous select next event until find
> a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find what syscall
> cause the CPU idle. (I have mentioned before, that in my case, each time when I click
> that button, I have to wait for 8 to 10 seconds for the trace table update so I can kown
> which event it is. This is painful for me...)
Yes, this was because the perf-CTF traces put all their content in one
big packet. I think this is being worked on, right?
As for the missing system call names, we have
https://bugs.eclipse.org/bugs/show_bug.cgi?id=453361 opened about it. It
should be possible to give the analysis knowledge of the syscall ids, so
that it becomes easier to identify the system call type.
>
> 2. Then I need to find corresponding "raw_syscalls:sys_enter" event to see when the syscall
> is issued. I switch to control flow view then use 'Select Previous Event' to find it, then
> back to resource view I can understand how long this syscall takes, whether the CPU
> does some work or simply idle after the syscall is issued, and whether the task is scheduled
> across CPUs.
>
> 3. For each CPU do step 1 and step 2.
>
> In some high-end servers the number of cores may exceeds 100. Even in my case the number of traced
> CPUs is 32. Doing such searching is time consuming. I have to write a python script to do that.
> My result is: half of cores are waiting on different futexs, half of then are waiting on
> pselect() (caused by sleep()).
>
> *Second case: matching futex WAKE and WAIT*
>
> Therefore the next case I'd like to share is to maching futex wait, futex wakeup and futex waken
> events. This time TraceCompass can't help much. However, the python script is also not very easy
> to design. I have to track CPU and process state transition by myself, match all futex sys_enter
> and sys_exit events, consider different cases including FUTEX_WAKE before FUTEX_WAIT, failures,
> timeout and compare retval of futex wake and the number of threads waken by it. This
> disposable python script has 115 lines of code (I have to admit that I'm not a very good python
> programmer), I create and debug it for serval hours.
>
> My final result is: threads wakeup each other in a tree-like manner. The first futex WAKE command is
> issued by the only running CPU, wakeup only one thread. It wakeups others, other wakeup more, finally
> nearly all CPUs are wakenup. There are some threads get executed after a relative long time
> after the corresponding futex WAKE command, even if there are idle CPUs at that time. Therefore we
> should look into scheduler. However, the gap itself should be a normal phenomenon.
>
> *Third case: device utilization*
>
> We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
> I deploy serval tracepoints at device driver to track device activities. However, I have to create
> some tools to draw cumulative curve and speed curve to find whether there is irregular idle. I use
> gnuplot for ploting, but have to write another python script to extrace data. I think languages like
> R should be useful in this case but I'm not familiary with it.
>
> *Conclusion*
>
> In this email I list 3 use cases related to Ad-Hoc statistics I mentioned earlier. Case 1 and 2 are
> in fact not a statistics problem. They should be considered as query problems. I suspect case
> 1 and 2 can be expressed using SQL. If TraceCompass can provide a query language like SQL, we can quickly
> find the information we need to know so will have more time for tuning. I expressed my SQL-like query
> idea on one of my early email:
>
> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>
> However I was not very sure the query problem we would meet.
>
> Case 3 requires a plotting tool like gnuplot or R. I don't know whether TraceCompass designers want to
> integrate such function, but at lease TraceCompass should export data for those tools.
Right now we use the SWTChart library (http://swtchart.org/) for
outputting graphs like the CPU Usage graph, the UST memory usage view, a
prototype of the Histogram, and a couple others. We had also thought
that we should provide an easy way to export the graphs, like
right-click -> export to PNG/SVG, etc.
But internally, what we provide to SWTChart is basically an array of
doubles. So it should be easy to have an internal abstraction layer
that can export either to an SWTChart view, or to a text file to be read
by R or Gnuplot and so on, if such a feature is deemed useful.
Cheers,
Alexandre
>
> In case 1 and 2, I spent a lot of time to analyze a phenomenon which is finally shown to be normal.
> I think this should be common in real performance analysis and tuning tasks. Many ideas may
> appear after the first trace is collected. I think tools like TraceCompass should consider a
> way to reduce the cost of trials and error.
>
> Thank you for reading this!
>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-04 20:24 ` Alexandre Montplaisir
2015-02-07 3:13 ` Wang Nan
@ 2015-02-07 3:13 ` Wang Nan
1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-07 3:13 UTC (permalink / raw)
To: Alexandre Montplaisir
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015/2/5 4:24, Alexandre Montplaisir wrote:
> Hi Wang,
>
> Once again, thanks a lot for this *very* valuable feedback! A good tool should be able to help in real-world use cases, like the ones you described.
>
> Right now, Trace Compass is focused on showing information on the time scale. Views like the Control Flow and Resource Views are useful when you want to look at a particular location of a trace, and want to know exactly what the system is doing. However, when you are looking at a large trace completely zoomed-out, it's not extremely useful. It should be able to point out the "points of interest" in this trace, where the user can then zoom in and make use of the time-scale views.
>
> We've been having many discussions internally about adding views on a frequency scale, rather than a time scale. For example, a view that could aggregate the latency of system calls (or IRQ handlers, or IO operations, etc.), and show them either as a histogram or as a point cloud, so that it becomes easy to see the "outliers" of the distribution. The user could then click on these outliers to be brought to the location in the trace where they happened.
>
> These frequency views could provide the missing step of identifying points of interest in a trace, after which the user could switch to the time-based views to examine the highlighted locations in more details.
>
> I think such an approach could help in all 3 use cases you have presented. What do you think?
>
Good to see that you are looking at this problem.
The "frequency analysis" you mentioned is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem, "finding the reason why most CPUs are idle by matching syscall events"?
Thank you!
> Also, for known domains like the Linux kernel, we can provide pre-made frequency statistics, like system call latencies per syscall type and so on. However it would be important to keep it customizable, since we can never predict all the use cases that the users will need. That would probably be added a bit later though.
>
> We don't have a defined roadmap for the "frequency analysis" right now, but we will definitely try to have some working prototypes for our 1.0 release in June. I will keep you (and these mailing lists) posted!
>
>
> Some more comments below.
>
>
> On 01/31/2015 02:14 AM, Wang Nan wrote:
>>> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>>>
>>>> Finally I found the syscall which cause idle. However I need to write a
>>>> script to do statistics. TraceCompass itself is lack a mean to count
>>>> different events in my way.
>>> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
>> I'd like to describe some cases of ad-hoc statisics, which I have to write python
>> scripts to do.
>>
>> *First case: matching sys_enter and sys_exit*
>>
>> The first case is to find the reason why most of CPUs are idle. From TraceCompass
>> resource view, I find some gray gaps for about 300us. During these gaps, there is only
>> 1 running CPU, all other CPUs are idle. I can find the reason why a particular CPU
>> is idle using TraceCompass with following steps:
>>
>> 1. In TraceCompass resource view, click the idle gap of that CPU, find next event
>> with the 'Select Next Event' button, continous select next event until find
>> a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find what syscall
>> cause the CPU idle. (I have mentioned before, that in my case, each time when I click
>> that button, I have to wait for 8 to 10 seconds for the trace table update so I can kown
>> which event it is. This is painful for me...)
>
> Yes, this was because the perf-CTF traces put all their content in one big packet. I think this is being worked on, right?
>
> As for the missing system call names, we have https://bugs.eclipse.org/bugs/show_bug.cgi?id=453361 opened about it. It should be possible to give knowledge of the id's to the analysis so that it makes it easier to identify the system call type.
>
>>
>> 2. Then I need to find corresponding "raw_syscalls:sys_enter" event to see when the syscall
>> is issued. I switch to control flow view then use 'Select Previous Event' to find it, then
>> back to resource view I can understand how long this syscall takes, whether the CPU
>> does some work or simply idle after the syscall is issued, and whether the task is scheduled
>> across CPUs.
>>
>> 3. For each CPU do step 1 and step 2.
>>
>> In some high-end servers the number of cores may exceeds 100. Even in my case the number of traced
>> CPUs is 32. Doing such searching is time consuming. I have to write a python script to do that.
>> My result is: half of cores are waiting on different futexs, half of then are waiting on
>> pselect() (caused by sleep()).
>>
>> *Second case: matching futex WAKE and WAIT*
>>
>> Therefore the next case I'd like to share is to maching futex wait, futex wakeup and futex waken
>> events. This time TraceCompass can't help much. However, the python script is also not very easy
>> to design. I have to track CPU and process state transition by myself, match all futex sys_enter
>> and sys_exit events, consider different cases including FUTEX_WAKE before FUTEX_WAIT, failures,
>> timeout and compare retval of futex wake and the number of threads waken by it. This
>> disposable python script has 115 lines of code (I have to admit that I'm not a very good python
>> programmer), I create and debug it for serval hours.
>>
>> My final result is: threads wakeup each other in a tree-like manner. The first futex WAKE command is
>> issued by the only running CPU, wakeup only one thread. It wakeups others, other wakeup more, finally
>> nearly all CPUs are wakenup. There are some threads get executed after a relative long time
>> after the corresponding futex WAKE command, even if there are idle CPUs at that time. Therefore we
>> should look into scheduler. However, the gap itself should be a normal phenomenon.
>>
>> *Third case: device utilization*
>>
>> We have a high-speed storage device, but writing to filesystems on it is not as fast as we expected.
>> I deploy serval tracepoints at device driver to track device activities. However, I have to create
>> some tools to draw cumulative curve and speed curve to find whether there is irregular idle. I use
>> gnuplot for ploting, but have to write another python script to extrace data. I think languages like
>> R should be useful in this case but I'm not familiary with it.
>>
>> *Conclusion*
>>
>> In this email I list 3 use cases related to Ad-Hoc statistics I mentioned earlier. Case 1 and 2 are
>> in fact not a statistics problem. They should be considered as query problems. I suspect case
>> 1 and 2 can be expressed using SQL. If TraceCompass can provide a query language like SQL, we can quickly
>> find the information we need to know so will have more time for tuning. I expressed my SQL-like query
>> idea on one of my early email:
>>
>> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>>
>> However I was not very sure the query problem we would meet.
>>
>> Case 3 requires a plotting tool like gnuplot or R. I don't know whether TraceCompass designers want to
>> integrate such function, but at lease TraceCompass should export data for those tools.
>
> Right now we use the SWTChart library (http://swtchart.org/) for outputting graphs like the CPU Usage graph, the UST memory usage view, a prototype of the Histogram, and a couple others. We had also thought that we should provide an easy way to export the graphs, like right-click -> export to PNG/SVG, etc.
>
> But internally, what we provide to SWTChart is basically an array of double's. So it should be easy to have an internal abstraction layer that can export either to an SWTChart view, or to a text file to be read by R or Gnuplot and so on, if such a feature is deemed useful.
>
>
> Cheers,
> Alexandre
>
>>
>> In case 1 and 2, I spent a lot of time to analyze a phenomenon which is finally shown to be normal.
>> I think this should be common in real performance analysis and tuning tasks. Many ideas may
>> appear after the first trace is collected. I think tools like TraceCompass should consider a
>> way to reduce the cost of trials and error.
>>
>> Thank you for reading this!
>>
>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-04 20:24 ` Alexandre Montplaisir
@ 2015-02-07 3:13 ` Wang Nan
2015-02-07 5:14 ` Alexandre Montplaisir
2015-02-07 5:14 ` Alexandre Montplaisir
2015-02-07 3:13 ` Wang Nan
1 sibling, 2 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-07 3:13 UTC (permalink / raw)
To: Alexandre Montplaisir
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015/2/5 4:24, Alexandre Montplaisir wrote:
> Hi Wang,
>
> Once again, thanks a lot for this *very* valuable feedback! A good tool should be able to help in real-world use cases, like the ones you described.
>
> Right now, Trace Compass is focused on showing information on the time scale. Views like the Control Flow and Resource Views are useful when you want to look at a particular location of a trace, and want to know exactly what the system is doing. However, when you are looking at a large trace completely zoomed-out, it's not extremely useful. It should be able to point out the "points of interest" in this trace, where the user can then zoom in and make use of the time-scale views.
>
> We've been having many discussions internally about adding views on a frequency scale, rather than a time scale. For example, a view that could aggregate the latency of system calls (or IRQ handlers, or IO operations, etc.), and show them either as a histogram or as a point cloud, so that it becomes easy to see the "outliers" of the distribution. The user could then click on these outliers to be brought to the location in the trace where they happened.
>
> These frequency views could provide the missing step of identifying points of interest in a trace, after which the user could switch to the time-based views to examine the highlighted locations in more details.
>
> I think such an approach could help in all 3 use cases you have presented. What do you think?
>
Good to see that you are looking at this problem.
The "frequency analysis" you mentioned is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem, "finding the reason why most CPUs are idle by matching syscall events"?
Thank you!
> Also, for known domains like the Linux kernel, we can provide pre-made frequency statistics, like system call latencies per syscall type and so on. However it would be important to keep it customizable, since we can never predict all the use cases that the users will need. That would probably be added a bit later though.
>
> We don't have a defined roadmap for the "frequency analysis" right now, but we will definitely try to have some working prototypes for our 1.0 release in June. I will keep you (and these mailing lists) posted!
>
>
> Some more comments below.
>
>
> On 01/31/2015 02:14 AM, Wang Nan wrote:
>>> This is a gap in the definition of the analysis it seems. I don't remember implementing two types of "syscall" events in the perf analysis, so it should just be a matter of getting the exact event name and adding it to the list. I will take a look and keep you posted!
>>>
>>>> Finally I found the syscall which caused the idle. However, I needed to write a
>>>> script to do statistics. TraceCompass itself lacks a means to count
>>>> different events in my way.
>>> Could you elaborate on this please? I agree the "Statistics" view in TC is severely lacking, we could be gathering and displaying much more information. The only question is what information would actually be useful.
>> I'd like to describe some cases of ad-hoc statistics, for which I had to write
>> Python scripts.
>>
>> *First case: matching sys_enter and sys_exit*
>>
>> The first case is finding the reason why most CPUs are idle. In the TraceCompass
>> resource view, I see some gray gaps of about 300us. During these gaps there is only
>> 1 running CPU; all other CPUs are idle. I can find the reason why a particular CPU
>> is idle using TraceCompass with the following steps:
>>
>> 1. In the TraceCompass resource view, click the idle gap of that CPU, find the next
>> event with the 'Select Next Event' button, and keep selecting the next event until I
>> find a 'raw_syscalls:sys_exit' event; then by checking its 'id' field I can find which
>> syscall caused the CPU to idle. (As I mentioned before, in my case, each time I click
>> that button I have to wait 8 to 10 seconds for the trace table to update so I can know
>> which event it is. This is painful for me...)
>
> Yes, this was because the perf-CTF traces put all their content in one big packet. I think this is being worked on, right?
>
> As for the missing system call names, we have https://bugs.eclipse.org/bugs/show_bug.cgi?id=453361 opened about it. It should be possible to give knowledge of the id's to the analysis so that it makes it easier to identify the system call type.
>
>>
>> 2. Then I need to find the corresponding "raw_syscalls:sys_enter" event to see when the
>> syscall was issued. I switch to the control flow view and use 'Select Previous Event' to
>> find it; then, back in the resource view, I can see how long this syscall takes, whether
>> the CPU does some work or simply idles after the syscall is issued, and whether the task
>> is scheduled across CPUs.
>>
>> 3. For each CPU do step 1 and step 2.
>>
>> In some high-end servers the number of cores may exceed 100. Even in my case the number
>> of traced CPUs is 32, so searching like this is time-consuming, and I had to write a
>> Python script to do it. My result: half of the cores are waiting on different futexes,
>> and half of them are waiting on pselect() (caused by sleep()).
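The per-CPU search described in the steps above boils down to "for each CPU, find the last raw_syscalls:sys_exit event before the idle point". A minimal sketch of that matching, with an invented event layout (not the actual perf/CTF record format):

```python
# Minimal sketch: for each CPU, find the last raw_syscalls:sys_exit
# event before a chosen timestamp (e.g. the start of an idle gap).
# The event layout is invented for illustration.

events = [
    # (timestamp, cpu, event_type, syscall_id)
    (10, 0, "raw_syscalls:sys_exit", 202),   # e.g. futex
    (15, 1, "raw_syscalls:sys_exit", 270),   # e.g. pselect6
    (20, 0, "sched:sched_switch", None),
    (30, 1, "raw_syscalls:sys_exit", 202),
]

def last_syscall_before(events, cutoff):
    """Map cpu -> syscall id of the last sys_exit before `cutoff`."""
    result = {}
    for ts, cpu, etype, sid in sorted(events):
        if ts < cutoff and etype == "raw_syscalls:sys_exit":
            result[cpu] = sid
    return result

print(last_syscall_before(events, 25))  # {0: 202, 1: 270}
```

The syscall-id-to-name translation (which raw_syscalls tracepoints leave to the user) would still have to come from a table such as the one generated from the kernel's syscall headers.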
>>
>> *Second case: matching futex WAKE and WAIT*
>>
>> Therefore the next case I'd like to share is matching futex wait, futex wake and futex
>> woken events. This time TraceCompass can't help much. However, the Python script is also
>> not easy to design. I have to track CPU and process state transitions myself, match all
>> futex sys_enter and sys_exit events, consider different cases including FUTEX_WAKE before
>> FUTEX_WAIT, failures and timeouts, and compare the return value of a futex wake with the
>> number of threads it wakes. This disposable Python script is 115 lines of code (I have to
>> admit I'm not a very good Python programmer), and it took me several hours to create and
>> debug.
>>
>> My final result: threads wake each other up in a tree-like manner. The first futex WAKE
>> command is issued by the only running CPU and wakes only one thread; that thread wakes
>> others, those wake more, and finally nearly all CPUs are woken up. Some threads only get
>> to run a relatively long time after the corresponding futex WAKE command, even though
>> there are idle CPUs at that time. Therefore we should look into the scheduler. However,
>> the gap itself appears to be a normal phenomenon.
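The core of that wait/wake matching can be sketched as follows; the field names are invented, and real futex semantics (wake counts, requeueing, errors, timeouts) are deliberately simplified to a single-waiter wake:

```python
# Rough sketch of matching FUTEX_WAKE events to the FUTEX_WAIT calls
# they wake, keyed by futex address, to reconstruct the wakeup tree.
# Field names are invented; real futexes take a wake-count argument
# and need handling for WAKE-before-WAIT, failures and timeouts.
from collections import defaultdict

FUTEX_WAIT, FUTEX_WAKE = 0, 1   # actual futex op values on Linux

events = [
    # (timestamp, tid, op, futex_addr)
    (1, 100, FUTEX_WAIT, 0xdead),
    (2, 101, FUTEX_WAIT, 0xdead),
    (5, 200, FUTEX_WAKE, 0xdead),
]

waiters = defaultdict(list)    # futex_addr -> [waiting tids, oldest first]
wake_tree = defaultdict(list)  # waker tid -> [woken tids]

for ts, tid, op, addr in sorted(events):
    if op == FUTEX_WAIT:
        waiters[addr].append(tid)
    elif op == FUTEX_WAKE and waiters[addr]:
        # Simplification: assume each WAKE wakes the oldest waiter.
        woken = waiters[addr].pop(0)
        wake_tree[tid].append(woken)

print(dict(wake_tree))  # {200: [100]}
```

Walking `wake_tree` transitively from the first waker is what exposes the tree-like wakeup pattern described above.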
>>
>> *Third case: device utilization*
>>
>> We have a high-speed storage device, but writing to filesystems on it is not as fast as we
>> expected. I deployed several tracepoints in the device driver to track device activity.
>> However, I had to create some tools to draw cumulative and speed curves to find whether
>> there were irregular idle periods. I use gnuplot for plotting, but had to write another
>> Python script to extract the data. I think languages like R should be useful here, but I'm
>> not familiar with them.
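Extracting a cumulative curve for gnuplot can be as small as this sketch; the event tuples and the output file name are invented:

```python
# Sketch: turn (timestamp, bytes) write events into a cumulative
# curve in gnuplot's whitespace-separated format, so flat stretches
# in the curve reveal irregular device idle. Sample data is invented.

events = [
    # (timestamp_us, bytes_written)
    (0, 4096),
    (100, 4096),
    (900, 8192),   # long gap before this write: possible idle period
]

total = 0
rows = []
for ts, nbytes in sorted(events):
    total += nbytes
    rows.append((ts, total))

with open("cumulative.dat", "w") as out:
    for ts, cum in rows:
        out.write(f"{ts} {cum}\n")

# Plot with:  gnuplot -e 'plot "cumulative.dat" with steps; pause -1'
```

The speed curve is just the discrete derivative of the same rows (delta bytes over delta time between consecutive entries).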
>>
>> *Conclusion*
>>
>> In this email I list 3 use cases related to the ad-hoc statistics I mentioned earlier.
>> Cases 1 and 2 are in fact not statistics problems; they should be considered query
>> problems. I suspect cases 1 and 2 can be expressed using SQL. If TraceCompass provided a
>> query language like SQL, we could quickly find the information we need and so have more
>> time for tuning. I expressed my SQL-like query idea in one of my earlier emails:
>>
>> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>>
>> However, at that time I was not sure what query problems we would meet.
>>
>> Case 3 requires a plotting tool like gnuplot or R. I don't know whether the TraceCompass
>> designers want to integrate such functionality, but at least TraceCompass should export
>> data for those tools.
>
> Right now we use the SWTChart library (http://swtchart.org/) for outputting graphs like the CPU Usage graph, the UST memory usage view, a prototype of the Histogram, and a couple others. We had also thought that we should provide an easy way to export the graphs, like right-click -> export to PNG/SVG, etc.
>
> But internally, what we provide to SWTChart is basically an array of doubles. So it should be easy to have an internal abstraction layer that can export either to an SWTChart view, or to a text file to be read by R or Gnuplot and so on, if such a feature is deemed useful.
>
>
> Cheers,
> Alexandre
>
>>
>> In cases 1 and 2, I spent a lot of time analyzing a phenomenon which finally turned out
>> to be normal. I think this is common in real performance analysis and tuning tasks: many
>> ideas only appear after the first trace is collected. Tools like TraceCompass should
>> consider ways to reduce the cost of trial and error.
>>
>> Thank you for reading this!
>>
>>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-07 3:13 ` Wang Nan
2015-02-07 5:14 ` Alexandre Montplaisir
@ 2015-02-07 5:14 ` Alexandre Montplaisir
1 sibling, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-07 5:14 UTC (permalink / raw)
To: Wang Nan
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015-02-06 10:13 PM, Wang Nan wrote:
>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>> >
> Good to see you are looking at this problem.
>
> The "frequency analysis" you mention is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem, "finding the reason why most CPUs are idle by matching syscall events"?
>
> Thank you!
>
I was thinking of the automatic event matching, for example matching
syscall_entry_* events with their corresponding syscall_exit_*. This is
a pre-requisite of doing frequency analysis, but could be useful on its
own. (Re-reading myself now I notice I didn't mention event matching at
all, my bad!)
If I understand your first problem correctly, it boils down to wanting
to identify the system call that is ongoing when a CPU is idle. And this
is not straightforward, because the Select Next/Previous Event buttons
in the time graph views will stop at every state transition, like CPU
idle or IRQs, which are "in the way" of the system call entry event.
Correct?
Now, what if we had a view that simply lists all the system calls in the
trace? Each row would contain the complete information of a system call,
so its start time (timestamp of the sys_entry event), end time
(timestamp of the sys_exit event), duration, name/id, arguments, return
value, etc.
And this view could be synchronized with the other ones, where clicking
on a row would bring you to the corresponding syscall_entry event. And
inversely, clicking on any timestamp would bring this view to the row
corresponding to the system call that was active at that time, if there
is one. I believe this could speed up your problem of identifying the
running system call for any arbitrary point in the trace.
Also, that view could be re-used for other types of intervals, like
IRQs, IO operations, and so on. And if the user sorts by the Duration
column, bam, they have a sorted list of the worst offenders for longer
system calls, IRQs, etc.
Would this be helpful?
Cheers,
Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-07 5:14 ` Alexandre Montplaisir
2015-02-07 7:25 ` Wang Nan
@ 2015-02-07 7:25 ` Wang Nan
1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-07 7:25 UTC (permalink / raw)
To: Alexandre Montplaisir
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015/2/7 13:14, Alexandre Montplaisir wrote:
> On 2015-02-06 10:13 PM, Wang Nan wrote:
>>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>>> >
>> Good to see you are looking at this problem.
>>
>> The "frequency analysis" you mention is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem, "finding the reason why most CPUs are idle by matching syscall events"?
>>
>> Thank you!
>>
>
> I was thinking of the automatic event matching, for example matching syscall_entry_* events with their corresponding syscall_exit_*. This is a pre-requisite of doing frequency analysis, but could be useful on its own. (Re-reading myself now I notice I didn't mention event matching at all, my bad!)
>
> If I understand your first problem correctly, it boils down to wanting to identify the system call that is ongoing when a CPU is idle. And this is not straightforward, because the Select Next/Previous Event buttons in the time graph views will stop at every state transition, like CPU idle or IRQs, which are "in the way" of the system call entry event. Correct?
>
>
> Now, what if we had a view that simply lists all the system calls in the trace? Each row would contain the complete information of a system call, so its start time (timestamp of the sys_entry event), end time (timestamp of the sys_exit event), duration, name/id, arguments, return value, etc.
>
> And this view could be synchronized with the other ones, where clicking on a row would bring you to the corresponding syscall_entry event. And inversely, clicking on any timestamp would bring this view to the row corresponding to the system call that was active at that time, if there is one. I believe this could speed up your problem of identifying the running system call for any arbitrary point in the trace.
>
> Also, that view could be re-used for other types of intervals, like IRQs, IO operations, and so on. And if the user sorts by the Duration column, bam, they have a sorted list of the worst offenders for longer system calls, IRQs, etc.
>
>
> Would this be helpful?
>
Correct, it should be helpful. If I understand your description correctly, you are thinking of
another time chart view, like the resource view and control flow view, but this time each row
is a syscall, like this:
<-------- time -------->
/-------------------
sys_write | ***
+-------------------
sys_read | *** ****
+-------------------
sys_poll | **************
+-------------------
.................
And it will be synchronized with the other views. So if I click on the idle area in the resource
view, then switch to this new view, I can see which syscall is ongoing at that time.
However, unlike CPUs, syscall types are not stateful entities. If we are going to draw such a
diagram, it should look like:
<-------- time -------->
/-------------------
sys_write(pid: 1234) | ***
+-------------------
sys_write(pid: 5678) |
+-------------------
sys_read(pid: 5678) | ****
+-------------------
sys_read(pid: 4321) | ****
+------------------
sys_poll(pid: 1212) | *******
+-------------------
sys_poll(pid: 1213) | **************
+-------------------
sys_poll(pid: 1214) | *****
+-------------------
.................
We must draw one row for each syscall (there are hundreds of them) issued by each pid (also
hundreds). This view will be extremely large and sparse, and we need a means of sorting the
rows to find the meaningful ones at a specific time. But correct, it is still useful.
We can also aggregate identical syscalls and use different colors to express how many calls are
ongoing at a given time, like this:
<-------- time -------->
/-------------------
sys_write | ~~~
+-------------------
sys_read | ~~~~ ~~~~
+------------------
sys_poll | ~~~~==!!====~~~
+-------------------
.................
Is your "frequency analysis" something like that?
However, when I was working on the first problem, what I had in mind was a SQL query like:
select * from events
inner join
  ( select cpu, max(timestamp) as lasttime
    from events
    where events.timestamp < xxxxxxx and
          events.type = 'raw_syscalls:sys_exit'
    group by cpu ) X
on
  events.timestamp = X.lasttime and
  events.cpu = X.cpu and
  events.type = 'raw_syscalls:sys_exit'
(I'm not sure whether it works or needs some debugging, but I think it expresses my idea.)
I think a query like that is more expressive than a time chart (but yes, SQL is not really
suitable for querying events). Consider my 'matching futex WAIT and WAKE' case: there I must
first define my futex model in XML, pass it to TraceCompass, and reprocess the trace before I
can open the new view.
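For illustration, that "last sys_exit per CPU before a cutoff" query can be tried against an in-memory SQLite table; the sample rows and the cutoff value are invented:

```python
# Sketch: run the "last sys_exit per cpu before time T" query with
# SQLite against an invented events table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table events (timestamp int, cpu int, type text)")
con.executemany(
    "insert into events values (?, ?, ?)",
    [
        (10, 0, "raw_syscalls:sys_exit"),
        (20, 0, "raw_syscalls:sys_exit"),
        (15, 1, "raw_syscalls:sys_exit"),
        (30, 0, "sched:sched_switch"),
    ],
)

cutoff = 25  # stands in for the idle-gap timestamp
rows = con.execute(
    """
    select e.cpu, e.timestamp from events e
    join (select cpu, max(timestamp) as lasttime
          from events
          where timestamp < ? and type = 'raw_syscalls:sys_exit'
          group by cpu) x
      on e.cpu = x.cpu and e.timestamp = x.lasttime
         and e.type = 'raw_syscalls:sys_exit'
    order by e.cpu
    """,
    (cutoff,),
).fetchall()
print(rows)  # [(0, 20), (1, 15)]
```

Loading a real trace into such a table is the missing piece; the query itself is short once the events are relational.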
> Cheers,
> Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-01-31 7:14 ` Wang Nan
2015-02-04 20:24 ` Alexandre Montplaisir
2015-02-04 20:24 ` Alexandre Montplaisir
@ 2015-02-07 13:22 ` Mathieu Desnoyers
2015-02-07 13:22 ` Mathieu Desnoyers
3 siblings, 0 replies; 35+ messages in thread
From: Mathieu Desnoyers @ 2015-02-07 13:22 UTC (permalink / raw)
To: Wang Nan
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
----- Original Message -----
> From: "Wang Nan" <wangnan0@huawei.com>
> To: "Alexandre Montplaisir" <alexmonthy@voxpopuli.im>
> Cc: "Xinwei Hu" <huxinwei@huawei.com>, diamon-discuss@lists.linuxfoundation.org, lttng-dev@lists.lttng.org,
> "tracecompass developer discussions" <tracecompass-dev@eclipse.org>
> Sent: Saturday, January 31, 2015 2:14:18 AM
> Subject: Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
>
>
> > This is a gap in the definition of the analysis it seems. I don't remember
> > implementing two types of "syscall" events in the perf analysis, so it
> > should just be a matter of getting the exact event name and adding it to
> > the list. I will take a look and keep you posted!
> >
> >> Finally I found the syscall which caused the idle. However, I needed to write a
> >> script to do statistics. TraceCompass itself lacks a means to count
> >> different events in my way.
> >
> > Could you elaborate on this please? I agree the "Statistics" view in TC is
> > severely lacking, we could be gathering and displaying much more
> > information. The only question is what information would actually be
> > useful.
>
> I'd like to describe some cases of ad-hoc statistics, for which I had to
> write Python scripts.
>
> *First case: matching sys_enter and sys_exit*
>
> The first case is finding the reason why most CPUs are idle. In the
> TraceCompass resource view, I see some gray gaps of about 300us. During
> these gaps there is only 1 running CPU; all other CPUs are idle. I can find
> the reason why a particular CPU is idle using TraceCompass with the
> following steps:
>
> 1. In the TraceCompass resource view, click the idle gap of that CPU, find
> the next event with the 'Select Next Event' button, and keep selecting the
> next event until I find a 'raw_syscalls:sys_exit' event; then by checking
> its 'id' field I can find which syscall caused the CPU to idle. (As I
> mentioned before, in my case, each time I click that button I have to wait
> 8 to 10 seconds for the trace table to update so I can know which event it
> is. This is painful for me...)
For this use case, I think the kind of analysis you need is a "critical path
analysis", which can identify what made each task you care about block during
your time range. See the work published by Francis Giraldeau on this topic.
A CPU being idle is a side-effect of tasks being blocked. If you had a
lower-priority task running in the background, that task would run instead
of the CPU idle loop. Therefore, the idle CPU is really just a side-effect, not
a cause.
Thanks!
Mathieu
>
> 2. Then I need to find the corresponding "raw_syscalls:sys_enter" event to
> see when the syscall was issued. I switch to the control flow view and use
> 'Select Previous Event' to find it; then, back in the resource view, I can
> see how long this syscall takes, whether the CPU does some work or simply
> idles after the syscall is issued, and whether the task is scheduled
> across CPUs.
>
> 3. For each CPU do step 1 and step 2.
>
> In some high-end servers the number of cores may exceed 100. Even in my
> case the number of traced CPUs is 32, so searching like this is
> time-consuming, and I had to write a Python script to do it. My result:
> half of the cores are waiting on different futexes, and half of them are
> waiting on pselect() (caused by sleep()).
>
> *Second case: matching futex WAKE and WAIT*
>
> Therefore the next case I'd like to share is matching futex wait, futex
> wake and futex woken events. This time TraceCompass can't help much.
> However, the Python script is also not easy to design. I have to track CPU
> and process state transitions myself, match all futex sys_enter and
> sys_exit events, consider different cases including FUTEX_WAKE before
> FUTEX_WAIT, failures and timeouts, and compare the return value of a futex
> wake with the number of threads it wakes. This disposable Python script is
> 115 lines of code (I have to admit I'm not a very good Python programmer),
> and it took me several hours to create and debug.
>
> My final result: threads wake each other up in a tree-like manner. The
> first futex WAKE command is issued by the only running CPU and wakes only
> one thread; that thread wakes others, those wake more, and finally nearly
> all CPUs are woken up. Some threads only get to run a relatively long time
> after the corresponding futex WAKE command, even though there are idle
> CPUs at that time. Therefore we should look into the scheduler. However,
> the gap itself appears to be a normal phenomenon.
>
> *Third case: device utilization*
>
> We have a high-speed storage device, but writing to filesystems on it is
> not as fast as we expected. I deployed several tracepoints in the device
> driver to track device activity. However, I had to create some tools to
> draw cumulative and speed curves to find whether there were irregular idle
> periods. I use gnuplot for plotting, but had to write another Python
> script to extract the data. I think languages like R should be useful
> here, but I'm not familiar with them.
>
> *Conclusion*
>
> In this email I list 3 use cases related to the ad-hoc statistics I
> mentioned earlier. Cases 1 and 2 are in fact not statistics problems; they
> should be considered query problems. I suspect cases 1 and 2 can be
> expressed using SQL. If TraceCompass provided a query language like SQL,
> we could quickly find the information we need and so have more time for
> tuning. I expressed my SQL-like query idea in one of my earlier emails:
>
> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>
> However, at that time I was not sure what query problems we would meet.
>
> Case 3 requires a plotting tool like gnuplot or R. I don't know whether
> the TraceCompass designers want to integrate such functionality, but at
> least TraceCompass should export data for those tools.
>
> In cases 1 and 2, I spent a lot of time analyzing a phenomenon which
> finally turned out to be normal. I think this is common in real
> performance analysis and tuning tasks: many ideas only appear after the
> first trace is collected. Tools like TraceCompass should consider ways to
> reduce the cost of trial and error.
>
> Thank you for reading this!
>
>
> _______________________________________________
> diamon-discuss mailing list
> diamon-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
>
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-01-31 7:14 ` Wang Nan
` (2 preceding siblings ...)
2015-02-07 13:22 ` Mathieu Desnoyers
@ 2015-02-07 13:22 ` Mathieu Desnoyers
3 siblings, 0 replies; 35+ messages in thread
From: Mathieu Desnoyers @ 2015-02-07 13:22 UTC (permalink / raw)
To: Wang Nan
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
----- Original Message -----
> From: "Wang Nan" <wangnan0@huawei.com>
> To: "Alexandre Montplaisir" <alexmonthy@voxpopuli.im>
> Cc: "Xinwei Hu" <huxinwei@huawei.com>, diamon-discuss@lists.linuxfoundation.org, lttng-dev@lists.lttng.org,
> "tracecompass developer discussions" <tracecompass-dev@eclipse.org>
> Sent: Saturday, January 31, 2015 2:14:18 AM
> Subject: Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
>
>
> > This is a gap in the definition of the analysis it seems. I don't remember
> > implementing two types of "syscall" events in the perf analysis, so it
> > should just be a matter of getting the exact event name and adding it to
> > the list. I will take a look and keep you posted!
> >
> >> Finally I found the syscall which cause idle. However I need to write a
> >> script to do statistics. TraceCompass itself is lack a mean to count
> >> different events in my way.
> >
> > Could you elaborate on this please? I agree the "Statistics" view in TC is
> > severely lacking, we could be gathering and displaying much more
> > information. The only question is what information would actually be
> > useful.
>
> I'd like to describe some cases of ad-hoc statisics, which I have to write
> python
> scripts to do.
>
> *First case: matching sys_enter and sys_exit*
>
> The first case is to find the reason why most of CPUs are idle. From
> TraceCompass
> resource view, I find some gray gaps for about 300us. During these gaps,
> there is only
> 1 running CPU, all other CPUs are idle. I can find the reason why a
> particular CPU
> is idle using TraceCompass with following steps:
>
> 1. In TraceCompass resource view, click the idle gap of that CPU, find next
> event
> with the 'Select Next Event' button, continous select next event until
> find
> a 'raw_syscalls:sys_exit' event, then by checking 'id' field I can find
> what syscall
> cause the CPU idle. (I have mentioned before, that in my case, each time
> when I click
> that button, I have to wait for 8 to 10 seconds for the trace table
> update so I can kown
> which event it is. This is painful for me...)
For this use-case, I think the kind of analysis you need is a "critical path
analysis", which can identify what made each task you care about block during
your time range. See work published by Francis Giraldeau on this topic.
CPU being idle is a side-effect of the tasks being blocked. If you would have
a lesser priority task running in the background, that task would run instead
of the CPU idle loop. Therefore, the idle CPU is really just a side-effect, not
a cause.
Thanks!
Mathieu
>
> 2. Then I need to find the corresponding "raw_syscalls:sys_enter" event to
>    see when the syscall was issued. I switch to the control flow view and
>    use 'Select Previous Event' to find it; back in the resource view I can
>    then see how long this syscall takes, whether the CPU does some work or
>    simply idles after the syscall is issued, and whether the task is
>    scheduled across CPUs.
>
> 3. Repeat steps 1 and 2 for each CPU.
>
> In some high-end servers the number of cores may exceed 100. Even in my
> case the number of traced CPUs is 32, so such searching is time consuming,
> and I had to write a Python script to do it. My result: half of the cores
> are waiting on different futexes, and the other half are waiting in
> pselect() (caused by sleep()).
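[Editor's sketch] The per-CPU search described above is easy to script. Below is a minimal, hedged sketch in plain Python; the flat `(timestamp, cpu, type, fields)` event layout is an assumption for illustration, not the actual perf/CTF reader API:

```python
# Hypothetical sketch: for each CPU, find the last raw_syscalls:sys_exit
# event before a given timestamp (the start of the idle gap).
# `events` is an assumed list of (timestamp, cpu, type, fields) tuples.

def last_exit_before(events, gap_start):
    """Return {cpu: event} with the latest sys_exit per CPU before gap_start."""
    last = {}
    for ev in events:
        ts, cpu, etype, fields = ev
        if ts >= gap_start or etype != "raw_syscalls:sys_exit":
            continue
        if cpu not in last or ts > last[cpu][0]:
            last[cpu] = ev
    return last

events = [
    (100, 0, "raw_syscalls:sys_enter", {"id": 202}),  # 202 = futex on x86_64
    (150, 0, "raw_syscalls:sys_exit",  {"id": 202}),
    (120, 1, "raw_syscalls:sys_exit",  {"id": 23}),   # 23 = select on x86_64
    (400, 1, "raw_syscalls:sys_exit",  {"id": 202}),  # after the gap
]
print(last_exit_before(events, 300))
```

A real script would of course pull the events from babeltrace or `perf script` output instead of a hand-written list.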
>
> *Second case: matching futex WAKE and WAIT*
>
> The next case I'd like to share is matching futex wait, futex wakeup and
> futex woken events. This time TraceCompass can't help much. However, the
> Python script is also not easy to design: I have to track CPU and process
> state transitions myself, match all futex sys_enter and sys_exit events,
> and handle different cases, including FUTEX_WAKE before FUTEX_WAIT,
> failures, timeouts, and comparing the return value of a futex wake with
> the number of threads it actually wakes. This disposable Python script has
> 115 lines of code (I have to admit that I'm not a very good Python
> programmer); it took me several hours to create and debug.
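[Editor's sketch] The core of such a matching script, pairing each futex sys_enter with the next sys_exit on the same thread, can be sketched as follows. The `(timestamp, tid, type, id)` event layout is assumed for illustration; the full 115-line script additionally has to handle failures, timeouts, and WAKE-before-WAIT ordering:

```python
# Hedged sketch of the enter/exit matching the script has to do: pair each
# raw_syscalls:sys_enter with the next sys_exit on the same thread.
FUTEX = 202  # futex syscall number on x86_64

def match_syscalls(events):
    pending = {}   # tid -> unmatched enter event
    pairs = []     # (enter, exit) tuples, in completion order
    for ev in sorted(events):            # sort by timestamp
        ts, tid, etype, sid = ev
        if etype == "sys_enter":
            pending[tid] = ev
        elif etype == "sys_exit" and tid in pending:
            pairs.append((pending.pop(tid), ev))
    return pairs

events = [
    (10, 1, "sys_enter", FUTEX),
    (12, 2, "sys_enter", FUTEX),
    (30, 1, "sys_exit",  FUTEX),   # thread 1 waited 20 time units
    (35, 2, "sys_exit",  FUTEX),
]
for enter, exit_ in match_syscalls(events):
    print(f"tid={enter[1]} waited {exit_[0] - enter[0]}")
```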
>
> My final result: threads wake each other up in a tree-like manner. The
> first futex WAKE command is issued by the only running CPU and wakes up
> just one thread. That thread wakes up others, those wake up more, and
> finally nearly all CPUs are woken up. Some threads only get executed a
> relatively long time after the corresponding futex WAKE command, even
> though there are idle CPUs at that time, so we should look into the
> scheduler. The gap itself, however, appears to be a normal phenomenon.
>
> *Third case: device utilization*
>
> We have a high-speed storage device, but writing to filesystems on it is
> not as fast as we expected. I deployed several tracepoints in the device
> driver to track device activity. However, I had to create some tools to
> draw cumulative and speed curves to find whether there is irregular idle
> time. I use gnuplot for plotting, but had to write yet another Python
> script to extract the data. I think a language like R would be useful
> here, but I'm not familiar with it.
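[Editor's sketch] The extraction step for gnuplot can be sketched like this: turn `(timestamp, bytes)` write events into a cumulative curve and a sliding-window speed curve, printed as tab-separated columns that gnuplot can plot directly. The input format is an assumption:

```python
# Toy version of the case-3 extraction script: cumulative bytes written,
# plus throughput over a 1-second sliding window.
def curves(samples, window=1.0):
    """samples: sorted (timestamp_sec, nbytes). Returns [(t, cumulative, speed)]."""
    total = 0.0
    out = []
    recent = []                       # samples inside the sliding window
    for t, n in samples:
        total += n
        recent.append((t, n))
        recent = [(rt, rn) for rt, rn in recent if rt > t - window]
        speed = sum(rn for _, rn in recent) / window
        out.append((t, total, speed))
    return out

samples = [(0.0, 100), (0.5, 100), (1.2, 300)]
for t, cum, spd in curves(samples):
    print(f"{t}\t{cum}\t{spd}")       # e.g. gnuplot: plot "data" using 1:2
```

A dip in column 3 over a stretch of timestamps would be the "irregular idle" the mail is looking for.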
>
> *Conclusion*
>
> In this email I list 3 use cases related to the ad-hoc statistics I
> mentioned earlier. Cases 1 and 2 are in fact not statistics problems; they
> should be considered query problems, and I suspect both can be expressed
> in SQL. If TraceCompass provided a query language like SQL, we could
> quickly find the information we need and so have more time for tuning. I
> expressed my SQL-like query idea in one of my earlier emails:
>
> http://lists.linuxfoundation.org/pipermail/diamon-discuss/2014-November/000003.html
>
> However, at that time I was not very sure what query problems we would meet.
>
> Case 3 requires a plotting tool like gnuplot or R. I don't know whether
> the TraceCompass designers want to integrate such functions, but at least
> TraceCompass should export data for those tools.
>
> In cases 1 and 2, I spent a lot of time analyzing a phenomenon that was
> finally shown to be normal. I think this is common in real performance
> analysis and tuning tasks: many ideas only appear after the first trace is
> collected. Tools like TraceCompass should therefore consider a way to
> reduce the cost of trial and error.
>
> Thank you for reading this!
>
>
> _______________________________________________
> diamon-discuss mailing list
> diamon-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/diamon-discuss
>
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-07 7:25 ` Wang Nan
@ 2015-02-09 20:13 ` Alexandre Montplaisir
2015-02-09 20:13 ` Alexandre Montplaisir
1 sibling, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-09 20:13 UTC (permalink / raw)
To: Wang Nan
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 02/07/2015 02:25 AM, Wang Nan wrote:
> On 2015/2/7 13:14, Alexandre Montplaisir wrote:
>> On 2015-02-06 10:13 PM, Wang Nan wrote:
>>>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>>> Good to see you are looking at this problem.
>>>
> >>> "Frequency analysis" you mentioned is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem of "finding the reason why most of the CPUs are idle by matching syscall events"?
>>>
>>> Thank you!
>>>
>> I was thinking of the automatic event matching, for example matching syscall_entry_* events with their corresponding syscall_exit_*. This is a pre-requisite of doing frequency analysis, but could be useful on its own. (Re-reading myself now I notice I didn't mention event matching at all, my bad!)
>>
>> If I understand your first problem correctly, it boils down to wanting to identify the system call that is ongoing when a CPU is idle. And this is not straightforward, because the Select Next/Previous Event buttons in the time graph views will stop at every state transition, like CPU idle or IRQs, which are "in the way" of the system call entry event. Correct?
>>
>>
>> Now, what if we had a view that simply lists all the system calls in the trace? Each row would contain the complete information of a system call, so its start time (timestamp of the sys_entry event), end time (timestamp of the sys_exit event), duration, name/id, arguments, return value, etc.
>>
>> And this view could be synchronized with the other ones, where clicking on a row would bring you to the corresponding syscall_entry event. And inversely, clicking on any timestamp would bring this view to the row corresponding to the system call that was active at that time, if there is one. I believe this could speed up your problem of identifying the running system call for any arbitrary point in the trace.
>>
>> Also, that view could be re-used for other types of intervals, like IRQs, IO operations, and so on. And if the user sorts by the Duration column, bam, they have a sorted list of the worst offenders for longer system calls, IRQs, etc.
>>
>>
>> Would this be helpful?
>>
> Correct, it should be helpful. If I understand your description correctly,
> you are thinking of another time chart view like the resource view and
> control flow view, where this time each row is a syscall, like this:
In fact, for the "event matching" what I had in mind was a simple text
table, where every line represents one (entire) system call, so you
would have columns like:
start | end | duration | name/id | arguments | return
------|-----|----------|---------|---------------------|-------
10 | 15 | 5 | open | path=abc, flags=... | 4
20 | 23 | 3 | close | fd=4 | 0
....
It would then be easy to synchronize this view with the others, so that
it would point to the ongoing system call(s) whenever the user selects a
timestamp in another view. And vice versa, clicking on an entry in this
table would move to the corresponding syscall_entry event.
The view could also be generalized for other things (IRQs?), either in
different views, or the same view with a drop-down showing which type of
information to display.
And for the so-called frequency analysis, you use the data from the
"duration" column to populate a histogram (or point cloud?) showing the
distribution of the latency of every system call (the Y axis represents
the absolute number of occurrences):
|
| ___
| ___
| ___ ___ ___
|__ ___ ___ ___ ___
+-----------------------------
0-5 5-10 10-15 15-20 20+
duration of syscall (ms)
So in this example we could see right away that the average duration is
around 10-15 ms, but we have some outliers that took more than 20 ms, so
it would be interesting to zoom in there to get more details. And
eventually clicking on one particular occurrence could bring you to the
exact point in the trace where it happened (or create bookmarks for the
outlier entries, etc.)
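[Editor's sketch] The bucketing behind this proposed histogram is simple to prototype. A minimal Python version, with the 5 ms bins and 20+ overflow bucket from the ASCII mock-up above (toy durations, not real trace data):

```python
# Bucket syscall durations (ms) into 5 ms bins; everything >= 20 ms goes
# into a single "20+" overflow bin so outliers stand out.
from collections import Counter

def histogram(durations_ms, bin_ms=5, last=20):
    bins = Counter()
    for d in durations_ms:
        bins[min(int(d // bin_ms) * bin_ms, last)] += 1
    return bins

durations = [2, 7, 11, 12, 13, 17, 22, 31]
h = histogram(durations)
for lo in range(0, 25, 5):
    label = f"{lo}-{lo + 5}" if lo < 20 else "20+"
    print(f"{label:>6} | {'#' * h[lo]}")
```

With the toy data, the 10-15 ms bin dominates and two samples land in the 20+ bin, i.e. exactly the "zoom in on the outliers" situation described above.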
> [...]
>
>
> However, when I was working on the first problem, what I thought about was
> an SQL query like:
>
> select * from events
> inner join
>   ( select cpu, max(timestamp) as lasttime from events
>     where events.timestamp < xxxxxxx and
>           events.type = "raw_syscalls:sys_exit"
>     group by cpu ) X
> on
>   events.timestamp = X.lasttime and
>   events.cpu = X.cpu and
>   events.type = "raw_syscalls:sys_exit"
>
> (I'm not sure whether it works or needs some debugging, but I think my
> idea is expressed.)
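[Editor's sketch] The query idea above does work once tidied up. Here it is run against a toy in-memory SQLite table via Python's stdlib sqlite3; the column names follow the mail, and the literal `300` is a stand-in for the `xxxxxxx` timestamp placeholder:

```python
# Sanity-check of the "last sys_exit per CPU before time T" query idea.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table events (timestamp int, cpu int, type text)")
con.executemany("insert into events values (?, ?, ?)", [
    (100, 0, "raw_syscalls:sys_exit"),
    (150, 0, "raw_syscalls:sys_exit"),
    (120, 1, "raw_syscalls:sys_exit"),
    (400, 1, "raw_syscalls:sys_exit"),   # after the cutoff, must be excluded
])
rows = con.execute("""
    select e.* from events e
    inner join
      (select cpu, max(timestamp) as lasttime from events
        where timestamp < 300 and type = 'raw_syscalls:sys_exit'
        group by cpu) x
    on e.timestamp = x.lasttime and e.cpu = x.cpu
    where e.type = 'raw_syscalls:sys_exit'
""").fetchall()
print(rows)   # the last sys_exit per CPU before t=300
```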
Interesting, and what would the output of such a query look like? Would
it be a list of events, where the "groupby" makes it more like a tree?
Like this:
CPU0
|__event 1
|__event 2
|__...
CPU1
|__event 4
|__event 5
|__...
...
?
We already have the Event table, on which we can apply filters of this
type (timestamp == something && cpu == otherthing...). But we only show
the results in the same flat list. I'm not sure what would be best here.
Showing results in a separate view, or supporting sub-trees in the base
Event table?
Cheers,
Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-09 20:13 ` Alexandre Montplaisir
@ 2015-02-10 4:55 ` Wang Nan
2015-02-10 4:55 ` Wang Nan
1 sibling, 0 replies; 35+ messages in thread
From: Wang Nan @ 2015-02-10 4:55 UTC (permalink / raw)
To: Alexandre Montplaisir
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015/2/10 4:13, Alexandre Montplaisir wrote:
>
> On 02/07/2015 02:25 AM, Wang Nan wrote:
>> On 2015/2/7 13:14, Alexandre Montplaisir wrote:
>>> On 2015-02-06 10:13 PM, Wang Nan wrote:
>>>>> I think such an approach could help in all 3 use cases you have presented. What do you think?
>>>> Good to see you are looking at this problem.
>>>>
>>>> "Frequency analysis" you mentioned is a good viewpoint for finding outliers. However, it should not be the only one we consider. Could you please explain how "frequency analysis" can solve my first problem of "finding the reason why most of the CPUs are idle by matching syscall events"?
>>>>
>>>> Thank you!
>>>>
>>> I was thinking of the automatic event matching, for example matching syscall_entry_* events with their corresponding syscall_exit_*. This is a pre-requisite of doing frequency analysis, but could be useful on its own. (Re-reading myself now I notice I didn't mention event matching at all, my bad!)
>>>
>>> If I understand your first problem correctly, it boils down to wanting to identify the system call that is ongoing when a CPU is idle. And this is not straightforward, because the Select Next/Previous Event buttons in the time graph views will stop at every state transition, like CPU idle or IRQs, which are "in the way" of the system call entry event. Correct?
>>>
>>>
>>> Now, what if we had a view that simply lists all the system calls in the trace? Each row would contain the complete information of a system call, so its start time (timestamp of the sys_entry event), end time (timestamp of the sys_exit event), duration, name/id, arguments, return value, etc.
>>>
>>> And this view could be synchronized with the other ones, where clicking on a row would bring you to the corresponding syscall_entry event. And inversely, clicking on any timestamp would bring this view to the row corresponding to the system call that was active at that time, if there is one. I believe this could speed up your problem of identifying the running system call for any arbitrary point in the trace.
>>>
>>> Also, that view could be re-used for other types of intervals, like IRQs, IO operations, and so on. And if the user sorts by the Duration column, bam, they have a sorted list of the worst offenders for longer system calls, IRQs, etc.
>>>
>>>
>>> Would this be helpful?
>>>
>> Correct, it should be helpful. If I understand your description correctly, you are thinking another time
>> chart view like resource view and control flow view, this time each row is a syscall, like this:
>
> In fact, for the "event matching" what I had in mind was a simple text table, where every line represents one (entire) system call, so you would have columns like:
>
> start | end | duration | name/id | arguments | return
>
> ------|-----|----------|---------|---------------------|-------
> 10 | 15 | 5 | open | path=abc, flags=... | 4
> 20 | 23 | 3 | close | fd=4 | 0
> ....
>
>
> It would then be easy to synchronize this view with the others, so that it would point to the ongoing system call(s) whenever the user selects a timestamp in another view. And vice versa, clicking on an entry in this table would move to the corresponding syscall_entry event.
>
> The view could also be generalized for other things (IRQs?), either in different views, or the same view with a drop-down showing which type of information to display.
>
>
> And for the so-called frequency analysis, you use the data from the "duration" column to populate a histogram (or point cloud?) showing how the distribution of latency of every system call (Y axis represents the absolute number of occurrences):
>
>
> |
>
> | ___
>
> | ___
>
> | ___ ___ ___
>
> |__ ___ ___ ___ ___
>
> +-----------------------------
>
> 0-5 5-10 10-15 15-20 20+
> duration of syscall (ms)
>
>
> So in this example we could see right away that the average duration is around 10-15 ms, but we have some outliers that took more than 20 ms, so it would be interesting to zoom in there to get more details. And eventually clicking on one particular occurrence could bring you to the exact point in the trace where it happened (or create bookmarks for the outlier entries, etc.)
>
Looks good. It would be better if users were able to define their own
matching rules.
In addition, have you thought about chaining more than 2 events together?
Some procedures, such as IO and network, can be traced at different layers,
and we can chain events caused by the same source together with some rules.
Please refer to:
http://comments.gmane.org/gmane.linux.kernel/1883569
which we are working on.
>
>> [...]
>>
>>
>> However, when I was working on the first problem, what I thought about was
>> an SQL query like:
>>
>> select * from events
>> inner join
>>   ( select cpu, max(timestamp) as lasttime from events
>>     where events.timestamp < xxxxxxx and
>>           events.type = "raw_syscalls:sys_exit"
>>     group by cpu ) X
>> on
>>   events.timestamp = X.lasttime and
>>   events.cpu = X.cpu and
>>   events.type = "raw_syscalls:sys_exit"
>>
>> (I'm not sure whether it works or needs some debugging, but I think my
>> idea is expressed.)
>
> Interesting, and what would the output of such a query look like? Would it be a list of events, but with the "groupby" that would be more like a tree? Like this:
>
> CPU0
> |__event 1
> |__event 2
> |__...
> CPU1
> |__event 4
> |__event 5
> |__...
> ...
>
> ?
>
Correct. Even a flat table with sorting and filters enabled would be helpful.
> We already have the Event table, on which we can apply filters of this type (timestamp == something && cpu == otherthing...). But we only show the results in the same flat list. I'm not sure what would be best here. Showing results in a separate view, or supporting sub-trees in the base Event table?
>
>
> Cheers,
> Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection.
2015-02-10 4:55 ` Wang Nan
@ 2015-02-10 8:02 ` Alexandre Montplaisir
2015-02-10 8:02 ` Alexandre Montplaisir
1 sibling, 0 replies; 35+ messages in thread
From: Alexandre Montplaisir @ 2015-02-10 8:02 UTC (permalink / raw)
To: Wang Nan
Cc: Xinwei Hu, diamon-discuss, lttng-dev, tracecompass developer discussions
On 2015-02-09 11:55 PM, Wang Nan wrote:
> On 2015/2/10 4:13, Alexandre Montplaisir wrote:
>> [...]
>>
>> So in this example we could see right away that the average duration is around 10-15 ms, but we have some outliers that took more than 20 ms, so it would be interesting to zoom in there to get more details. And eventually clicking on one particular occurrence could bring you to the exact point in the trace where it happened (or create bookmarks for the outlier entries, etc.)
>>
> Looks good. It would be better if users were able to define their own matching rules.
Absolutely! We can have pre-defined rules for the known domains (system
calls, IRQs, and so on), and then it would be easy to let the user enter an
event name for the "start" and an event name for the "end".
I was also thinking we could even use general filters to define what counts
as a start and an end event. It would require some refactoring, due to the
way filters are organized at the moment, so it wouldn't be trivial, but
eventually we may want to get there.
I'm currently working on the side on a prototype of the "event matching"
table view, I will let you know once I have an initial version working,
so we can see if/how useful it is.
>
> In addition, have you thought about chaining more than 2 events together? Some procedures, such as IO and network, can be traced at different layers, and we can chain events caused by the same source together with some rules. Please refer to:
>
> http://comments.gmane.org/gmane.linux.kernel/1883569
>
> Which we are working on.
>
Geneviève (added in CC) is currently working on virtual machine
analyses, where we can match events from a kernel trace on a host to a
kernel trace in a guest, and so on. The "layering" and general matching
of events should be similar. Maybe she has some ideas if the event
matching part could be reused somehow?
Cheers,
Alexandre
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2015-02-10 8:02 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-23 9:35 [diamon-discuss] My experience on perf, CTF and TraceCompass, and some suggection Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 16:56 ` [diamon-discuss] [lttng-dev] " Geneviève Bastien
2015-01-23 16:56 ` [diamon-discuss] " Geneviève Bastien
2015-01-31 7:14 ` Wang Nan
2015-02-04 20:24 ` Alexandre Montplaisir
2015-02-07 3:13 ` Wang Nan
2015-02-07 5:14 ` Alexandre Montplaisir
2015-02-07 7:25 ` Wang Nan
2015-02-09 20:13 ` Alexandre Montplaisir
2015-02-09 20:13 ` Alexandre Montplaisir
2015-02-10 4:55 ` Wang Nan
2015-02-10 4:55 ` Wang Nan
2015-02-10 8:02 ` Alexandre Montplaisir
2015-02-10 8:02 ` Alexandre Montplaisir
2015-02-07 7:25 ` Wang Nan
2015-02-07 5:14 ` Alexandre Montplaisir
2015-02-07 3:13 ` Wang Nan
2015-02-04 20:24 ` Alexandre Montplaisir
2015-02-07 13:22 ` Mathieu Desnoyers
2015-02-07 13:22 ` Mathieu Desnoyers
2015-01-31 7:14 ` Wang Nan
2015-01-23 16:30 ` Alexandre Montplaisir
2015-01-23 20:12 ` Alexandre Montplaisir
2015-01-23 20:12 ` Alexandre Montplaisir
2015-01-26 4:08 ` Wang Nan
2015-01-26 4:08 ` Wang Nan
2015-01-26 16:25 ` Jérémie Galarneau
2015-01-27 0:54 ` Wang Nan
2015-01-27 0:54 ` Wang Nan
2015-01-27 5:31 ` Jérémie Galarneau
2015-01-27 5:31 ` Jérémie Galarneau
2015-01-27 6:31 ` Wang Nan
2015-01-27 6:31 ` Wang Nan
2015-01-26 16:25 ` Jérémie Galarneau