* A list of visual Profiler UIs for linux perf
@ 2021-09-05  6:19 Mark Hansen
  2021-09-05  6:50 ` Brendan Gregg
  2021-09-05 20:33 ` Viktor Rosendahl
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Hansen @ 2021-09-05  6:19 UTC
  To: linux-perf-users

Hi perf-toolers,

I wrote a quick literature-review of profiler user interfaces
available for analysing the output of Linux perf:
https://www.markhansen.co.nz/profiler-uis/

I couldn't find a similar list of profiler UIs online. Hopefully this
can help people find the profiler UI that's right for them.

I'd appreciate any feedback; in particular: did I miss any great perf UIs?

Thank you for all your work on perf,
Mark


* Re: A list of visual Profiler UIs for linux perf
  2021-09-05  6:19 A list of visual Profiler UIs for linux perf Mark Hansen
@ 2021-09-05  6:50 ` Brendan Gregg
  2021-09-08 19:13   ` Stephen Brennan
  2021-09-05 20:33 ` Viktor Rosendahl
  1 sibling, 1 reply; 6+ messages in thread
From: Brendan Gregg @ 2021-09-05  6:50 UTC
  To: Mark Hansen; +Cc: linux-perf-use.

Thanks Mark,

Nice to see the adoption of flame graphs everywhere.

I'd encourage people not to overlook Flame Scope. I created this
visualization to address problems with flame graphs: how they can hide
variation, perturbations, and interval patterns. These are all
illustrated by the flame scope subsecond offset heat map, and then you
highlight detail and get the flame graph for those samples.
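
A minimal sketch of the binning behind a subsecond-offset heat map as
described above. It is not FlameScope's code: the function names and the
choice of 50 rows per second are assumptions made for illustration.
Each sample timestamp lands in a cell whose column is the elapsed whole
second and whose row is the offset within that second; a selected time
range then yields the samples that would feed a flame graph.

```python
from collections import Counter


def subsecond_heatmap(timestamps, rows=50):
    """Count samples per (column, row) cell.

    column = whole seconds since the first sample (x axis)
    row    = offset within the second, split into `rows` buckets (y axis)
    """
    if not timestamps:
        return Counter()
    start = int(min(timestamps))
    cells = Counter()
    for ts in timestamps:
        column = int(ts) - start
        row = min(int((ts - int(ts)) * rows), rows - 1)
        cells[(column, row)] += 1
    return cells


def select(timestamps, begin, end):
    """Return the samples inside a highlighted time range [begin, end)."""
    return [ts for ts in timestamps if begin <= ts < end]
```

Cells with a high count would be drawn in a darker colour, so bursts and
periodic patterns show up as visible structure rather than being
averaged away.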

Or put it this way: If I wasn't a Netflix engineer, flame scope might
be my commercial startup and you'd be paying $30/instance/month to use
it. :-) Instead it's free: https://github.com/Netflix/flamescope

Brendan


On Sun, Sep 5, 2021 at 4:20 PM Mark Hansen <mark@markhansen.co.nz> wrote:
>
> Hi perf-toolers,
>
> I wrote a quick literature-review of profiler user interfaces
> available for analysing the output of Linux perf:
> https://www.markhansen.co.nz/profiler-uis/
>
> I couldn't find a similar list of profiler UIs online. Hopefully this
> can help people find the profiler UI that's right for them.
>
> I'd appreciate any feedback; in particular: did I miss any great perf UIs?
>
> Thank you for all your work on perf,
> Mark


* Re: A list of visual Profiler UIs for linux perf
  2021-09-05  6:19 A list of visual Profiler UIs for linux perf Mark Hansen
  2021-09-05  6:50 ` Brendan Gregg
@ 2021-09-05 20:33 ` Viktor Rosendahl
  1 sibling, 0 replies; 6+ messages in thread
From: Viktor Rosendahl @ 2021-09-05 20:33 UTC
  To: Mark Hansen, linux-perf-users

Hi Mark,

On 9/5/21 8:19 AM, Mark Hansen wrote:
> Hi perf-toolers,
>
> I wrote a quick literature-review of profiler user interfaces
> available for analysing the output of Linux perf:
> https://www.markhansen.co.nz/profiler-uis/
>
> I couldn't find a similar list of profiler UIs online. Hopefully this
> can help people find the profiler UI that's right for them.
>
> I'd appreciate any feedback; in particular: did I miss any great perf UIs?

I am not sure if it qualifies as a great perf UI, but there is traceshark:

https://github.com/cunctator/traceshark

It's a visualizer with a focus on visualizing things like scheduling and 
wakeup events. I guess it's mostly useful to see what a task has been 
waiting for but it's also possible to see which tasks are burning CPU 
time. It doesn't support perf.data files; it reads files generated with
"perf script -f".

It has the possibility of filtering and saving the filtered output, so 
it's possible to filter on various things, for example a certain task, 
or set of tasks, for a certain time window. It's then possible to save 
the filtered cycles events and use Brendan Gregg's flamegraph tools to 
generate a flamegraph.
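
As a rough illustration of that kind of filtering (this is not
traceshark's code), the sketch below keeps only the samples of chosen
tasks inside a time window from perf-script-style text. The header
layout it expects, "comm pid [cpu] time:", and all names in it are
assumptions; real output depends on the fields that were selected.

```python
import re

# Assumed sample header: optional padding, comm, pid (or pid/tid),
# optional [cpu], then a timestamp followed by a colon.
HEADER = re.compile(
    r"^\s*(?P<comm>\S.*?)\s+(?P<pid>\d+)(?:/\d+)?\s+"
    r"(?:\[\d+\]\s+)?(?P<time>\d+\.\d+):"
)


def filter_samples(text, comms, begin, end):
    """Keep blank-line-separated sample blocks whose task name is in
    `comms` and whose timestamp falls in [begin, end)."""
    kept = []
    for block in text.strip().split("\n\n"):
        match = HEADER.match(block)
        if not match:
            continue  # not a sample header we recognise
        if match["comm"] in comms and begin <= float(match["time"]) < end:
            kept.append(block)
    return "\n\n".join(kept) + ("\n" if kept else "")
```

The filtered text keeps the original sample blocks unchanged, so it can
be passed on to whatever stack-folding step normally follows.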

In case you decide to try it out, I recommend building it from source, 
since the binary packages in Debian/Ubuntu don't have OpenGL support. 
For some older versions (older than what is currently in
Debian/testing), there is also an OpenGL configuration mismatch between
the application and a library that makes scrolling and zooming
painfully slow.

best regards,

Viktor




* Re: A list of visual Profiler UIs for linux perf
  2021-09-05  6:50 ` Brendan Gregg
@ 2021-09-08 19:13   ` Stephen Brennan
  2021-09-09  2:15     ` Brendan Gregg
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Brennan @ 2021-09-08 19:13 UTC
  To: Brendan Gregg, Mark Hansen, Stephen Brennan; +Cc: linux-perf-use.

Hi Mark & Brendan,

Thanks for this thread - it's very useful.

Firefox Profiler is news to me and looks exciting. However, I can't see 
any clear documentation on their part that the tool is client-side only. 
I can't always put internal flamegraphs into a web form on the 
assumption that they won't be uploaded somewhere. Do you know if there's 
anything explicit that says data won't be shared? (unless I explicitly 
upload it to create a link)

Flamescope is quite exciting as well! I can see how the time dimension 
can be incredibly useful. Brendan, I had a couple questions regarding it:
1) Is the box color scale in terms of "number of samples in that time 
interval"? If so, it would only really be useful for cpu-cycles or 
instructions, correct? Something like the cpu-clock which tries to 
regularly sample at a set frequency would just look monochrome?

2) I'm curious if you've considered directly using perf.data in 
Flamescope, rather than perf.script? I've recently discovered the
"--symfs" and "--kallsyms" options for perf. By using perf buildid-list, 
you can identify all DSOs, capture their symbol tables, and create a 
minimal bundle of files to allow the perf.data to be read with useful 
symbols on any system. Since perf.data contains more information, 
usually with less disk space, I've started taking this approach to make 
capturing, transferring, and analyzing larger recordings (especially 
from customers) easier as well as more flexible and efficient. All the 
same analysis can be done via the Python engine in perf-script, without 
need to worry about text parsing.
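
To make the last point concrete, here is a small sketch of a handler
for the Python engine in perf-script. It folds each sample's callchain
into a "comm;root;...;leaf" line with a count. The file name fold.py
and the helper fold_callchain are hypothetical, and the code assumes
callchain entries arrive leaf first with an optional "sym" dictionary.

```python
from collections import Counter

folded = Counter()


def fold_callchain(comm, callchain):
    """Build 'comm;root;...;leaf' from a callchain given leaf first."""
    frames = []
    for entry in callchain:
        sym = entry.get("sym")
        # Frames without a resolved symbol are kept as a placeholder.
        frames.append(sym["name"] if sym and "name" in sym else "[unknown]")
    return ";".join([comm] + frames[::-1])


def process_event(param_dict):
    """Handler perf-script calls once per sample."""
    key = fold_callchain(param_dict["comm"], param_dict.get("callchain") or [])
    folded[key] += 1


def trace_end():
    """Handler perf-script calls after the last sample."""
    for stack, count in sorted(folded.items()):
        print(f"{stack} {count}")
```

It could be run along the lines of
"perf script -i perf.data --symfs ./bundle --kallsyms ./bundle/kallsyms -s fold.py",
where the bundle paths are placeholders for wherever the captured
symbol files were unpacked.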

Thanks,
Stephen

On 9/4/21 11:50 PM, Brendan Gregg wrote:
> Thanks Mark,
> 
> Nice to see the adoption of flame graphs everywhere.
> 
> I'd encourage people not to overlook Flame Scope. I created this
> visualization to address problems with flame graphs: how they can hide
> variation, perturbations, and interval patterns. These are all
> illustrated by the flame scope subsecond offset heat map, and then you
> highlight detail and get the flame graph for those samples.
> 
> Or put it this way: If I wasn't a Netflix engineer, flame scope might
> be my commercial startup and you'd be paying $30/instance/month to use
> it. :-) Instead it's free: https://github.com/Netflix/flamescope
> 
> Brendan
> 
> 
> On Sun, Sep 5, 2021 at 4:20 PM Mark Hansen <mark@markhansen.co.nz> wrote:
>>
>> Hi perf-toolers,
>>
>> I wrote a quick literature-review of profiler user interfaces
>> available for analysing the output of Linux perf:
>> https://www.markhansen.co.nz/profiler-uis/
>>
>> I couldn't find a similar list of profiler UIs online. Hopefully this
>> can help people find the profiler UI that's right for them.
>>
>> I'd appreciate any feedback; in particular: did I miss any great perf UIs?
>>
>> Thank you for all your work on perf,
>> Mark



* Re: A list of visual Profiler UIs for linux perf
  2021-09-08 19:13   ` Stephen Brennan
@ 2021-09-09  2:15     ` Brendan Gregg
  2021-09-09 18:45       ` Stephen Brennan
  0 siblings, 1 reply; 6+ messages in thread
From: Brendan Gregg @ 2021-09-09  2:15 UTC
  To: Stephen Brennan; +Cc: Mark Hansen, linux-perf-use.

G'Day Stephen,

On Thu, Sep 9, 2021 at 5:13 AM Stephen Brennan
<stephen.s.brennan@oracle.com> wrote:
>
> Hi Mark & Brendan,
>
> Thanks for this thread - it's very useful.
>
> Firefox Profiler is news to me and looks exciting. However, I can't see
> any clear documentation on their part that the tool is client-side only.
> I can't always put internal flamegraphs into a web form on the
> assumption that they won't be uploaded somewhere. Do you know if there's
> anything explicit that says data won't be shared? (unless I explicitly
> upload it to create a link)
>
> Flamescope is quite exciting as well! I can see how the time dimension
> can be incredibly useful. Brendan, I had a couple questions regarding it:
> 1) Is the box color scale in terms of "number of samples in that time
> interval"? If so, it would only really be useful for cpu-cycles or
> instructions, correct? Something like the cpu-clock which tries to
> regularly sample at a set frequency would just look monochrome?

Exclude idle stacks then cpu-cycles works. Most of our samples are
cpu-cycles based (only thing available in most of EC2). FlameScope
should already filter it:

app/perf/regexp.py:idle_stack = re.compile("(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt|xen_hypercall_sched_op|xen_hypercall_vcpu_op)")
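
A minimal sketch of how a pattern like this can be used to exclude idle
stacks. The regular expression is the one quoted above; the function
drop_idle and the list-of-frames representation are assumptions for
illustration and not necessarily how FlameScope applies it.

```python
import re

# Pattern as quoted from app/perf/regexp.py in the message above.
idle_stack = re.compile(
    "(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt"
    "|xen_hypercall_sched_op|xen_hypercall_vcpu_op)"
)


def drop_idle(stacks):
    """Keep only stacks in which no frame matches the idle pattern.

    `stacks` is a list of stacks, each a list of frame-name strings.
    """
    return [
        stack
        for stack in stacks
        if not any(idle_stack.search(frame) for frame in stack)
    ]
```

With idle samples removed, the number of remaining samples per interval
tracks how busy the CPUs were, which is what gives the heat map its
colour variation.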

I've also used it for other non-CPU events including off-CPU spans by
adapting it to sample equivalents.

>
> 2) I'm curious if you've considered directly using perf.data in
> Flamescope, rather than perf.script? I've recently discovered the
> "--symfs" and "--kallsyms" options for perf. By using perf buildid-list,
> you can identify all DSOs, capture their symbol tables, and create a
> minimal bundle of files to allow the perf.data to be read with useful
> symbols on any system. Since perf.data contains more information,
> usually with less disk space, I've started taking this approach to make
> capturing, transferring, and analyzing larger recordings (especially
> from customers) easier as well as more flexible and efficient. All the
> same analysis can be done via the Python engine in perf-script, without
> need to worry about text parsing.

We do gzip the perf script outputs. Just checking the README, I should
probably change 'perf script --header' to use -F to specify the
fields, to make it more future proof.
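
For illustration only, such an invocation might be assembled as below.
The field list is an assumption, not taken from the FlameScope README;
the point is just that naming the fields pins the output layout instead
of relying on perf's defaults.

```python
import shlex

# Assumed field selection; adjust to whatever the consumer actually parses.
FIELDS = ["comm", "pid", "tid", "cpu", "time", "event", "ip", "sym", "dso"]


def perf_script_command(data_file="perf.data", fields=FIELDS):
    """Return the argv list for a perf script run with explicit fields."""
    return ["perf", "script", "--header", "-i", data_file, "-F", ",".join(fields)]


if __name__ == "__main__":
    # Show the command line rather than running perf.
    print(shlex.join(perf_script_command()))
```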

I haven't explored the buildid-list path since we have Java apps with
massive symbol tables that can be 100s of Mbytes of text, and other
binaries that use a mix of ELF symbol tables and DWARF debuginfo. I've
assumed this will be too big to include, but haven't tried yet. Maybe
it's better suited for some apps with smaller symbol tables?

Brendan


* Re: A list of visual Profiler UIs for linux perf
  2021-09-09  2:15     ` Brendan Gregg
@ 2021-09-09 18:45       ` Stephen Brennan
  0 siblings, 0 replies; 6+ messages in thread
From: Stephen Brennan @ 2021-09-09 18:45 UTC
  To: Brendan Gregg; +Cc: Mark Hansen, linux-perf-use.

Brendan Gregg <brendan.d.gregg@gmail.com> writes:
> G'Day Stephen,
>
> On Thu, Sep 9, 2021 at 5:13 AM Stephen Brennan
> <stephen.s.brennan@oracle.com> wrote:
>>
>> Hi Mark & Brendan,
>>
>> Thanks for this thread - it's very useful.
>>
>> Firefox Profiler is news to me and looks exciting. However, I can't see
>> any clear documentation on their part that the tool is client-side only.
>> I can't always put internal flamegraphs into a web form on the
>> assumption that they won't be uploaded somewhere. Do you know if there's
>> anything explicit that says data won't be shared? (unless I explicitly
>> upload it to create a link)
>>
>> Flamescope is quite exciting as well! I can see how the time dimension
>> can be incredibly useful. Brendan, I had a couple questions regarding it:
>> 1) Is the box color scale in terms of "number of samples in that time
>> interval"? If so, it would only really be useful for cpu-cycles or
>> instructions, correct? Something like the cpu-clock which tries to
>> regularly sample at a set frequency would just look monochrome?
>
> Exclude idle stacks then cpu-cycles works. Most of our samples are
> cpu-cycles based (only thing available in most of EC2). FlameScope
> should already filter it:
>
> app/perf/regexp.py:idle_stack = re.compile("(cpuidle|cpu_idle|cpu_bringup_and_idle|native_safe_halt|xen_hypercall_sched_op|xen_hypercall_vcpu_op)")

Ah that makes sense, I always visually ignored the idle stacks, to the
point that I forgot they existed in most of these profiles.

Looking at idle stacks actually always gives me grief, because when I
see them, I feel compelled to compare them to the %idle time accounted
by the kernel. I used to have this naive hope/belief that a `perf record
-e cycles -F 1000` would give me exactly 1000 samples each second, and
so I could compare the percentage of idle stacks with the %idle time.
But due to frequency scaling during idle, that's usually not the case.
I've tried looking at cpu-clock (which has its own downsides, like
firing in an IRQ context rather than NMI) to get a consistent frequency.
This works but isn't great, since I like the benefits of an NMI event.

Flamescope seems to make this frustration (why aren't my samples at an
exact rate???) less of an issue, since you can see the sample count over
time. You can see the idle periods and the times of heavy utilization,
so it matters less whether the sample frequency is clock-like in
precision.
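
A small sketch of the check described above: count the samples that
landed in each second and compare against the rate that was requested.
The function names are hypothetical, and nominal_hz is whatever total
rate the caller expected (for example 1000 for a single CPU recorded
with -F 1000).

```python
from collections import Counter


def samples_per_second(timestamps):
    """Count samples in each whole second."""
    return Counter(int(ts) for ts in timestamps)


def rate_report(timestamps, nominal_hz):
    """Return {second: achieved samples / expected samples}."""
    counts = samples_per_second(timestamps)
    return {second: n / nominal_hz for second, n in sorted(counts.items())}
```

A ratio well below 1.0 for a given second means fewer samples arrived
than the requested rate implies, so the share of idle stacks cannot be
read directly as %idle for that interval.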

>
> I've also used it for other non-CPU events including off-CPU spans by
> adapting it to sample equivalents.
>
>>
>> 2) I'm curious if you've considered directly using perf.data in
>> Flamescope, rather than perf.script? I've recently discovered the
>> "--symfs" and "--kallsyms" options for perf. By using perf buildid-list,
>> you can identify all DSOs, capture their symbol tables, and create a
>> minimal bundle of files to allow the perf.data to be read with useful
>> symbols on any system. Since perf.data contains more information,
>> usually with less disk space, I've started taking this approach to make
>> capturing, transferring, and analyzing larger recordings (especially
>> from customers) easier as well as more flexible and efficient. All the
>> same analysis can be done via the Python engine in perf-script, without
>> need to worry about text parsing.
>
> We do gzip the perf script outputs. Just checking the README, I should
> probably change 'perf script --header' to use -F to specify the
> fields, to make it more future proof.
>
> I haven't explored the buildid-list path since we have Java apps with
> massive symbol tables that can be 100s of Mbytes of text, and other
> binaries that use a mix of ELF symbol tables and DWARF debuginfo. I've
> assumed this will be too big to include, but haven't tried yet. Maybe
> it's better suited for some apps with smaller symbol tables?

Got it! My main use case is debugging kernel bugs from external
customers, who wouldn't want to share their application symbols anyway.
We frequently make do with just the kernel symbols and application
names, but adding in symbols from a few system utilities and libraries
can be very useful and only takes a few MiBs usually.

I've wanted a perf.data format which includes all symbols resolved for a
while now, and maybe some day I'll know enough perf innards to implement
it. The perf.data + symbol tables in a tarball has worked alright. But
my ideal would be a way to have perf (1) do dwarf stack walking and
symbol table lookups, and (2) store that data back into the perf.data
file. Then analysis could be reliably done on another machine, and it
would include all the data from the original recording. (For example,
I've had missing events due to a PERF_RECORD_THROTTLE event coming in,
which perf.script files show me.)
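
The "perf.data + symbol tables in a tarball" approach could be sketched
roughly as follows. This is not my actual tooling: the function names
and the symfs/ layout are assumptions. It takes the text printed by
perf buildid-list, assumed to be one "build-id path" pair per line, and
packs every listed file that exists on disk next to the perf.data file.

```python
import os
import tarfile


def parse_buildid_list(output):
    """Parse lines of '<build-id> <path>' into (build_id, path) pairs."""
    pairs = []
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            pairs.append((parts[0], parts[1].strip()))
    return pairs


def bundle(perf_data, buildid_output, tar_path):
    """Write perf_data plus each listed DSO found on disk to a tarball.

    DSOs are stored under symfs/ with their original path, so the
    unpacked directory can be handed to --symfs. Returns the paths added.
    """
    added = []
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(perf_data, arcname="perf.data")
        for _build_id, path in parse_buildid_list(buildid_output):
            # Skips pseudo entries such as [kernel.kallsyms] and missing files.
            if os.path.isfile(path):
                tar.add(path, arcname=os.path.join("symfs", path.lstrip("/")))
                added.append(path)
    return added
```

Kernel symbols are not covered by this sketch; a copy of the kallsyms
file would have to be captured separately for use with --kallsyms.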

Anyhow, I'm probably over-optimizing at this point. Thanks for sharing
your motivation and use case!

Stephen

> Brendan

