* Reordering the thread output in perf trace --summary
@ 2016-05-04 9:02 Milian Wolff
2016-05-04 9:51 ` Milian Wolff
0 siblings, 1 reply; 8+ messages in thread
From: Milian Wolff @ 2016-05-04 9:02 UTC (permalink / raw)
To: linux-perf-users
Hey all,
when using `perf trace --summary` on a (badly designed) user application that
creates tons of threads, the usually interesting overall summary is drowned by
the per-thread summary output. I.e.:
perf trace --summary lab_mandelbrot_concurrent |& grep events
lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
Thread (pooled) (19501), 114 events, 0.3%, 0.000 msec
lab_mandelbrot_ (19502), 88 events, 0.2%, 0.000 msec
Thread (pooled) (19503), 106 events, 0.3%, 0.000 msec
Thread (pooled) (19504), 101 events, 0.3%, 0.000 msec
Thread (pooled) (19505), 102 events, 0.3%, 0.000 msec
... continued for a total of 163 lines
usually, I forget to pipe the output of `perf trace --summary` into a file and
then have to rerun the command, as the total output (2643 lines!) easily
exceeds my scrollback buffer.
I would like to propose sorting the output in ascending order of total events,
so that the most interesting entries are shown at the bottom of the output on
the CLI. I.e. for the output above it should be something like
perf trace --summary lab_mandelbrot_concurrent |& grep events
... continued for a total of 163 lines
lab_mandelbrot_ (19502), 88 events, 0.2%, 0.000 msec
Thread (pooled) (19504), 101 events, 0.3%, 0.000 msec
Thread (pooled) (19505), 102 events, 0.3%, 0.000 msec
Thread (pooled) (19503), 106 events, 0.3%, 0.000 msec
Thread (pooled) (19501), 114 events, 0.3%, 0.000 msec
QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
If this is acceptable to you, can someone please tell me how to do such a
seemingly simple task in C? In C++ I'd expect to add a simple std::sort
somewhere, but in perf's C...? My current idea would be to run
machine__for_each_thread and store the event count + thread pointer in another
temporary buffer, which I then qsort and finally iterate over. Does that sound
OK, or how would you approach this task?
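For illustration, the temporary-buffer idea described above could look roughly
like the sketch below. It is only a sketch under assumptions: `struct
thread_stub` is a hypothetical stand-in for perf's `struct thread`, and `struct
sort_entry`, `sort_entry__cmp` and `sort_entries` are hypothetical names that do
not come from the perf sources. Filling the buffer from
machine__for_each_thread is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for perf's struct thread (illustration only). */
struct thread_stub {
	const char *comm;
	int tid;
};

/* Hypothetical temporary buffer element: event count + thread pointer. */
struct sort_entry {
	unsigned long nr_events;
	const struct thread_stub *thread;
};

/* qsort() comparator: ascending event count, ties broken by tid. */
static int sort_entry__cmp(const void *a, const void *b)
{
	const struct sort_entry *ea = a, *eb = b;

	if (ea->nr_events != eb->nr_events)
		return ea->nr_events < eb->nr_events ? -1 : 1;
	return (ea->thread->tid > eb->thread->tid) -
	       (ea->thread->tid < eb->thread->tid);
}

/* Sort the buffer so that the busiest thread ends up last. */
static void sort_entries(struct sort_entry *entries, size_t n)
{
	assert(entries != NULL);
	qsort(entries, n, sizeof(*entries), sort_entry__cmp);
}
```

After sorting, iterating the buffer front to back prints the least active
threads first and the most active thread at the bottom.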
Thanks
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
* Re: Reordering the thread output in perf trace --summary
2016-05-04 9:02 Reordering the thread output in perf trace --summary Milian Wolff
@ 2016-05-04 9:51 ` Milian Wolff
2016-05-04 21:41 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Milian Wolff @ 2016-05-04 9:51 UTC (permalink / raw)
To: linux-perf-users
On Wednesday, May 4, 2016 11:02:12 AM CEST Milian Wolff wrote:
> Hey all,
>
> when using `perf trace --summary` on a (badly designed) user application
> that creates tons of threads, the usually interesting overall summary is
> drowned by the per-thread summary output. I.e.:
>
> perf trace --summary lab_mandelbrot_concurrent |& grep events
> lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
> QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
> QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
> Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
> Thread (pooled) (19501), 114 events, 0.3%, 0.000 msec
> lab_mandelbrot_ (19502), 88 events, 0.2%, 0.000 msec
> Thread (pooled) (19503), 106 events, 0.3%, 0.000 msec
> Thread (pooled) (19504), 101 events, 0.3%, 0.000 msec
> Thread (pooled) (19505), 102 events, 0.3%, 0.000 msec
> ... continued for a total of 163 lines
>
> usually, I forget to pipe the output of `perf trace --summary` into a file
> and then have to rerun the command, as the total output (2643 lines!)
> easily exceeds my scrollback buffer.
>
> I would like to propose sorting the output in ascending order of total
> events, so that the most interesting entries are shown at the bottom of the
> output on the CLI. I.e. for the output above it should be something like
>
> perf trace --summary lab_mandelbrot_concurrent |& grep events
> ... continued for a total of 163 lines
> lab_mandelbrot_ (19502), 88 events, 0.2%, 0.000 msec
> Thread (pooled) (19504), 101 events, 0.3%, 0.000 msec
> Thread (pooled) (19505), 102 events, 0.3%, 0.000 msec
> Thread (pooled) (19503), 106 events, 0.3%, 0.000 msec
> Thread (pooled) (19501), 114 events, 0.3%, 0.000 msec
> QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
> QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
> Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
> lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
>
> If this is acceptable to you, can someone please tell me how to do such a
> seemingly simple task in C? In C++ I'd expect to add a simple std::sort
> somewhere, but in perf's C...? My current idea would be to run
> machine__for_each_thread and store the event count + thread pointer in
> another temporary buffer, which I then qsort and finally iterate over. Does
> that sound OK, or how would you approach this task?
While at it, can we similarly reorder the output of the per-thread syscall
list? At the moment it is e.g.:
syscall            calls     total       min       avg       max    stddev
                            (msec)    (msec)    (msec)    (msec)       (%)
--------------- -------- --------- --------- --------- ---------    ------
read                 166     0.332     0.001     0.002     0.031    10.22%
write                 13     0.038     0.002     0.003     0.006    12.41%
open                 448     1.189     0.001     0.003     0.020     1.94%
close                185     0.270     0.001     0.001     0.022     7.78%
stat                 507     0.823     0.001     0.002     0.009     2.34%
fstat                215     0.211     0.001     0.001     0.001     1.00%
lstat                317     0.469     0.001     0.001     0.003     1.42%
poll                 176     0.534     0.001     0.003     0.169    32.22%
lseek                  1     0.001     0.001     0.001     0.001     0.00%
mmap                 384     1.184     0.002     0.003     0.006     1.20%
mprotect             238     0.949     0.001     0.004     0.013     1.96%
munmap                42     0.501     0.002     0.012     0.107    27.58%
brk                   12     0.042     0.001     0.004     0.013    26.16%
rt_sigaction           2     0.002     0.001     0.001     0.001    12.90%
rt_sigprocmask         1     0.001     0.001     0.001     0.001     0.00%
writev               165     0.387     0.002     0.002     0.005     1.57%
access               156     0.250     0.001     0.002     0.011     4.88%
socket                 2     0.012     0.005     0.006     0.007    12.05%
connect                2     0.014     0.005     0.007     0.009    25.14%
recvfrom               4     0.014     0.002     0.003     0.008    45.24%
recvmsg               16     0.029     0.001     0.002     0.004    12.30%
shutdown               1     0.004     0.004     0.004     0.004     0.00%
getsockname            1     0.001     0.001     0.001     0.001     0.00%
getpeername            1     0.002     0.002     0.002     0.002     0.00%
getsockopt             1     0.002     0.002     0.002     0.002     0.00%
clone                 34     7.506     0.207     0.221     0.295     1.49%
uname                  2     0.003     0.001     0.001     0.001     4.12%
fcntl                 32     0.032     0.001     0.001     0.001     2.53%
getdents              16     0.057     0.001     0.004     0.007    15.32%
readlink              11     0.020     0.001     0.002     0.005    19.02%
getrlimit              1     0.001     0.001     0.001     0.001     0.00%
getuid                 2     0.002     0.001     0.001     0.001    15.24%
getgid                 1     0.001     0.001     0.001     0.001     0.00%
geteuid                2     0.002     0.001     0.001     0.001    23.66%
getegid                1     0.001     0.001     0.001     0.001     0.00%
statfs                 8     0.020     0.002     0.002     0.004    10.60%
arch_prctl             1     0.001     0.001     0.001     0.001     0.00%
futex                489  1466.240     0.001     2.998  1447.978    98.75%
set_tid_address        1     0.001     0.001     0.001     0.001     0.00%
clock_getres           1     0.001     0.001     0.001     0.001     0.00%
set_robust_list        1     0.001     0.001     0.001     0.001     0.00%
This output is not sorted by syscall name, nor by number of calls or total or
anything... Could we maybe sort it by total msecs by default? Or maybe by
syscall name and then offer the user a way to sort it by calls/total msecs
instead?
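As a rough illustration of the two options raised above (a default order plus a
user-selectable key), here is a minimal sketch. Everything in it is an
assumption made for the example: `struct syscall_row`, `enum sort_key` and
`sort_rows` are hypothetical names and not part of perf, and the sketch sorts a
plain array with qsort rather than whatever container perf trace uses
internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical row of the per-thread syscall table (illustration only). */
struct syscall_row {
	const char *name;
	unsigned long calls;
	double total_msec;
};

/* Hypothetical sort keys a user could choose from. */
enum sort_key { SORT_BY_NAME, SORT_BY_CALLS, SORT_BY_TOTAL };

/* Ascending by syscall name. */
static int cmp_name(const void *a, const void *b)
{
	const struct syscall_row *ra = a, *rb = b;

	return strcmp(ra->name, rb->name);
}

/* Descending by number of calls. */
static int cmp_calls(const void *a, const void *b)
{
	const struct syscall_row *ra = a, *rb = b;

	return (rb->calls > ra->calls) - (rb->calls < ra->calls);
}

/* Descending by total time spent in the syscall. */
static int cmp_total(const void *a, const void *b)
{
	const struct syscall_row *ra = a, *rb = b;

	return (rb->total_msec > ra->total_msec) -
	       (rb->total_msec < ra->total_msec);
}

/* Sort the rows by the requested key. */
static void sort_rows(struct syscall_row *rows, size_t n, enum sort_key key)
{
	static int (*const cmp[])(const void *, const void *) = {
		[SORT_BY_NAME]  = cmp_name,
		[SORT_BY_CALLS] = cmp_calls,
		[SORT_BY_TOTAL] = cmp_total,
	};

	assert(rows != NULL);
	qsort(rows, n, sizeof(*rows), cmp[key]);
}
```

With SORT_BY_TOTAL as the default, a row such as futex from the table above
would move to the top of the list.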
Thanks
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
* Re: Reordering the thread output in perf trace --summary
2016-05-04 9:51 ` Milian Wolff
@ 2016-05-04 21:41 ` Arnaldo Carvalho de Melo
2016-05-05 16:04 ` [DONE] " Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-04 21:41 UTC (permalink / raw)
To: Milian Wolff; +Cc: linux-perf-users
Em Wed, May 04, 2016 at 11:51:04AM +0200, Milian Wolff escreveu:
> On Wednesday, May 4, 2016 11:02:12 AM CEST Milian Wolff wrote:
> > I would like to propose sorting the output in ascending order of total
> > events, so that the most interesting entries are shown at the bottom of the
> > output on the CLI. I.e. for the output above it should be something like
> > perf trace --summary lab_mandelbrot_concurrent |& grep events
> > ... continued for a total of 163 lines
> > lab_mandelbrot_ (19502), 88 events, 0.2%, 0.000 msec
> > Thread (pooled) (19504), 101 events, 0.3%, 0.000 msec
> > Thread (pooled) (19505), 102 events, 0.3%, 0.000 msec
> > Thread (pooled) (19503), 106 events, 0.3%, 0.000 msec
> > Thread (pooled) (19501), 114 events, 0.3%, 0.000 msec
> > QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
> > QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
> > Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
> > lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
> > If this is acceptable to you, can someone please tell me how to do such a
> > seemingly simple task in C? In C++ I'd expect to add a simple std::sort
> > somewhere, but in perf's C...? My current idea would be to run
> > machine__for_each_thread and store the event count + thread pointer in
> > another temporary buffer, which I then qsort and finally iterate over. Does
> > that sound OK, or how would you approach this task?
> While at it, can we similarly reorder the output of the per-thread syscall
> list? At the moment it is e.g.:
Take a look at my perf/core branch, I have it working there.
I'm in the process of experimenting with creating some kind of template
for resorting rb_trees, that will reduce the boilerplate while keeping
it following the principles described in Documentation/rbtree.txt.
Using it:
# trace -a -s sleep 1
<SNIP>
gnome-shell (2231), 148 events, 10.3%, 0.000 msec
syscall            calls     total       min       avg       max    stddev
                            (msec)    (msec)    (msec)    (msec)       (%)
--------------- -------- --------- --------- --------- ---------    ------
poll                  14     8.138     0.000     0.581     8.012    98.33%
ioctl                 17     0.096     0.001     0.006     0.054    54.34%
recvmsg               30     0.070     0.001     0.002     0.005     7.87%
writev                 6     0.032     0.004     0.005     0.006     5.43%
read                   4     0.010     0.002     0.003     0.003     9.83%
write                  3     0.006     0.002     0.002     0.002    13.11%

Xorg (1965), 150 events, 10.4%, 0.000 msec
syscall            calls     total       min       avg       max    stddev
                            (msec)    (msec)    (msec)    (msec)       (%)
--------------- -------- --------- --------- --------- ---------    ------
select                11   377.791     0.000    34.345   267.619    72.83%
writev                12     0.064     0.002     0.005     0.010    12.94%
ioctl                  3     0.059     0.005     0.020     0.041    55.30%
recvmsg               18     0.050     0.001     0.003     0.005    10.72%
setitimer             18     0.032     0.001     0.002     0.004    10.40%
rt_sigprocmask        10     0.014     0.001     0.001     0.004    20.81%
poll                   2     0.004     0.001     0.002     0.003    47.14%
read                   1     0.003     0.003     0.003     0.003     0.00%

qemu-system-x86 (10021), 272 events, 18.8%, 0.000 msec
syscall            calls     total       min       avg       max    stddev
                            (msec)    (msec)    (msec)    (msec)       (%)
--------------- -------- --------- --------- --------- ---------    ------
poll                 102   989.336     0.000     9.699    30.118    14.38%
read                  34     0.200     0.003     0.006     0.014     7.01%

qemu-system-x86 (9931), 464 events, 32.2%, 0.000 msec
syscall            calls     total       min       avg       max    stddev
                            (msec)    (msec)    (msec)    (msec)       (%)
--------------- -------- --------- --------- --------- ---------    ------
ppoll                 96   982.288     0.000    10.232    30.035    12.59%
write                 34     0.368     0.003     0.011     0.026     5.80%
ioctl                102     0.290     0.001     0.003     0.010     4.74%
[root@jouet ~]#
Gotta check why the total time per thread is zeroed tho...
- Arnaldo
* [DONE] Re: Reordering the thread output in perf trace --summary
2016-05-04 21:41 ` Arnaldo Carvalho de Melo
@ 2016-05-05 16:04 ` Arnaldo Carvalho de Melo
2016-05-09 8:28 ` Milian Wolff
0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-05 16:04 UTC (permalink / raw)
To: Milian Wolff; +Cc: David Ahern, linux-perf-users
Em Wed, May 04, 2016 at 06:41:23PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, May 04, 2016 at 11:51:04AM +0200, Milian Wolff escreveu:
> > On Wednesday, May 4, 2016 11:02:12 AM CEST Milian Wolff wrote:
> > > I would like to propose sorting the output in ascending order of total
> > > events, so that the most interesting entries are shown at the bottom of
> > > the output on the CLI. I.e. for the output above it should be something
> > > like
> > > perf trace --summary lab_mandelbrot_concurrent |& grep events
> > > ... continued for a total of 163 lines
> > > QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
> > > QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
> > > Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
> > > lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
> > > If this is acceptable to you, can someone please tell me how to do such a
> > > seemingly simple task in C? In C++ I'd expect to add a simple std::sort
> > > somewhere, but in perf's C...? My current idea would be to run
> > > machine__for_each_thread and store the event count + thread pointer in
> > > another temporary buffer, which I then qsort and finally iterate over. Does
> > > that sound OK, or how would you approach this task?
> > While at it, can we similarly reorder the output of the per-thread syscall
> > list? At the moment it is e.g.:
> Take a look at my perf/core branch, I have it working there.
> I'm in the process of experimenting with creating some kind of template
> for resorting rb_trees, that will reduce the boilerplate while keeping
> it following the principles described in Documentation/rbtree.txt.
Ok, done, it got really small and it is easy to change the keys if we want to,
not dynamically tho as-is now, but that should be easy with offsetof 8-)
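To illustrate what choosing the key at run time with offsetof could look like,
here is a small sketch. It is not the rb_tree resort template from perf/core:
it uses qsort on a plain array, and `struct syscall_stats`, `sort_key_offset`
and `sort_stats_by` are hypothetical names invented for this example. It
assumes every sortable field is an unsigned long and keeps the key in a
file-scope variable, so it is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-syscall counters; every sortable field is unsigned long. */
struct syscall_stats {
	unsigned long calls;
	unsigned long total_usec;
	unsigned long max_usec;
};

/* Offset of the field currently used as sort key (not thread-safe). */
static size_t sort_key_offset;

/* Read the key field of one element through its byte offset. */
static unsigned long key_of(const void *elem)
{
	return *(const unsigned long *)((const char *)elem + sort_key_offset);
}

/* qsort() comparator: descending by the selected key. */
static int cmp_by_offset(const void *a, const void *b)
{
	unsigned long ka = key_of(a), kb = key_of(b);

	return (kb > ka) - (kb < ka);
}

/* Sort by whichever field the caller names with offsetof(). */
static void sort_stats_by(struct syscall_stats *v, size_t n, size_t key_offset)
{
	assert(v != NULL);
	assert(key_offset + sizeof(unsigned long) <= sizeof(*v));
	sort_key_offset = key_offset;
	qsort(v, n, sizeof(*v), cmp_by_offset);
}
```

A caller would pick the key with, for example,
`sort_stats_by(v, n, offsetof(struct syscall_stats, total_usec))`.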
Anyway, I'm satisfied and pushed to perf/core, now looking at why total
thread time is zeroed...
Please take a look and check if it works for you,
- Arnaldo
* Re: [DONE] Re: Reordering the thread output in perf trace --summary
2016-05-05 16:04 ` [DONE] " Arnaldo Carvalho de Melo
@ 2016-05-09 8:28 ` Milian Wolff
2016-05-09 16:25 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Milian Wolff @ 2016-05-09 8:28 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: David Ahern, linux-perf-users
On Thursday, May 5, 2016 1:04:02 PM CEST Arnaldo Carvalho de Melo wrote:
> Em Wed, May 04, 2016 at 06:41:23PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, May 04, 2016 at 11:51:04AM +0200, Milian Wolff escreveu:
> > > On Wednesday, May 4, 2016 11:02:12 AM CEST Milian Wolff wrote:
> > > > I would like to propose sorting the output in ascending order of total
> > > > events, so that the most interesting entries are shown at the bottom of
> > > > the output on the CLI. I.e. for the output above it should be something
> > > > like
> > > >
> > > > perf trace --summary lab_mandelbrot_concurrent |& grep events
> > > > ... continued for a total of 163 lines
> > > >
> > > > QDBusConnection (19499), 132 events, 0.4%, 0.000 msec
> > > > QXcbEventReader (19498), 1094 events, 3.0%, 0.000 msec
> > > > Thread (pooled) (19500), 1982 events, 5.5%, 0.000 msec
> > > > lab_mandelbrot_ (19497), 9246 events, 25.7%, 0.000 msec
> > > >
> > > > If this is acceptable to you, can someone please tell me how to do such
> > > > a seemingly simple task in C? In C++ I'd expect to add a simple
> > > > std::sort somewhere, but in perf's C...? My current idea would be to run
> > > > machine__for_each_thread and store the event count + thread pointer in
> > > > another temporary buffer, which I then qsort and finally iterate over.
> > > > Does that sound OK, or how would you approach this task?
> > >
> > > While at it, can we similarly reorder the output of the per-thread
> > > syscall list? At the moment it is e.g.:
> >
> > Take a look at my perf/core branch, I have it working there.
> >
> > I'm in the process of experimenting with creating some kind of template
> > for resorting rb_trees, that will reduce the boilerplate while keeping
> > it following the principles described in Documentation/rbtree.txt.
>
> Ok, done, it got really small and it is easy to change the keys if we want to,
> not dynamically tho as-is now, but that should be easy with offsetof 8-)
>
> Anyway, I'm satisfied and pushed to perf/core, now looking at why total
> thread time is zeroed...
>
> Please take a look and check if it works for you,
Great Arnaldo, thanks a lot! A pleasant surprise to come home from a sunny
weekend and see this gem waiting for me :)
I played around with it, and it does work as advertised. Great work!
Cheers
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
* Re: [DONE] Re: Reordering the thread output in perf trace --summary
2016-05-09 8:28 ` Milian Wolff
@ 2016-05-09 16:25 ` Arnaldo Carvalho de Melo
2016-05-09 18:03 ` Milian Wolff
0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-09 16:25 UTC (permalink / raw)
To: Milian Wolff; +Cc: David Ahern, linux-perf-users
Em Mon, May 09, 2016 at 10:28:01AM +0200, Milian Wolff escreveu:
> On Thursday, May 5, 2016 1:04:02 PM CEST Arnaldo Carvalho de Melo wrote:
> > Em Wed, May 04, 2016 at 06:41:23PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Wed, May 04, 2016 at 11:51:04AM +0200, Milian Wolff escreveu:
> > > > On Wednesday, May 4, 2016 11:02:12 AM CEST Milian Wolff wrote:
> > > > While at it, can we similarly reorder the output of the per-thread
> > > > syscall list? At the moment it is e.g.:
> > > Take a look at my perf/core branch, I have it working there.
> > > I'm in the process of experimenting with creating some kind of template
> > > for resorting rb_trees, that will reduce the boilerplate while keeping
> > > it following the principles described in Documentation/rbtree.txt.
> > Ok, done, it got really small and it is easy to change the keys if we want
> > to, not dynamically tho as-is now, but that should be easy with offsetof 8-)
> > Anyway, I'm satisfied and pushed to perf/core, now looking at why total
> > thread time is zeroed...
> > Please take a look and check if it works for you,
> Great Arnaldo, thanks a lot! A pleasant surprise to come home from a sunny
> weekend and see this gem waiting for me :)
>
> I played around with it, and it does work as advertised. Great work!
Glad you liked it :-)
I'll probably make it use the sched:sched_stat_runtime data as the sort
key for threads if --stat is used, what do you think?
- Arnaldo
* Re: [DONE] Re: Reordering the thread output in perf trace --summary
2016-05-09 16:25 ` Arnaldo Carvalho de Melo
@ 2016-05-09 18:03 ` Milian Wolff
2016-05-09 20:12 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 8+ messages in thread
From: Milian Wolff @ 2016-05-09 18:03 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: David Ahern, linux-perf-users
On Monday, May 9, 2016 1:25:16 PM CEST Arnaldo Carvalho de Melo wrote:
> Em Mon, May 09, 2016 at 10:28:01AM +0200, Milian Wolff escreveu:
> > On Thursday, May 5, 2016 1:04:02 PM CEST Arnaldo Carvalho de Melo wrote:
> > > Em Wed, May 04, 2016 at 06:41:23PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Wed, May 04, 2016 at 11:51:04AM +0200, Milian Wolff escreveu:
> > > > > On Wednesday, May 4, 2016 11:02:12 AM CEST Milian Wolff wrote:
> > > > > While at it, can we similarly reorder the output of the per-thread
> > > > > syscall list? At the moment it is e.g.:
> > > >
> > > > Take a look at my perf/core branch, I have it working there.
> > > >
> > > > I'm in the process of experimenting with creating some kind of
> > > > template for resorting rb_trees, that will reduce the boilerplate while
> > > > keeping it following the principles described in
> > > > Documentation/rbtree.txt.
> > >
> > > Ok, done, it got really small and it is easy to change the keys if we want
> > > to, not dynamically tho as-is now, but that should be easy with offsetof 8-)
> > >
> > > Anyway, I'm satisfied and pushed to perf/core, now looking at why total
> > > thread time is zeroed...
> > >
> > > Please take a look and check if it works for you,
> >
> > Great Arnaldo, thanks a lot! A pleasant surprise to come home from a sunny
> > weekend and see this gem waiting for me :)
> >
> > I played around with it, and it does work as advertised. Great work!
>
> Glad you liked it :-)
>
> I'll probably make it use the sched:sched_stat_runtime data as the sort
> key for threads if --stat is used, what do you think?
You mean if `--sched` was used? I'm undecided on this. On one hand, the user
explicitly requests `--sched` so he probably is interested in it. On the other
hand, the total number of syscalls or wait time per thread may be more
interesting... If you use this to find contention issues e.g. then the threads
suffering most from contention issues will have a low runtime.
Personally, I think this is yet another situation where a proper GUI could
solve this problem nicely. A trivial tree view with the option to adapt the
sorting and aggregation as needed and a way to filter out certain syscalls
post-collection would be really nice to have. I'm very much looking forward to
having more time on my hands again to finally tackle this.
Bye
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
* Re: [DONE] Re: Reordering the thread output in perf trace --summary
2016-05-09 18:03 ` Milian Wolff
@ 2016-05-09 20:12 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-09 20:12 UTC (permalink / raw)
To: Milian Wolff; +Cc: David Ahern, linux-perf-users
Em Mon, May 09, 2016 at 08:03:30PM +0200, Milian Wolff escreveu:
> On Monday, May 9, 2016 1:25:16 PM CEST Arnaldo Carvalho de Melo wrote:
> > I'll probably make it use the sched:sched_stat_runtime data as the sort
> > key for threads if --stat is used, what do you think?
> You mean if `--sched` was used? I'm undecided on this. On one hand, the user
> explicitly requests `--sched` so he probably is interested in it. On the other
> hand, the total number of syscalls or wait time per thread may be more
> interesting... If you use this to find contention issues e.g. then the threads
> suffering most from contention issues will have a low runtime.
> Personally, I think this is yet another situation where a proper GUI could
> solve this problem nicely. A trivial tree view with the option to adapt the
> sorting and aggregation as needed and a way to filter out certain syscalls
> post-collection would be really nice to have. I'm very much looking forward to
> having more time on my hands again to finally tackle this.
Yeah, we could make that dynamic, the trace case is basically 'perf
top/report' working on two events at a time (enter/exit).
In the end I think the best way would be to get 'perf trace' to use the
hists browser like top and report :-)
I.e. we would be showing that 'perf trace --summary' in "real time", to
abuse that term a bit more :-)
We would be producing it and refreshing it over time.
A generic mechanism for matching up pairs (or more) of events and presenting
them together with things like callchains, like we do now specifically
for sys_enter/sys_exit syscall tracepoints, would be handy indeed.
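As a very rough sketch of the pairing idea, the code below matches an enter
event with the next exit event of the same thread and syscall and reports the
elapsed time. All names here (`struct event`, `struct pairer`,
`pairer__process`, `MAX_PENDING`) are hypothetical and only for illustration,
they are not existing perf infrastructure, and a fixed-size table is used
purely to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ev_type { EV_ENTER, EV_EXIT };

/* Hypothetical, simplified event record (illustration only). */
struct event {
	enum ev_type type;
	int tid;
	int syscall_id;
	uint64_t ts_nsec;
};

#define MAX_PENDING 64

/* One outstanding enter event waiting for its exit. */
struct pending {
	bool active;
	int tid;
	int syscall_id;
	uint64_t enter_ts;
};

struct pairer {
	struct pending slots[MAX_PENDING];
};

/*
 * Feed one event. Returns true and stores the elapsed time in *duration
 * when ev is the exit that completes a pending enter of the same thread
 * and syscall. Returns false in every other case.
 */
static bool pairer__process(struct pairer *p, const struct event *ev,
			    uint64_t *duration)
{
	struct pending *free_slot = NULL;

	assert(p != NULL && ev != NULL && duration != NULL);

	for (size_t i = 0; i < MAX_PENDING; i++) {
		struct pending *s = &p->slots[i];

		if (s->active && s->tid == ev->tid) {
			if (ev->type == EV_EXIT &&
			    s->syscall_id == ev->syscall_id) {
				*duration = ev->ts_nsec - s->enter_ts;
				s->active = false;
				return true;
			}
			if (ev->type == EV_ENTER) {
				/* A new enter replaces a stale pending one. */
				s->syscall_id = ev->syscall_id;
				s->enter_ts = ev->ts_nsec;
			}
			return false;
		}
		if (!s->active && free_slot == NULL)
			free_slot = s;
	}

	/* First enter seen for this thread: remember it if there is room. */
	if (ev->type == EV_ENTER && free_slot != NULL)
		*free_slot = (struct pending){ .active = true, .tid = ev->tid,
					       .syscall_id = ev->syscall_id,
					       .enter_ts = ev->ts_nsec };
	return false;
}
```

Exit events without a matching enter are simply ignored in this sketch; a
generic mechanism would also need a policy for those and for more than two
events per group.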
But feel encouraged to try a GUI to exercise your ideas, hopefully the
existing infrastructure can get readily used for that, let us know about
any change that you think would help with that.
- Arnaldo