linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* heads up/RFC: 'perf trace' using ordered_events
@ 2015-03-03 16:49 Arnaldo Carvalho de Melo
  2015-03-04  1:01 ` Namhyung Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-03 16:49 UTC (permalink / raw)
  To: David Ahern, Namhyung Kim, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker
  Cc: Linux Kernel Mailing List

Hi,

	Just a preview, but this is something David had mentioned at some
point, a major problem with 'perf trace' was that it wasn't using
'perf_session' event reordering mechanism, so I've been working on making it
use it, refactoring the ordered_events code so that it can be used by tools
that don't deal with perf.data files.

	I still have to investigate why there are so many sched_wakeup at the beginning,
but probably they are really being generated by the kernel and 'perf trace' has to make
this output more compact, perhaps noticing that a number of similar events appeared on
the stream and instead of writing N lines, print some prefix ([N*] perhaps) and then
the line.

	Also its keeping pointers to the ring buffer, probably we'll need to copy or
store the perf_sample somehow in the 'ordered_event' instance.

	But as this is touching things that Namhyung is touching as well, I thought about
sharing this stuff, its on my tree, branch tmp.perf/trace_ordered_events.

	Lunchtime, bbl!

- Arnaldo

[root@ssdandy ~]# trace --ev sched:* usleep 1
     0.000 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.001 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.002 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.002 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     0.004 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
     1.367 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.368 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.370 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.370 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.371 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
     1.615 (         ): sched:sched_process_exec:filename=/bin/usleep pid=4677 old_pid=4677)
     1.621 (         ): sched:sched_wakeup:comm=rcuop/3 pid=16 prio=120 success=1 target_cpu=005)
     1.640 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x64e000
     1.662 ( 0.005 ms): usleep/4677 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7ef000
     1.675 ( 0.005 ms): usleep/4677 access(filename: 0x7f3d1d5edbe0, mode: R                              ) = -1 ENOENT No such file or directory
     1.687 ( 0.004 ms): usleep/4677 open(filename: 0x7f3d1d5ec4d8, flags: CLOEXEC                         ) = 3
     1.689 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d10                                  ) = 0
     1.695 ( 0.005 ms): usleep/4677 mmap(len: 68635, prot: READ, flags: PRIVATE, fd: 3                    ) = 0x7f3d1d7de000
     1.696 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
     1.714 ( 0.007 ms): usleep/4677 open(filename: 0x7f3d1d7e867d, flags: CLOEXEC                         ) = 3
     1.716 ( 0.002 ms): usleep/4677 read(fd: 3, buf: 0x7fff28219eb0, count: 832                           ) = 832
     1.719 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d60                                  ) = 0
     1.727 ( 0.006 ms): usleep/4677 mmap(len: 2135088, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f3d1d3c6000
     1.734 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d3cf000, len: 2093056                          ) = 0
     1.741 ( 0.007 ms): usleep/4677 mmap(addr: 0x7f3d1d5ce000, len: 8192, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 32768) = 0x7f3d1d5ce000
     1.748 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
     1.763 ( 0.006 ms): usleep/4677 open(filename: 0x7f3d1d7ed5fa, flags: CLOEXEC                         ) = 3
     1.765 ( 0.002 ms): usleep/4677 read(fd: 3, buf: 0x7fff28219e80, count: 832                           ) = 832
     1.767 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d30                                  ) = 0
     1.775 ( 0.006 ms): usleep/4677 mmap(len: 3932736, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f3d1d005000
     1.782 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d1bb000, len: 2097152                          ) = 0
     1.790 ( 0.008 ms): usleep/4677 mmap(addr: 0x7f3d1d3bb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1794048) ...
     1.790 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=426124 [ns] vruntime=569867840059 [ns])
     1.793 (         ): sched:sched_wakeup:comm=rcuc/3 pid=56 prio=97 success=1 target_cpu=003)
     1.794 (         ): sched:sched_wakeup:comm=ksoftirqd/3 pid=58 prio=98 success=1 target_cpu=003)
     1.797 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=2371 [ns] vruntime=569867842430 [ns])
     1.798 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=rcuc/3 next_pid=56 next_prio=97)
     1.832 ( 0.049 ms): usleep/4677  ... [continued]: mmap()) = 0x7f3d1d3bb000
     1.844 ( 0.007 ms): usleep/4677 mmap(addr: 0x7f3d1d3c1000, len: 16960, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED, fd: -1) = 0x7f3d1d3c1000
     1.849 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
     1.859 ( 0.005 ms): usleep/4677 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7dd000
     1.873 ( 0.005 ms): usleep/4677 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7db000
     1.879 ( 0.001 ms): usleep/4677 arch_prctl(option: 4098, arg2: 139900464510784, arg3: 139900464513104, arg4: 34, arg5: 4294967295) = 0
     1.969 ( 0.009 ms): usleep/4677 mprotect(start: 0x7f3d1d3bb000, len: 16384, prot: READ                ) = 0
     1.982 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d5ce000, len: 4096, prot: READ                 ) = 0
     1.991 ( 0.006 ms): usleep/4677 mprotect(start: 0x601000, len: 4096, prot: READ                       ) = 0
     2.001 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d7f0000, len: 4096, prot: READ                 ) = 0
     2.009 ( 0.007 ms): usleep/4677 munmap(addr: 0x7f3d1d7de000, len: 68635                               ) = 0
     2.049 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=220367 [ns] vruntime=569868062797 [ns])
     2.050 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
     2.097 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=29987 [ns] vruntime=569868092784 [ns])
     2.099 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
     2.121 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x64e000
     2.126 ( 0.005 ms): usleep/4677 brk(brk: 0x66f000                                                     ) = 0x66f000
     2.129 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x66f000
     2.148 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=34634 [ns] vruntime=569868127418 [ns])
     2.149 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
     2.173 ( 0.002 ms): usleep/4677 nanosleep(rqtp: 0x7fff2821a4e0                                        ) ...
     2.173 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=11440 [ns] vruntime=569868138858 [ns])
     2.174 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120)
     2.227 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.228 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.228 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.229 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.231 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
     2.237 ( 0.067 ms): usleep/4677  ... [continued]: nanosleep()) = 0
     2.243 ( 0.000 ms): usleep/4677 exit_group(                                                           
     2.289 (         ): sched:sched_process_exit:comm=usleep pid=4677 prio=120)
[root@ssdandy ~]#

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-03 16:49 heads up/RFC: 'perf trace' using ordered_events Arnaldo Carvalho de Melo
@ 2015-03-04  1:01 ` Namhyung Kim
  2015-03-04  1:07   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Namhyung Kim @ 2015-03-04  1:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: David Ahern, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

On Tue, Mar 03, 2015 at 01:49:40PM -0300, Arnaldo Carvalho de Melo wrote:
> Hi,
> 
> 	Just a preview, but this is something David had mentioned at some
> point, a major problem with 'perf trace' was that it wasn't using
> 'perf_session' event reordering mechanism, so I've been working on making it
> use it, refactoring the ordered_events code so that it can be used by tools
> that don't deal with perf.data files.

At a conceptual level, I think event processing should be done with
session.  Even if perf trace does not do anything with a data file it
can have a (live) session like perf top does.  This way we can avoid
internal API changes.

Also, in my multi-thread work, I don't use ordered_events API much
since all samples are already ordered for each cpu or thread.  So I'd
like to decouple it as much as possible.


> 
> 	I still have to investigate why there are so many sched_wakeup at the beginning,
> but probably they are really being generated by the kernel and 'perf trace' has to make
> this output more compact, perhaps noticing that a number of similar events appeared on
> the stream and instead of writing N lines, print some prefix ([N*] perhaps) and then
> the line.
> 
> 	Also its keeping pointers to the ring buffer, probably we'll need to copy or
> store the perf_sample somehow in the 'ordered_event' instance.
> 
> 	But as this is touching things that Namhyung is touching as well, I thought about
> sharing this stuff, its on my tree, branch tmp.perf/trace_ordered_events.

Thanks for letting me know.
Namhyung


> 
> 	Lunchtime, bbl!
> 
> - Arnaldo
> 
> [root@ssdandy ~]# trace --ev sched:* usleep 1
>      0.000 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.001 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.002 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.002 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      0.004 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
>      1.367 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.368 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.370 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.370 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.371 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
>      1.615 (         ): sched:sched_process_exec:filename=/bin/usleep pid=4677 old_pid=4677)
>      1.621 (         ): sched:sched_wakeup:comm=rcuop/3 pid=16 prio=120 success=1 target_cpu=005)
>      1.640 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x64e000
>      1.662 ( 0.005 ms): usleep/4677 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7ef000
>      1.675 ( 0.005 ms): usleep/4677 access(filename: 0x7f3d1d5edbe0, mode: R                              ) = -1 ENOENT No such file or directory
>      1.687 ( 0.004 ms): usleep/4677 open(filename: 0x7f3d1d5ec4d8, flags: CLOEXEC                         ) = 3
>      1.689 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d10                                  ) = 0
>      1.695 ( 0.005 ms): usleep/4677 mmap(len: 68635, prot: READ, flags: PRIVATE, fd: 3                    ) = 0x7f3d1d7de000
>      1.696 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
>      1.714 ( 0.007 ms): usleep/4677 open(filename: 0x7f3d1d7e867d, flags: CLOEXEC                         ) = 3
>      1.716 ( 0.002 ms): usleep/4677 read(fd: 3, buf: 0x7fff28219eb0, count: 832                           ) = 832
>      1.719 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d60                                  ) = 0
>      1.727 ( 0.006 ms): usleep/4677 mmap(len: 2135088, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f3d1d3c6000
>      1.734 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d3cf000, len: 2093056                          ) = 0
>      1.741 ( 0.007 ms): usleep/4677 mmap(addr: 0x7f3d1d5ce000, len: 8192, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 32768) = 0x7f3d1d5ce000
>      1.748 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
>      1.763 ( 0.006 ms): usleep/4677 open(filename: 0x7f3d1d7ed5fa, flags: CLOEXEC                         ) = 3
>      1.765 ( 0.002 ms): usleep/4677 read(fd: 3, buf: 0x7fff28219e80, count: 832                           ) = 832
>      1.767 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d30                                  ) = 0
>      1.775 ( 0.006 ms): usleep/4677 mmap(len: 3932736, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f3d1d005000
>      1.782 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d1bb000, len: 2097152                          ) = 0
>      1.790 ( 0.008 ms): usleep/4677 mmap(addr: 0x7f3d1d3bb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1794048) ...
>      1.790 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=426124 [ns] vruntime=569867840059 [ns])
>      1.793 (         ): sched:sched_wakeup:comm=rcuc/3 pid=56 prio=97 success=1 target_cpu=003)
>      1.794 (         ): sched:sched_wakeup:comm=ksoftirqd/3 pid=58 prio=98 success=1 target_cpu=003)
>      1.797 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=2371 [ns] vruntime=569867842430 [ns])
>      1.798 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=rcuc/3 next_pid=56 next_prio=97)
>      1.832 ( 0.049 ms): usleep/4677  ... [continued]: mmap()) = 0x7f3d1d3bb000
>      1.844 ( 0.007 ms): usleep/4677 mmap(addr: 0x7f3d1d3c1000, len: 16960, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED, fd: -1) = 0x7f3d1d3c1000
>      1.849 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
>      1.859 ( 0.005 ms): usleep/4677 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7dd000
>      1.873 ( 0.005 ms): usleep/4677 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7db000
>      1.879 ( 0.001 ms): usleep/4677 arch_prctl(option: 4098, arg2: 139900464510784, arg3: 139900464513104, arg4: 34, arg5: 4294967295) = 0
>      1.969 ( 0.009 ms): usleep/4677 mprotect(start: 0x7f3d1d3bb000, len: 16384, prot: READ                ) = 0
>      1.982 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d5ce000, len: 4096, prot: READ                 ) = 0
>      1.991 ( 0.006 ms): usleep/4677 mprotect(start: 0x601000, len: 4096, prot: READ                       ) = 0
>      2.001 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d7f0000, len: 4096, prot: READ                 ) = 0
>      2.009 ( 0.007 ms): usleep/4677 munmap(addr: 0x7f3d1d7de000, len: 68635                               ) = 0
>      2.049 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=220367 [ns] vruntime=569868062797 [ns])
>      2.050 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
>      2.097 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=29987 [ns] vruntime=569868092784 [ns])
>      2.099 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
>      2.121 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x64e000
>      2.126 ( 0.005 ms): usleep/4677 brk(brk: 0x66f000                                                     ) = 0x66f000
>      2.129 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x66f000
>      2.148 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=34634 [ns] vruntime=569868127418 [ns])
>      2.149 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
>      2.173 ( 0.002 ms): usleep/4677 nanosleep(rqtp: 0x7fff2821a4e0                                        ) ...
>      2.173 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=11440 [ns] vruntime=569868138858 [ns])
>      2.174 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120)
>      2.227 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.228 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.228 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.229 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.231 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
>      2.237 ( 0.067 ms): usleep/4677  ... [continued]: nanosleep()) = 0
>      2.243 ( 0.000 ms): usleep/4677 exit_group(                                                           
>      2.289 (         ): sched:sched_process_exit:comm=usleep pid=4677 prio=120)
> [root@ssdandy ~]#

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-04  1:01 ` Namhyung Kim
@ 2015-03-04  1:07   ` Arnaldo Carvalho de Melo
  2015-03-07 12:45     ` Adrian Hunter
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-04  1:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: David Ahern, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

Em Wed, Mar 04, 2015 at 10:01:23AM +0900, Namhyung Kim escreveu:
> On Tue, Mar 03, 2015 at 01:49:40PM -0300, Arnaldo Carvalho de Melo wrote:
> > Hi,
> > 
> > 	Just a preview, but this is something David had mentioned at some
> > point, a major problem with 'perf trace' was that it wasn't using
> > 'perf_session' event reordering mechanism, so I've been working on making it
> > use it, refactoring the ordered_events code so that it can be used by tools
> > that don't deal with perf.data files.
 
> At a conceptual level, I think event processing should be done with
> session.  Even if perf trace does not do anything with a data file it
> can have a (live) session like perf top does.  This way we can avoid

perf top is another one I want to move away from perf_session, as it
doesn't use any perf.data file, but needs to have events ordered.

But then I really need to look at what you done in your patchset,
probably I am missing something (or a lot).

> internal API changes.
> 
> Also, in my multi-thread work, I don't use ordered_events API much
> since all samples are already ordered for each cpu or thread.  So I'd
> like to decouple it as much as possible.
 
> > 	I still have to investigate why there are so many sched_wakeup at the beginning,
> > but probably they are really being generated by the kernel and 'perf trace' has to make
> > this output more compact, perhaps noticing that a number of similar events appeared on
> > the stream and instead of writing N lines, print some prefix ([N*] perhaps) and then
> > the line.
> > 
> > 	Also its keeping pointers to the ring buffer, probably we'll need to copy or
> > store the perf_sample somehow in the 'ordered_event' instance.
> > 
> > 	But as this is touching things that Namhyung is touching as well, I thought about
> > sharing this stuff, its on my tree, branch tmp.perf/trace_ordered_events.
> 
> Thanks for letting me know.
> Namhyung
> 
> 
> > 
> > 	Lunchtime, bbl!
> > 
> > - Arnaldo
> > 
> > [root@ssdandy ~]# trace --ev sched:* usleep 1
> >      0.000 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.001 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.002 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.002 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.003 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      0.004 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=001)
> >      1.367 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.368 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.369 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.370 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.370 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.371 (         ): sched:sched_wakeup:comm=trace pid=4677 prio=120 success=1 target_cpu=003)
> >      1.615 (         ): sched:sched_process_exec:filename=/bin/usleep pid=4677 old_pid=4677)
> >      1.621 (         ): sched:sched_wakeup:comm=rcuop/3 pid=16 prio=120 success=1 target_cpu=005)
> >      1.640 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x64e000
> >      1.662 ( 0.005 ms): usleep/4677 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7ef000
> >      1.675 ( 0.005 ms): usleep/4677 access(filename: 0x7f3d1d5edbe0, mode: R                              ) = -1 ENOENT No such file or directory
> >      1.687 ( 0.004 ms): usleep/4677 open(filename: 0x7f3d1d5ec4d8, flags: CLOEXEC                         ) = 3
> >      1.689 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d10                                  ) = 0
> >      1.695 ( 0.005 ms): usleep/4677 mmap(len: 68635, prot: READ, flags: PRIVATE, fd: 3                    ) = 0x7f3d1d7de000
> >      1.696 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
> >      1.714 ( 0.007 ms): usleep/4677 open(filename: 0x7f3d1d7e867d, flags: CLOEXEC                         ) = 3
> >      1.716 ( 0.002 ms): usleep/4677 read(fd: 3, buf: 0x7fff28219eb0, count: 832                           ) = 832
> >      1.719 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d60                                  ) = 0
> >      1.727 ( 0.006 ms): usleep/4677 mmap(len: 2135088, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f3d1d3c6000
> >      1.734 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d3cf000, len: 2093056                          ) = 0
> >      1.741 ( 0.007 ms): usleep/4677 mmap(addr: 0x7f3d1d5ce000, len: 8192, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 32768) = 0x7f3d1d5ce000
> >      1.748 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
> >      1.763 ( 0.006 ms): usleep/4677 open(filename: 0x7f3d1d7ed5fa, flags: CLOEXEC                         ) = 3
> >      1.765 ( 0.002 ms): usleep/4677 read(fd: 3, buf: 0x7fff28219e80, count: 832                           ) = 832
> >      1.767 ( 0.001 ms): usleep/4677 fstat(fd: 3, statbuf: 0x7fff28219d30                                  ) = 0
> >      1.775 ( 0.006 ms): usleep/4677 mmap(len: 3932736, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f3d1d005000
> >      1.782 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d1bb000, len: 2097152                          ) = 0
> >      1.790 ( 0.008 ms): usleep/4677 mmap(addr: 0x7f3d1d3bb000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1794048) ...
> >      1.790 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=426124 [ns] vruntime=569867840059 [ns])
> >      1.793 (         ): sched:sched_wakeup:comm=rcuc/3 pid=56 prio=97 success=1 target_cpu=003)
> >      1.794 (         ): sched:sched_wakeup:comm=ksoftirqd/3 pid=58 prio=98 success=1 target_cpu=003)
> >      1.797 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=2371 [ns] vruntime=569867842430 [ns])
> >      1.798 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=rcuc/3 next_pid=56 next_prio=97)
> >      1.832 ( 0.049 ms): usleep/4677  ... [continued]: mmap()) = 0x7f3d1d3bb000
> >      1.844 ( 0.007 ms): usleep/4677 mmap(addr: 0x7f3d1d3c1000, len: 16960, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED, fd: -1) = 0x7f3d1d3c1000
> >      1.849 ( 0.001 ms): usleep/4677 close(fd: 3                                                           ) = 0
> >      1.859 ( 0.005 ms): usleep/4677 mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7dd000
> >      1.873 ( 0.005 ms): usleep/4677 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f3d1d7db000
> >      1.879 ( 0.001 ms): usleep/4677 arch_prctl(option: 4098, arg2: 139900464510784, arg3: 139900464513104, arg4: 34, arg5: 4294967295) = 0
> >      1.969 ( 0.009 ms): usleep/4677 mprotect(start: 0x7f3d1d3bb000, len: 16384, prot: READ                ) = 0
> >      1.982 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d5ce000, len: 4096, prot: READ                 ) = 0
> >      1.991 ( 0.006 ms): usleep/4677 mprotect(start: 0x601000, len: 4096, prot: READ                       ) = 0
> >      2.001 ( 0.007 ms): usleep/4677 mprotect(start: 0x7f3d1d7f0000, len: 4096, prot: READ                 ) = 0
> >      2.009 ( 0.007 ms): usleep/4677 munmap(addr: 0x7f3d1d7de000, len: 68635                               ) = 0
> >      2.049 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=220367 [ns] vruntime=569868062797 [ns])
> >      2.050 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
> >      2.097 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=29987 [ns] vruntime=569868092784 [ns])
> >      2.099 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
> >      2.121 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x64e000
> >      2.126 ( 0.005 ms): usleep/4677 brk(brk: 0x66f000                                                     ) = 0x66f000
> >      2.129 ( 0.001 ms): usleep/4677 brk(                                                                  ) = 0x66f000
> >      2.148 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=34634 [ns] vruntime=569868127418 [ns])
> >      2.149 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=R+ ==> next_comm=irq/51-enp0s25 next_pid=1164 next_prio=49)
> >      2.173 ( 0.002 ms): usleep/4677 nanosleep(rqtp: 0x7fff2821a4e0                                        ) ...
> >      2.173 (         ): sched:sched_stat_runtime:comm=usleep pid=4677 runtime=11440 [ns] vruntime=569868138858 [ns])
> >      2.174 (         ): sched:sched_switch:prev_comm=usleep prev_pid=4677 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120)
> >      2.227 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.228 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.228 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.229 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.230 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.231 (         ): sched:sched_wakeup:comm=usleep pid=4677 prio=120 success=1 target_cpu=003)
> >      2.237 ( 0.067 ms): usleep/4677  ... [continued]: nanosleep()) = 0
> >      2.243 ( 0.000 ms): usleep/4677 exit_group(                                                           
> >      2.289 (         ): sched:sched_process_exit:comm=usleep pid=4677 prio=120)
> > [root@ssdandy ~]#

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-04  1:07   ` Arnaldo Carvalho de Melo
@ 2015-03-07 12:45     ` Adrian Hunter
  2015-03-09 13:21       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2015-03-07 12:45 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Namhyung Kim
  Cc: David Ahern, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

On 4/03/2015 3:07 a.m., Arnaldo Carvalho de Melo wrote:
> Em Wed, Mar 04, 2015 at 10:01:23AM +0900, Namhyung Kim escreveu:
>> On Tue, Mar 03, 2015 at 01:49:40PM -0300, Arnaldo Carvalho de Melo wrote:
>>> Hi,
>>>
>>> 	Just a preview, but this is something David had mentioned at some
>>> point, a major problem with 'perf trace' was that it wasn't using
>>> 'perf_session' event reordering mechanism, so I've been working on making it
>>> use it, refactoring the ordered_events code so that it can be used by tools
>>> that don't deal with perf.data files.
>
>> At a conceptual level, I think event processing should be done with
>> session.  Even if perf trace does not do anything with a data file it
>> can have a (live) session like perf top does.  This way we can avoid
>
> perf top is another one I want to move away from perf_session, as it
> doesn't use any perf.data file, but needs to have events ordered.
>
> But then I really need to look at what you done in your patchset,
> probably I am missing something (or a lot).

The change appears to conflict substantially with the patches I have for
adding instruction tracing support. I will send them again soon, so you
can see what I mean.

My first thought would be that iterating over events from an event
source would be the natural abstraction which would not differentiate
from where the events came. i.e. it should not matter if the back end
is a perf.data file, pipe, or mmap.

Generally there is a need for a top level data structure. It avoids having
to use global variables and reduces the number of parameters that get
passed around.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-07 12:45     ` Adrian Hunter
@ 2015-03-09 13:21       ` Arnaldo Carvalho de Melo
  2015-03-10  6:06         ` Namhyung Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-09 13:21 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Namhyung Kim, David Ahern, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

Em Sat, Mar 07, 2015 at 02:45:24PM +0200, Adrian Hunter escreveu:
> On 4/03/2015 3:07 a.m., Arnaldo Carvalho de Melo wrote:
> >Em Wed, Mar 04, 2015 at 10:01:23AM +0900, Namhyung Kim escreveu:
> >>On Tue, Mar 03, 2015 at 01:49:40PM -0300, Arnaldo Carvalho de Melo wrote:
> >>>	Just a preview, but this is something David had mentioned at some
> >>>point, a major problem with 'perf trace' was that it wasn't using
> >>>'perf_session' event reordering mechanism, so I've been working on making it
> >>>use it, refactoring the ordered_events code so that it can be used by tools
> >>>that don't deal with perf.data files.

> >>At a conceptual level, I think event processing should be done with
> >>session.  Even if perf trace does not do anything with a data file it
> >>can have a (live) session like perf top does.  This way we can avoid

> >perf top is another one I want to move away from perf_session, as it
> >doesn't use any perf.data file, but needs to have events ordered.

> >But then I really need to look at what you done in your patchset,
> >probably I am missing something (or a lot).
 
> The change appears to conflict substantially with the patches I have for
> adding instruction tracing support. I will send them again soon, so you
> can see what I mean.

Ok, I'll check your patchkit
 
> My first thought would be that iterating over events from an event
> source would be the natural abstraction which would not differentiate
> from where the events came. i.e. it should not matter if the back end
> is a perf.data file, pipe, or mmap.

> Generally there is a need for a top level data structure. It avoids having
> to use global variables and reduces the number of parameters that get
> passed around.

I agree with that, that is why perf_session was introduced in the first
place :-)

But that was long ago, when the main use case was for processing
perf.data files, as time went on it was being stuffed with many other
things.

What I've been trying to do is to better separate those things into
multiple classes, so that tools can use parts of it as they see fit.

My recent work on ordered_samples was with the intent of using that
reordering class in 'trace', but as I went on doing that I think it is
too tied to the design of perf.data, with its shortcomings.

And then I've been trying to do it in a way that the existing API isn't
changed too much, keeping assumptions existing tools have.

For trace I need to take advantage of the fact that each mmap is ordered
already and then just sort by the timestamp in the mmap head, etc.

In retrospect, the perf.data file should have kept that ordering, i.e.
have one file per mmap, that would be saved in parallel, without any of
those PERF_RECORD_FINISHED_ROUND records.

But I have to experiment with that, leaving the existing code around to
deal with older files.

- Arnaldo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-09 13:21       ` Arnaldo Carvalho de Melo
@ 2015-03-10  6:06         ` Namhyung Kim
  2015-03-10 14:25           ` David Ahern
  0 siblings, 1 reply; 8+ messages in thread
From: Namhyung Kim @ 2015-03-10  6:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, David Ahern, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

Hi Arnaldo,

On Mon, Mar 09, 2015 at 10:21:35AM -0300, Arnaldo Carvalho de Melo wrote:
> For trace I need to take advantage of the fact that each mmap is ordered
> already and then just sort by the timestamp in the mmap head, etc.
> 
> In retrospect, the perf.data file should have kept that ordering, i.e.
> have one file per mmap, that would be saved in parallel, without any of
> those PERF_RECORD_FINISHED_ROUND records.
> 
> But I have to experiment with that, leaving the existing code around to
> deal with older files.

It seems like what you said is almost same as my multi-thread work.
It saves data files per mmap and then merges them with an index
table so that they can be processed in parallel.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-10  6:06         ` Namhyung Kim
@ 2015-03-10 14:25           ` David Ahern
  2015-03-11  0:33             ` Namhyung Kim
  0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2015-03-10 14:25 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo
  Cc: Adrian Hunter, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

On 3/10/15 12:06 AM, Namhyung Kim wrote:
> Hi Arnaldo,
>
> On Mon, Mar 09, 2015 at 10:21:35AM -0300, Arnaldo Carvalho de Melo wrote:
>> For trace I need to take advantage of the fact that each mmap is ordered
>> already and then just sort by the timestamp in the mmap head, etc.
>>
>> In retrospect, the perf.data file should have kept that ordering, i.e.
>> have one file per mmap, that would be saved in parallel, without any of
>> those PERF_RECORD_FINISHED_ROUND records.
>>
>> But I have to experiment with that, leaving the existing code around to
>> deal with older files.
>
> It seems like what you said is almost same as my multi-thread work.
> It saves data files per mmap and then merges them with an index
> table so that they can be processed in parallel.

I think you and Jiri both have worked on saving a file per mmap. Where 
does that stand? I would like to try it out.

With the 1024 cpu systems I am seeing 5GB files in 1 second runs and 
perf is not handling it well. The perf-script/perf-report (stdio) will 
'hang' for 45 minutes munging through the file. I have to connect gdb 
from time to time to verify it is making progress (file_offset is 
increasing). I believe what happens is that there is 'no round' -- it 
has to process all mmaps (1024 cpus and 6 or 7 events) through the 
ordered events queue before it can push out results. I need to look at 
that once I figure out the task scheduler problem.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: heads up/RFC: 'perf trace' using ordered_events
  2015-03-10 14:25           ` David Ahern
@ 2015-03-11  0:33             ` Namhyung Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Namhyung Kim @ 2015-03-11  0:33 UTC (permalink / raw)
  To: David Ahern
  Cc: Arnaldo Carvalho de Melo, Adrian Hunter, Jiri Olsa, Ingo Molnar,
	Frédéric Weisbecker, Linux Kernel Mailing List

Hi David,

On Tue, Mar 10, 2015 at 08:25:27AM -0600, David Ahern wrote:
> On 3/10/15 12:06 AM, Namhyung Kim wrote:
> >Hi Arnaldo,
> >
> >On Mon, Mar 09, 2015 at 10:21:35AM -0300, Arnaldo Carvalho de Melo wrote:
> >>For trace I need to take advantage of the fact that each mmap is ordered
> >>already and then just sort by the timestamp in the mmap head, etc.
> >>
> >>In retrospect, the perf.data file should have kept that ordering, i.e.
> >>have one file per mmap, that would be saved in parallel, without any of
> >>those PERF_RECORD_FINISHED_ROUND records.
> >>
> >>But I have to experiment with that, leaving the existing code around to
> >>deal with older files.
> >
> >It seems like what you said is almost same as my multi-thread work.
> >It saves data files per mmap and then merges them with an index
> >table so that they can be processed in parallel.
> 
> I think you and Jiri both have worked on saving a file per mmap. Where does
> that stand? I would like to try it out.

It's available on perf/threaded-v3 branch in my tree.

You can see the description and numbers at

  https://lkml.org/lkml/2015/3/2/988.

Also you can index your data file and then process with multi-thread:

  $ perf data index -i perf.data -o perf.data.idx
  $ perf report --multi-thread -i perf.data.idx

Thanks,
Namhyung


> 
> With the 1024 cpu systems I am seeing 5GB files in 1 second runs and perf is
> not handling it well. The perf-script/perf-report (stdio) will 'hang' for 45
> minutes munging through the file. I have to connect gdb from time to time to
> verify it is making progress (file_offset is increasing). I believe what
> happens is that there is 'no round' -- it has to process all mmaps (1024
> cpus and 6 or 7 events) through the ordered events queue before it can push
> out results. I need to look at that once I figure out the task scheduler
> problem.
> 
> David
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-03-11  0:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-03 16:49 heads up/RFC: 'perf trace' using ordered_events Arnaldo Carvalho de Melo
2015-03-04  1:01 ` Namhyung Kim
2015-03-04  1:07   ` Arnaldo Carvalho de Melo
2015-03-07 12:45     ` Adrian Hunter
2015-03-09 13:21       ` Arnaldo Carvalho de Melo
2015-03-10  6:06         ` Namhyung Kim
2015-03-10 14:25           ` David Ahern
2015-03-11  0:33             ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).