* [GIT PULL 0/9] perf/core improvements and fixes
@ 2015-08-21 16:10 Arnaldo Carvalho de Melo
  2015-08-21 16:10 ` [PATCH 1/9] perf script: Fix segfault using --show-mmap-events Arnaldo Carvalho de Melo
                   ` (9 more replies)
  0 siblings, 10 replies; 42+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
	Alexei Starovoitov, Borislav Petkov, Brendan Gregg,
	Daniel Borkmann, David Ahern, Dean Nelson, Frederic Weisbecker,
	He Kuang, Jiri Olsa, Kaixu Xia, Li Zhang, Masami Hiramatsu,
	Namhyung Kim, Peter Zijlstra, pi3orama, Stephane Eranian,
	Sukadev Bhattiprolu, Wang Nan, Zefan Li,
	Arnaldo Carvalho de Melo

Hi Ingo,

	Please consider pulling,

- Arnaldo

The following changes since commit 82819ffb42fb45197bacf3223191deca31d3eb91:

  perf/x86/msr: Fix the MSR driver build (2015-08-21 08:17:01 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to 1c0bd0e891aaed0219010bfe79b32e1b0b82d662:

  perf probe: Try to use symbol table if searching debug info failed (2015-08-21 12:57:20 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Fix segfault using 'perf script --show-mmap-events', affects only
  current perf/core (Adrian Hunter).

- /proc/kcore requires CAP_SYS_RAWIO message too noisy, make it debug
  only (Adrian Hunter)

- Fix Intel PT timestamp handling (Adrian Hunter)

- Add Intel BTS support, with a call-graph script to show it and PT in
  use in a GUI using 'perf script' python scripting with postgresql and
  Qt (Adrian Hunter)

- Add checks for returned EVENT_ERROR type in libtraceevent, fixing a
  bug that surfaced on arm64 systems (Dean Nelson)

- Fallback to using kallsyms when libdw fails to handle a vmlinux file,
  that can happen, for instance, when perf is statically linked and then
  libdw fails to load libebl_{arch}.so (Wang Nan)

Infrastructure:

- Initialize reference counts in map__clone() (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Adrian Hunter (6):
      perf script: Fix segfault using --show-mmap-events
      perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisy
      perf tools: Fix Intel PT timestamp handling
      perf tools: Add Intel BTS support
      perf tools: Put itrace options into an asciidoc include
      perf tools: Add example call-graph script

Arnaldo Carvalho de Melo (1):
      perf tools: Initialize reference counts in map__clone()

Dean Nelson (1):
      tools lib traceevent: Add checks for returned EVENT_ERROR type

Wang Nan (1):
      perf probe: Try to use symbol table if searching debug info failed

 tools/lib/traceevent/event-parse.c                |   9 +
 tools/perf/Documentation/intel-bts.txt            |  86 ++
 tools/perf/Documentation/itrace.txt               |  22 +
 tools/perf/Documentation/perf-inject.txt          |  23 +-
 tools/perf/Documentation/perf-report.txt          |  23 +-
 tools/perf/Documentation/perf-script.txt          |  23 +-
 tools/perf/arch/x86/util/Build                    |   1 +
 tools/perf/arch/x86/util/auxtrace.c               |  49 +-
 tools/perf/arch/x86/util/intel-bts.c              | 458 ++++++++++
 tools/perf/arch/x86/util/pmu.c                    |   3 +
 .../scripts/python/call-graph-from-postgresql.py  | 327 ++++++++
 tools/perf/scripts/python/export-to-postgresql.py |  47 ++
 tools/perf/util/Build                             |   1 +
 tools/perf/util/annotate.c                        |   1 +
 tools/perf/util/auxtrace.c                        |   3 +
 tools/perf/util/auxtrace.h                        |   1 +
 tools/perf/util/evlist.c                          |   2 +-
 tools/perf/util/intel-bts.c                       | 933 +++++++++++++++++++++
 tools/perf/util/intel-bts.h                       |  43 +
 tools/perf/util/intel-pt.c                        |   2 +-
 tools/perf/util/map.c                             |  13 +-
 tools/perf/util/pmu.c                             |   4 -
 tools/perf/util/probe-event.c                     |   7 +-
 tools/perf/util/symbol.c                          |   4 +-
 24 files changed, 2004 insertions(+), 81 deletions(-)
 create mode 100644 tools/perf/Documentation/intel-bts.txt
 create mode 100644 tools/perf/Documentation/itrace.txt
 create mode 100644 tools/perf/arch/x86/util/intel-bts.c
 create mode 100644 tools/perf/scripts/python/call-graph-from-postgresql.py
 create mode 100644 tools/perf/util/intel-bts.c
 create mode 100644 tools/perf/util/intel-bts.h

* [PATCH 1/9] perf script: Fix segfault using --show-mmap-events
From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Adrian Hunter, Jiri Olsa, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Patch "perf script: Don't assume evsel position of tracking events"
changed 'perf script' to use 'perf_evlist__id2evsel()'. That results in
a segfault if there is more than 1 event and there are synthesized mmap
events e.g.

	$ perf record -e cycles,instructions -p$$ sleep 1
	$ perf script --show-mmap-events
	Segmentation fault (core dumped)

That happens because these synthesized events have an 'id' of zero which
does not match any 'evsel'.

Currently, these synthesized events use the sample type of the first
evsel. Change 'perf_evlist__id2evsel()' to reflect that which also makes
it consistent with 'perf_evlist__event2evsel()'.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking events")
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1440059205-1765-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 373f65b02545..e9a5d432902c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -573,7 +573,7 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
 {
 	struct perf_sample_id *sid;
 
-	if (evlist->nr_entries == 1)
+	if (evlist->nr_entries == 1 || !id)
 		return perf_evlist__first(evlist);
 
 	sid = perf_evlist__id2sid(evlist, id);
-- 
2.1.0

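The id-to-evsel lookup described in the patch above can be sketched in isolation. This is only an illustrative model, not the perf implementation: toy_evsel, toy_evlist and toy_id2evsel() are hypothetical names, and the linear scan stands in for the real lookup through perf_evlist__id2sid().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's evsel and evlist types. */
struct toy_evsel {
	uint64_t id;
	const char *name;
};

struct toy_evlist {
	struct toy_evsel *entries;
	size_t nr_entries;
};

/*
 * Same decision as the patched condition: a single-entry list, or an id
 * of zero (as carried by synthesized events), resolves to the first evsel.
 */
static struct toy_evsel *toy_id2evsel(struct toy_evlist *evlist, uint64_t id)
{
	size_t i;

	if (evlist->nr_entries == 0)
		return NULL;

	if (evlist->nr_entries == 1 || !id)
		return &evlist->entries[0];

	/* Assumption: a linear scan replaces the real id lookup. */
	for (i = 0; i < evlist->nr_entries; i++) {
		if (evlist->entries[i].id == id)
			return &evlist->entries[i];
	}

	return NULL;	/* unknown non-zero id: the caller must handle NULL */
}
```

With two events recorded, an id of zero now maps to the first entry instead of falling through to a failed lookup, which is the case that previously led to the segfault.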
* [PATCH 2/9] perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisy
From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Adrian Hunter, Jiri Olsa, Li Zhang,
	Sukadev Bhattiprolu, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

The "/proc/kcore requires CAP_SYS_RAWIO" message comes up all the time
for 'perf script' if vmlinux is not found and the user isn't root, even
when the kernel is not being traced and even though the message is only
really relevant for annotation.

Change it to pr_debug and instead put a note in the message displayed if
annotation is not possible.

Also, the file being accessed might not be /proc/kcore. Tools can be
directed to a different location using the --kallsyms option in which
case kcore is expected to be in the same directory. Adjust the message
so it is not misleading in that case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1440065260-8802-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/annotate.c | 1 +
 tools/perf/util/symbol.c   | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 8a18347709e1..d1eece70b84d 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1126,6 +1126,7 @@ fallback:
 			dso->annotate_warned = 1;
 			pr_err("Can't annotate %s:\n\n"
 			       "No vmlinux file%s\nwas found in the path.\n\n"
+			       "Note that annotation using /proc/kcore requires CAP_SYS_RAWIO capability.\n\n"
 			       "Please use:\n\n"
 			       " perf buildid-cache -vu vmlinux\n\n"
 			       "or:\n\n"
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 725640fd7cd8..42e98ab5a9bb 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1138,8 +1138,8 @@ static int dso__load_kcore(struct dso *dso, struct map *map,
 
 	fd = open(kcore_filename, O_RDONLY);
 	if (fd < 0) {
-		pr_err("%s requires CAP_SYS_RAWIO capability to access.\n",
-			kcore_filename);
+		pr_debug("Failed to open %s. Note /proc/kcore requires CAP_SYS_RAWIO capability to access.\n",
+			 kcore_filename);
 		return -EINVAL;
 	}
 
-- 
2.1.0

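The effect of moving the message from pr_err to pr_debug can be sketched as a verbosity gate. Everything here is an assumption for illustration: toy_verbose, toy_eprintf(), toy_pr_err(), toy_pr_debug() and toy_report_kcore_failure() are hypothetical names that only model the idea that error output is always shown while debug output needs a raised verbosity level.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a tool-wide verbosity counter (e.g. set by -v). */
static int toy_verbose;

/* Print to stderr only when the current verbosity reaches 'level'. */
static int toy_eprintf(int level, const char *fmt, ...)
{
	va_list ap;
	int ret = 0;

	if (toy_verbose >= level) {
		va_start(ap, fmt);
		ret = vfprintf(stderr, fmt, ap);
		va_end(ap);
	}

	return ret;	/* 0 when the message was suppressed */
}

/* Errors are always shown; debug output needs verbosity of at least 1. */
#define toy_pr_err(...)		toy_eprintf(0, __VA_ARGS__)
#define toy_pr_debug(...)	toy_eprintf(1, __VA_ARGS__)

/* Models the failed-open path after the patch: quiet unless verbose. */
static int toy_report_kcore_failure(const char *kcore_filename)
{
	return toy_pr_debug("Failed to open %s. Note /proc/kcore requires CAP_SYS_RAWIO capability to access.\n",
			    kcore_filename);
}
```

In this model an unprivileged run at default verbosity prints nothing for the failed open, which matches the intent stated in the commit message.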
* [PATCH 3/9] perf tools: Fix Intel PT timestamp handling
From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Adrian Hunter, Jiri Olsa, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Events that don't sample the timestamp have a timestamp value of -1.
Intel PT processing wasn't taking that into account.

This is particularly noticeable with Intel BTS because timestamps are
not requested by default. Then, if the conversion of -1 to TSC results
in a small number, the processing is unaffected. However if the
conversion results in a big number, then the data is processed
prematurely before relevant sideband data like mmap events, which in
turn results in samples with unknown dsos.

Committer note:

Since BTS wasn't upstream, I split the patch to fold the BTS part with
the patch introducing it, to avoid having this bug in the commit
history. PT was already upstream, so this patch contains that part.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1440060692-5585-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 2a4a4120473b..a5acd2fe2447 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1450,7 +1450,7 @@ static int intel_pt_process_event(struct perf_session *session,
 		return -EINVAL;
 	}
 
-	if (sample->time)
+	if (sample->time && sample->time != (u64)-1)
 		timestamp = perf_time_to_tsc(sample->time, &pt->tc);
 	else
 		timestamp = 0;
-- 
2.1.0

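The sentinel check added by this patch can be sketched on its own. The names below (toy_tsc_conversion, toy_time_to_tsc(), toy_sample_timestamp()) are hypothetical, and the conversion arithmetic is an assumed stand-in for perf_time_to_tsc() rather than a copy of it; only the guard condition is taken from the diff.

```c
#include <stdint.h>

/* Hypothetical conversion parameters for the sketch. */
struct toy_tsc_conversion {
	uint16_t time_shift;
	uint32_t time_mult;	/* must be non-zero in this sketch */
	uint64_t time_zero;
};

/* Assumed stand-in for the time-to-TSC conversion. */
static uint64_t toy_time_to_tsc(uint64_t ns, const struct toy_tsc_conversion *tc)
{
	uint64_t t = ns - tc->time_zero;
	uint64_t quot = t / tc->time_mult;
	uint64_t rem = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}

/*
 * Guard from the patch: both 0 and (u64)-1 mean "no timestamp sampled",
 * so neither value is fed into the conversion.
 */
static uint64_t toy_sample_timestamp(uint64_t sample_time,
				     const struct toy_tsc_conversion *tc)
{
	if (sample_time && sample_time != (uint64_t)-1)
		return toy_time_to_tsc(sample_time, tc);

	return 0;
}
```

Without the second half of the condition, an unsampled time of -1 would be converted like any other value and could come out as a very large timestamp, which is the premature-processing case the commit message describes.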
* [PATCH 4/9] tools lib traceevent: Add checks for returned EVENT_ERROR type
From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Dean Nelson, Jiri Olsa, Peter Zijlstra,
	Arnaldo Carvalho de Melo

From: Dean Nelson <dnelson@redhat.com>

Running the following perf-stat command on an arm64 system produces the
following result...

  [root@aarch64 ~]# perf stat -e kmem:mm_page_alloc -a sleep 1
    Warning: [kmem:mm_page_alloc] function sizeof not defined
    Warning: Error: expected type 4 but read 0
  Segmentation fault
  [root@aarch64 ~]#

The second warning was a result of the first warning not stopping
processing after it detected the issue. That is, code that found the
issue reported the first problem, but because it did not exit out of the
functions smoothly, it caused the other warning to appear and not only
that, it later caused the SIGSEGV.

Signed-off-by: Dean Nelson <dnelson@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150820151632.13927.13791.email-sent-by-dnelson@teal
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/lib/traceevent/event-parse.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index fcd8a9e3d2e1..5c1867a13ef2 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -1745,6 +1745,9 @@ process_cond(struct event_format *event, struct print_arg *top, char **tok)
 	type = process_arg(event, left, &token);
 
 again:
+	if (type == EVENT_ERROR)
+		goto out_free;
+
 	/* Handle other operations in the arguments */
 	if (type == EVENT_OP && strcmp(token, ":") != 0) {
 		type = process_op(event, left, &token);
@@ -2004,6 +2007,12 @@ process_op(struct event_format *event, struct print_arg *arg, char **tok)
 			goto out_warn_free;
 
 		type = process_arg_token(event, right, tok, type);
+		if (type == EVENT_ERROR) {
+			free_arg(right);
+			/* token was freed in process_arg_token() via *tok */
+			token = NULL;
+			goto out_free;
+		}
 
 		if (right->type == PRINT_OP &&
 		    get_op_prio(arg->op.op) < get_op_prio(right->op.op)) {
-- 
2.1.0

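The pattern this patch applies, stopping as soon as a sub-parser reports an error instead of continuing with a half-built argument, can be sketched with a toy parser. Nothing below is libtraceevent code: toy_event_type, toy_process_arg() and toy_process_op() are hypothetical, and the token "sizeof" merely stands in for the unsupported function from the arm64 report.

```c
#include <stddef.h>
#include <string.h>

enum toy_event_type {
	TOY_EVENT_ERROR,
	TOY_EVENT_ITEM,
	TOY_EVENT_OP,
};

/* Classify one token; an unsupported function name yields an error. */
static enum toy_event_type toy_process_arg(const char *tok)
{
	if (!tok || !strcmp(tok, "sizeof"))
		return TOY_EVENT_ERROR;
	if (!strcmp(tok, "+") || !strcmp(tok, "-"))
		return TOY_EVENT_OP;
	return TOY_EVENT_ITEM;
}

/* Parse "left op right"; returns 0 on success, -1 on the first problem. */
static int toy_process_op(const char *left, const char *op, const char *right)
{
	enum toy_event_type type;

	type = toy_process_arg(left);
	if (type == TOY_EVENT_ERROR)
		return -1;	/* bail out here, do not look at 'op' or 'right' */
	if (type != TOY_EVENT_ITEM)
		return -1;

	if (toy_process_arg(op) != TOY_EVENT_OP)
		return -1;

	type = toy_process_arg(right);
	if (type == TOY_EVENT_ERROR)
		return -1;	/* same check after the right-hand side */
	if (type != TOY_EVENT_ITEM)
		return -1;

	return 0;
}
```

The real patch additionally has to release what was already allocated (free_arg() and clearing the token pointer) before jumping to its error label; the sketch has no allocations, so only the early return is shown.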
* [PATCH 5/9] perf tools: Add Intel BTS support
From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Adrian Hunter, Jiri Olsa, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Intel BTS support fits within the new auxtrace infrastructure.
Recording is supported by identifying the Intel BTS PMU, parsing
options and setting up events. Decoding is supported by queuing up
trace data by thread and then decoding synchronously delivering
synthesized event samples into the session processing for tools to
consume.

Committer note:

E.g:

  [root@felicio ~]# perf record --per-thread -e intel_bts// ls
  anaconda-ks.cfg  apctest.output  bin  kernel-rt-3.10.0-298.rt56.171.el7.x86_64.rpm  libexec  lock_page.bpf.c  perf.data  perf.data.old
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 4.367 MB perf.data ]

  [root@felicio ~]# perf evlist -v
  intel_bts//: type: 6, size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
  dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1

  [root@felicio ~]# perf script # then navigate in the pager to some interesting place:
    ls 1843 1 branches: ffffffff810a60cb flush_signal_handlers ([kernel.kallsyms]) => ffffffff8121a522 setup_new_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8121a529 setup_new_exec ([kernel.kallsyms]) => ffffffff8122fa30 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa5d do_close_on_exec ([kernel.kallsyms]) => ffffffff81767ae0 _raw_spin_lock ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff81767af4 _raw_spin_lock ([kernel.kallsyms]) => ffffffff8122fa62 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fac9 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fad2 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fadd do_close_on_exec ([kernel.kallsyms]) => ffffffff8120fc80 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fcaf filp_close ([kernel.kallsyms]) => ffffffff8120fcb6 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fcc2 filp_close ([kernel.kallsyms]) => ffffffff812547f0 dnotify_flush ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff81254823 dnotify_flush ([kernel.kallsyms]) => ffffffff8120fcc7 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fccd filp_close ([kernel.kallsyms]) => ffffffff81261790 locks_remove_posix ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff812617a3 locks_remove_posix ([kernel.kallsyms]) => ffffffff812617b9 locks_remove_posix ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) => ffffffff8120fcd2 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fcd5 filp_close ([kernel.kallsyms]) => ffffffff812142c0 fput ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff812142d6 fput ([kernel.kallsyms]) => ffffffff812142df fput ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8121430c fput ([kernel.kallsyms]) => ffffffff810b6580 task_work_add ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810b65ad task_work_add ([kernel.kallsyms]) => ffffffff810b65b1 task_work_add ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810b65c1 task_work_add ([kernel.kallsyms]) => ffffffff810bc710 kick_process ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810bc725 kick_process ([kernel.kallsyms]) => ffffffff810bc742 kick_process ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810bc742 kick_process ([kernel.kallsyms]) => ffffffff810b65c6 task_work_add ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810b65c9 task_work_add ([kernel.kallsyms]) => ffffffff81214311 fput ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-9-git-send-email-adrian.hunter@intel.com
[ Merged sample->time fix for bug found after first round of testing on slightly older kernel ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/intel-bts.txt |  86 +++
 tools/perf/arch/x86/util/Build         |   1 +
 tools/perf/arch/x86/util/auxtrace.c    |  49 +-
 tools/perf/arch/x86/util/intel-bts.c   | 458 ++++++++++++++++
 tools/perf/arch/x86/util/pmu.c         |   3 +
 tools/perf/util/Build                  |   1 +
 tools/perf/util/auxtrace.c             |   3 +
 tools/perf/util/auxtrace.h             |   1 +
 tools/perf/util/intel-bts.c            | 933 +++++++++++++++++++++++++++++++++
 tools/perf/util/intel-bts.h            |  43 ++
 tools/perf/util/pmu.c                  |   4 -
 11 files changed, 1576 insertions(+), 6 deletions(-)
 create mode 100644 tools/perf/Documentation/intel-bts.txt
 create mode 100644 tools/perf/arch/x86/util/intel-bts.c
 create mode 100644 tools/perf/util/intel-bts.c
 create mode 100644 tools/perf/util/intel-bts.h

diff --git a/tools/perf/Documentation/intel-bts.txt b/tools/perf/Documentation/intel-bts.txt
new file mode 100644
index 000000000000..8bdc93bd7fdb
--- /dev/null
+++ b/tools/perf/Documentation/intel-bts.txt
@@ -0,0 +1,86 @@
+Intel Branch Trace Store
+========================
+
+Overview
+========
+
+Intel BTS could be regarded as a predecessor to Intel PT and has some
+similarities because it can also identify every branch a program takes. A
+notable difference is that Intel BTS has no timing information and as a
+consequence the present implementation is limited to per-thread recording.
+
+While decoding Intel BTS does not require walking the object code, the object
+code is still needed to pair up calls and returns correctly, consequently much
+of the Intel PT documentation applies also to Intel BTS. Refer to the Intel PT
+documentation and consider that the PMU 'intel_bts' can usually be used in
+place of 'intel_pt' in the examples provided, with the proviso that per-thread
+recording must also be stipulated i.e. the --per-thread option for
+'perf record'.
+
+
+perf record
+===========
+
+new event
+---------
+
+The Intel BTS kernel driver creates a new PMU for Intel BTS. The perf record
+option is:
+
+	-e intel_bts//
+
+Currently Intel BTS is limited to per-thread tracing so the --per-thread option
+is also needed.
+
+
+snapshot option
+---------------
+
+The snapshot option is the same as Intel PT (refer Intel PT documentation).
+
+
+auxtrace mmap size option
+-----------------------
+
+The mmap size option is the same as Intel PT (refer Intel PT documentation).
+
+
+perf script
+===========
+
+By default, perf script will decode trace data found in the perf.data file.
+This can be further controlled by option --itrace. The --itrace option is
+the same as Intel PT (refer Intel PT documentation) except that neither
+"instructions" events nor "transactions" events (and consequently call
+chains) are supported.
+ +To disable trace decoding entirely, use the option --no-itrace. + + +dump option +----------- + +perf script has an option (-D) to "dump" the events i.e. display the binary +data. + +When -D is used, Intel BTS packets are displayed. + +To disable the display of Intel BTS packets, combine the -D option with +--no-itrace. + + +perf report +=========== + +By default, perf report will decode trace data found in the perf.data file. +This can be further controlled by new option --itrace exactly the same as +perf script. + + +perf inject +=========== + +perf inject also accepts the --itrace option in which case tracing data is +removed and replaced with the synthesized events. e.g. + + perf inject --itrace -i perf.data -o perf.data.new diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build index a8be9f9d0462..2c55e1b336c5 100644 --- a/tools/perf/arch/x86/util/Build +++ b/tools/perf/arch/x86/util/Build @@ -10,3 +10,4 @@ libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o libperf-$(CONFIG_AUXTRACE) += auxtrace.o libperf-$(CONFIG_AUXTRACE) += intel-pt.o +libperf-$(CONFIG_AUXTRACE) += intel-bts.o diff --git a/tools/perf/arch/x86/util/auxtrace.c b/tools/perf/arch/x86/util/auxtrace.c index e7654b506312..7a7805583e3f 100644 --- a/tools/perf/arch/x86/util/auxtrace.c +++ b/tools/perf/arch/x86/util/auxtrace.c @@ -13,11 +13,56 @@ * */ +#include <stdbool.h> + #include "../../util/header.h" +#include "../../util/debug.h" +#include "../../util/pmu.h" #include "../../util/auxtrace.h" #include "../../util/intel-pt.h" +#include "../../util/intel-bts.h" +#include "../../util/evlist.h" + +static +struct auxtrace_record *auxtrace_record__init_intel(struct perf_evlist *evlist, + int *err) +{ + struct perf_pmu *intel_pt_pmu; + struct perf_pmu *intel_bts_pmu; + struct perf_evsel *evsel; + bool found_pt = false; + bool found_bts = false; + + intel_pt_pmu = perf_pmu__find(INTEL_PT_PMU_NAME); + intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME); + + if (evlist) { + 
evlist__for_each(evlist, evsel) { + if (intel_pt_pmu && + evsel->attr.type == intel_pt_pmu->type) + found_pt = true; + if (intel_bts_pmu && + evsel->attr.type == intel_bts_pmu->type) + found_bts = true; + } + } + + if (found_pt && found_bts) { + pr_err("intel_pt and intel_bts may not be used together\n"); + *err = -EINVAL; + return NULL; + } + + if (found_pt) + return intel_pt_recording_init(err); + + if (found_bts) + return intel_bts_recording_init(err); -struct auxtrace_record *auxtrace_record__init(struct perf_evlist *evlist __maybe_unused, + return NULL; +} + +struct auxtrace_record *auxtrace_record__init(struct perf_evlist *evlist, int *err) { char buffer[64]; @@ -32,7 +77,7 @@ struct auxtrace_record *auxtrace_record__init(struct perf_evlist *evlist __maybe } if (!strncmp(buffer, "GenuineIntel,", 13)) - return intel_pt_recording_init(err); + return auxtrace_record__init_intel(evlist, err); return NULL; } diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c new file mode 100644 index 000000000000..9b94ce520917 --- /dev/null +++ b/tools/perf/arch/x86/util/intel-bts.c @@ -0,0 +1,458 @@ +/* + * intel-bts.c: Intel Processor Trace support + * Copyright (c) 2013-2015, Intel Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. 
+ * + */ + +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/bitops.h> +#include <linux/log2.h> + +#include "../../util/cpumap.h" +#include "../../util/evsel.h" +#include "../../util/evlist.h" +#include "../../util/session.h" +#include "../../util/util.h" +#include "../../util/pmu.h" +#include "../../util/debug.h" +#include "../../util/tsc.h" +#include "../../util/auxtrace.h" +#include "../../util/intel-bts.h" + +#define KiB(x) ((x) * 1024) +#define MiB(x) ((x) * 1024 * 1024) +#define KiB_MASK(x) (KiB(x) - 1) +#define MiB_MASK(x) (MiB(x) - 1) + +#define INTEL_BTS_DFLT_SAMPLE_SIZE KiB(4) + +#define INTEL_BTS_MAX_SAMPLE_SIZE KiB(60) + +struct intel_bts_snapshot_ref { + void *ref_buf; + size_t ref_offset; + bool wrapped; +}; + +struct intel_bts_recording { + struct auxtrace_record itr; + struct perf_pmu *intel_bts_pmu; + struct perf_evlist *evlist; + bool snapshot_mode; + size_t snapshot_size; + int snapshot_ref_cnt; + struct intel_bts_snapshot_ref *snapshot_refs; +}; + +struct branch { + u64 from; + u64 to; + u64 misc; +}; + +static size_t intel_bts_info_priv_size(struct auxtrace_record *itr __maybe_unused) +{ + return INTEL_BTS_AUXTRACE_PRIV_SIZE; +} + +static int intel_bts_info_fill(struct auxtrace_record *itr, + struct perf_session *session, + struct auxtrace_info_event *auxtrace_info, + size_t priv_size) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + struct perf_pmu *intel_bts_pmu = btsr->intel_bts_pmu; + struct perf_event_mmap_page *pc; + struct perf_tsc_conversion tc = { .time_mult = 0, }; + bool cap_user_time_zero = false; + int err; + + if (priv_size != INTEL_BTS_AUXTRACE_PRIV_SIZE) + return -EINVAL; + + if (!session->evlist->nr_mmaps) + return -EINVAL; + + pc = session->evlist->mmap[0].base; + if (pc) { + err = perf_read_tsc_conversion(pc, &tc); + if (err) { + if (err != -EOPNOTSUPP) + return err; + } else { + cap_user_time_zero = tc.time_mult != 0; + } + if (!cap_user_time_zero) + 
ui__warning("Intel BTS: TSC not available\n"); + } + + auxtrace_info->type = PERF_AUXTRACE_INTEL_BTS; + auxtrace_info->priv[INTEL_BTS_PMU_TYPE] = intel_bts_pmu->type; + auxtrace_info->priv[INTEL_BTS_TIME_SHIFT] = tc.time_shift; + auxtrace_info->priv[INTEL_BTS_TIME_MULT] = tc.time_mult; + auxtrace_info->priv[INTEL_BTS_TIME_ZERO] = tc.time_zero; + auxtrace_info->priv[INTEL_BTS_CAP_USER_TIME_ZERO] = cap_user_time_zero; + auxtrace_info->priv[INTEL_BTS_SNAPSHOT_MODE] = btsr->snapshot_mode; + + return 0; +} + +static int intel_bts_recording_options(struct auxtrace_record *itr, + struct perf_evlist *evlist, + struct record_opts *opts) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + struct perf_pmu *intel_bts_pmu = btsr->intel_bts_pmu; + struct perf_evsel *evsel, *intel_bts_evsel = NULL; + const struct cpu_map *cpus = evlist->cpus; + bool privileged = geteuid() == 0 || perf_event_paranoid() < 0; + + btsr->evlist = evlist; + btsr->snapshot_mode = opts->auxtrace_snapshot_mode; + + evlist__for_each(evlist, evsel) { + if (evsel->attr.type == intel_bts_pmu->type) { + if (intel_bts_evsel) { + pr_err("There may be only one " INTEL_BTS_PMU_NAME " event\n"); + return -EINVAL; + } + evsel->attr.freq = 0; + evsel->attr.sample_period = 1; + intel_bts_evsel = evsel; + opts->full_auxtrace = true; + } + } + + if (opts->auxtrace_snapshot_mode && !opts->full_auxtrace) { + pr_err("Snapshot mode (-S option) requires " INTEL_BTS_PMU_NAME " PMU event (-e " INTEL_BTS_PMU_NAME ")\n"); + return -EINVAL; + } + + if (!opts->full_auxtrace) + return 0; + + if (opts->full_auxtrace && !cpu_map__empty(cpus)) { + pr_err(INTEL_BTS_PMU_NAME " does not support per-cpu recording\n"); + return -EINVAL; + } + + /* Set default sizes for snapshot mode */ + if (opts->auxtrace_snapshot_mode) { + if (!opts->auxtrace_snapshot_size && !opts->auxtrace_mmap_pages) { + if (privileged) { + opts->auxtrace_mmap_pages = MiB(4) / page_size; + } else { + 
opts->auxtrace_mmap_pages = KiB(128) / page_size; + if (opts->mmap_pages == UINT_MAX) + opts->mmap_pages = KiB(256) / page_size; + } + } else if (!opts->auxtrace_mmap_pages && !privileged && + opts->mmap_pages == UINT_MAX) { + opts->mmap_pages = KiB(256) / page_size; + } + if (!opts->auxtrace_snapshot_size) + opts->auxtrace_snapshot_size = + opts->auxtrace_mmap_pages * (size_t)page_size; + if (!opts->auxtrace_mmap_pages) { + size_t sz = opts->auxtrace_snapshot_size; + + sz = round_up(sz, page_size) / page_size; + opts->auxtrace_mmap_pages = roundup_pow_of_two(sz); + } + if (opts->auxtrace_snapshot_size > + opts->auxtrace_mmap_pages * (size_t)page_size) { + pr_err("Snapshot size %zu must not be greater than AUX area tracing mmap size %zu\n", + opts->auxtrace_snapshot_size, + opts->auxtrace_mmap_pages * (size_t)page_size); + return -EINVAL; + } + if (!opts->auxtrace_snapshot_size || !opts->auxtrace_mmap_pages) { + pr_err("Failed to calculate default snapshot size and/or AUX area tracing mmap pages\n"); + return -EINVAL; + } + pr_debug2("Intel BTS snapshot size: %zu\n", + opts->auxtrace_snapshot_size); + } + + /* Set default sizes for full trace mode */ + if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) { + if (privileged) { + opts->auxtrace_mmap_pages = MiB(4) / page_size; + } else { + opts->auxtrace_mmap_pages = KiB(128) / page_size; + if (opts->mmap_pages == UINT_MAX) + opts->mmap_pages = KiB(256) / page_size; + } + } + + /* Validate auxtrace_mmap_pages */ + if (opts->auxtrace_mmap_pages) { + size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size; + size_t min_sz; + + if (opts->auxtrace_snapshot_mode) + min_sz = KiB(4); + else + min_sz = KiB(8); + + if (sz < min_sz || !is_power_of_2(sz)) { + pr_err("Invalid mmap size for Intel BTS: must be at least %zuKiB and a power of 2\n", + min_sz / 1024); + return -EINVAL; + } + } + + if (intel_bts_evsel) { + /* + * To obtain the auxtrace buffer file descriptor, the auxtrace event + * must come first. 
+ */ + perf_evlist__to_front(evlist, intel_bts_evsel); + /* + * In the case of per-cpu mmaps, we need the CPU on the + * AUX event. + */ + if (!cpu_map__empty(cpus)) + perf_evsel__set_sample_bit(intel_bts_evsel, CPU); + } + + /* Add dummy event to keep tracking */ + if (opts->full_auxtrace) { + struct perf_evsel *tracking_evsel; + int err; + + err = parse_events(evlist, "dummy:u", NULL); + if (err) + return err; + + tracking_evsel = perf_evlist__last(evlist); + + perf_evlist__set_tracking_event(evlist, tracking_evsel); + + tracking_evsel->attr.freq = 0; + tracking_evsel->attr.sample_period = 1; + } + + return 0; +} + +static int intel_bts_parse_snapshot_options(struct auxtrace_record *itr, + struct record_opts *opts, + const char *str) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + unsigned long long snapshot_size = 0; + char *endptr; + + if (str) { + snapshot_size = strtoull(str, &endptr, 0); + if (*endptr || snapshot_size > SIZE_MAX) + return -1; + } + + opts->auxtrace_snapshot_mode = true; + opts->auxtrace_snapshot_size = snapshot_size; + + btsr->snapshot_size = snapshot_size; + + return 0; +} + +static u64 intel_bts_reference(struct auxtrace_record *itr __maybe_unused) +{ + return rdtsc(); +} + +static int intel_bts_alloc_snapshot_refs(struct intel_bts_recording *btsr, + int idx) +{ + const size_t sz = sizeof(struct intel_bts_snapshot_ref); + int cnt = btsr->snapshot_ref_cnt, new_cnt = cnt * 2; + struct intel_bts_snapshot_ref *refs; + + if (!new_cnt) + new_cnt = 16; + + while (new_cnt <= idx) + new_cnt *= 2; + + refs = calloc(new_cnt, sz); + if (!refs) + return -ENOMEM; + + memcpy(refs, btsr->snapshot_refs, cnt * sz); + + btsr->snapshot_refs = refs; + btsr->snapshot_ref_cnt = new_cnt; + + return 0; +} + +static void intel_bts_free_snapshot_refs(struct intel_bts_recording *btsr) +{ + int i; + + for (i = 0; i < btsr->snapshot_ref_cnt; i++) + zfree(&btsr->snapshot_refs[i].ref_buf); + zfree(&btsr->snapshot_refs); +} 
+ +static void intel_bts_recording_free(struct auxtrace_record *itr) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + + intel_bts_free_snapshot_refs(btsr); + free(btsr); +} + +static int intel_bts_snapshot_start(struct auxtrace_record *itr) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + struct perf_evsel *evsel; + + evlist__for_each(btsr->evlist, evsel) { + if (evsel->attr.type == btsr->intel_bts_pmu->type) + return perf_evlist__disable_event(btsr->evlist, evsel); + } + return -EINVAL; +} + +static int intel_bts_snapshot_finish(struct auxtrace_record *itr) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + struct perf_evsel *evsel; + + evlist__for_each(btsr->evlist, evsel) { + if (evsel->attr.type == btsr->intel_bts_pmu->type) + return perf_evlist__enable_event(btsr->evlist, evsel); + } + return -EINVAL; +} + +static bool intel_bts_first_wrap(u64 *data, size_t buf_size) +{ + int i, a, b; + + b = buf_size >> 3; + a = b - 512; + if (a < 0) + a = 0; + + for (i = a; i < b; i++) { + if (data[i]) + return true; + } + + return false; +} + +static int intel_bts_find_snapshot(struct auxtrace_record *itr, int idx, + struct auxtrace_mmap *mm, unsigned char *data, + u64 *head, u64 *old) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + bool wrapped; + int err; + + pr_debug3("%s: mmap index %d old head %zu new head %zu\n", + __func__, idx, (size_t)*old, (size_t)*head); + + if (idx >= btsr->snapshot_ref_cnt) { + err = intel_bts_alloc_snapshot_refs(btsr, idx); + if (err) + goto out_err; + } + + wrapped = btsr->snapshot_refs[idx].wrapped; + if (!wrapped && intel_bts_first_wrap((u64 *)data, mm->len)) { + btsr->snapshot_refs[idx].wrapped = true; + wrapped = true; + } + + /* + * In full trace mode 'head' continually increases. However in snapshot + * mode 'head' is an offset within the buffer. 
Here 'old' and 'head' + * are adjusted to match the full trace case which expects that 'old' is + * always less than 'head'. + */ + if (wrapped) { + *old = *head; + *head += mm->len; + } else { + if (mm->mask) + *old &= mm->mask; + else + *old %= mm->len; + if (*old > *head) + *head += mm->len; + } + + pr_debug3("%s: wrap-around %sdetected, adjusted old head %zu adjusted new head %zu\n", + __func__, wrapped ? "" : "not ", (size_t)*old, (size_t)*head); + + return 0; + +out_err: + pr_err("%s: failed, error %d\n", __func__, err); + return err; +} + +static int intel_bts_read_finish(struct auxtrace_record *itr, int idx) +{ + struct intel_bts_recording *btsr = + container_of(itr, struct intel_bts_recording, itr); + struct perf_evsel *evsel; + + evlist__for_each(btsr->evlist, evsel) { + if (evsel->attr.type == btsr->intel_bts_pmu->type) + return perf_evlist__enable_event_idx(btsr->evlist, + evsel, idx); + } + return -EINVAL; +} + +struct auxtrace_record *intel_bts_recording_init(int *err) +{ + struct perf_pmu *intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME); + struct intel_bts_recording *btsr; + + if (!intel_bts_pmu) + return NULL; + + btsr = zalloc(sizeof(struct intel_bts_recording)); + if (!btsr) { + *err = -ENOMEM; + return NULL; + } + + btsr->intel_bts_pmu = intel_bts_pmu; + btsr->itr.recording_options = intel_bts_recording_options; + btsr->itr.info_priv_size = intel_bts_info_priv_size; + btsr->itr.info_fill = intel_bts_info_fill; + btsr->itr.free = intel_bts_recording_free; + btsr->itr.snapshot_start = intel_bts_snapshot_start; + btsr->itr.snapshot_finish = intel_bts_snapshot_finish; + btsr->itr.find_snapshot = intel_bts_find_snapshot; + btsr->itr.parse_snapshot_options = intel_bts_parse_snapshot_options; + btsr->itr.reference = intel_bts_reference; + btsr->itr.read_finish = intel_bts_read_finish; + btsr->itr.alignment = sizeof(struct branch); + return &btsr->itr; +} diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c index 
fd11cc3ce780..79fe07158d00 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -3,6 +3,7 @@
 #include <linux/perf_event.h>
 
 #include "../../util/intel-pt.h"
+#include "../../util/intel-bts.h"
 #include "../../util/pmu.h"
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
@@ -10,6 +11,8 @@ struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu __mayb
 #ifdef HAVE_AUXTRACE_SUPPORT
 	if (!strcmp(pmu->name, INTEL_PT_PMU_NAME))
 		return intel_pt_pmu_default_config(pmu);
+	if (!strcmp(pmu->name, INTEL_BTS_PMU_NAME))
+		pmu->selectable = true;
 #endif
 	return NULL;
 }
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index c20473d1369e..e912856cc4e5 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -80,6 +80,7 @@ libperf-y += thread-stack.o
 libperf-$(CONFIG_AUXTRACE) += auxtrace.o
 libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
 libperf-$(CONFIG_AUXTRACE) += intel-pt.o
+libperf-$(CONFIG_AUXTRACE) += intel-bts.o
 libperf-y += parse-branch-options.o
 
 libperf-$(CONFIG_LIBELF) += symbol-elf.o
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 0f0b7e11e2d9..a980e7c50ee0 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -48,6 +48,7 @@
 #include "parse-options.h"
 
 #include "intel-pt.h"
+#include "intel-bts.h"
 
 int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
 			struct auxtrace_mmap_params *mp,
@@ -888,6 +889,8 @@ int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused,
 	switch (type) {
 	case PERF_AUXTRACE_INTEL_PT:
 		return intel_pt_process_auxtrace_info(event, session);
+	case PERF_AUXTRACE_INTEL_BTS:
+		return intel_bts_process_auxtrace_info(event, session);
 	case PERF_AUXTRACE_UNKNOWN:
 	default:
 		return -EINVAL;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 7d12f33a3a06..bf72b77a588a 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -40,6 +40,7 @@ struct events_stats;
enum auxtrace_type { PERF_AUXTRACE_UNKNOWN, PERF_AUXTRACE_INTEL_PT, + PERF_AUXTRACE_INTEL_BTS, }; enum itrace_period_type { diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c new file mode 100644 index 000000000000..ea768625ab5b --- /dev/null +++ b/tools/perf/util/intel-bts.c @@ -0,0 +1,933 @@ +/* + * intel-bts.c: Intel Processor Trace support + * Copyright (c) 2013-2015, Intel Corporation. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + */ + +#include <endian.h> +#include <byteswap.h> +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/bitops.h> +#include <linux/log2.h> + +#include "cpumap.h" +#include "color.h" +#include "evsel.h" +#include "evlist.h" +#include "machine.h" +#include "session.h" +#include "util.h" +#include "thread.h" +#include "thread-stack.h" +#include "debug.h" +#include "tsc.h" +#include "auxtrace.h" +#include "intel-pt-decoder/intel-pt-insn-decoder.h" +#include "intel-bts.h" + +#define MAX_TIMESTAMP (~0ULL) + +#define INTEL_BTS_ERR_NOINSN 5 +#define INTEL_BTS_ERR_LOST 9 + +#if __BYTE_ORDER == __BIG_ENDIAN +#define le64_to_cpu bswap_64 +#else +#define le64_to_cpu +#endif + +struct intel_bts { + struct auxtrace auxtrace; + struct auxtrace_queues queues; + struct auxtrace_heap heap; + u32 auxtrace_type; + struct perf_session *session; + struct machine *machine; + bool sampling_mode; + bool snapshot_mode; + bool data_queued; + u32 pmu_type; + struct perf_tsc_conversion tc; + bool cap_user_time_zero; + struct itrace_synth_opts synth_opts; + bool sample_branches; + u32 branches_filter; + u64 
branches_sample_type; + u64 branches_id; + size_t branches_event_size; + bool synth_needs_swap; +}; + +struct intel_bts_queue { + struct intel_bts *bts; + unsigned int queue_nr; + struct auxtrace_buffer *buffer; + bool on_heap; + bool done; + pid_t pid; + pid_t tid; + int cpu; + u64 time; + struct intel_pt_insn intel_pt_insn; + u32 sample_flags; +}; + +struct branch { + u64 from; + u64 to; + u64 misc; +}; + +static void intel_bts_dump(struct intel_bts *bts __maybe_unused, + unsigned char *buf, size_t len) +{ + struct branch *branch; + size_t i, pos = 0, br_sz = sizeof(struct branch), sz; + const char *color = PERF_COLOR_BLUE; + + color_fprintf(stdout, color, + ". ... Intel BTS data: size %zu bytes\n", + len); + + while (len) { + if (len >= br_sz) + sz = br_sz; + else + sz = len; + printf("."); + color_fprintf(stdout, color, " %08x: ", pos); + for (i = 0; i < sz; i++) + color_fprintf(stdout, color, " %02x", buf[i]); + for (; i < br_sz; i++) + color_fprintf(stdout, color, " "); + if (len >= br_sz) { + branch = (struct branch *)buf; + color_fprintf(stdout, color, " %"PRIx64" -> %"PRIx64" %s\n", + le64_to_cpu(branch->from), + le64_to_cpu(branch->to), + le64_to_cpu(branch->misc) & 0x10 ? 
+ "pred" : "miss"); + } else { + color_fprintf(stdout, color, " Bad record!\n"); + } + pos += sz; + buf += sz; + len -= sz; + } +} + +static void intel_bts_dump_event(struct intel_bts *bts, unsigned char *buf, + size_t len) +{ + printf(".\n"); + intel_bts_dump(bts, buf, len); +} + +static int intel_bts_lost(struct intel_bts *bts, struct perf_sample *sample) +{ + union perf_event event; + int err; + + auxtrace_synth_error(&event.auxtrace_error, PERF_AUXTRACE_ERROR_ITRACE, + INTEL_BTS_ERR_LOST, sample->cpu, sample->pid, + sample->tid, 0, "Lost trace data"); + + err = perf_session__deliver_synth_event(bts->session, &event, NULL); + if (err) + pr_err("Intel BTS: failed to deliver error event, error %d\n", + err); + + return err; +} + +static struct intel_bts_queue *intel_bts_alloc_queue(struct intel_bts *bts, + unsigned int queue_nr) +{ + struct intel_bts_queue *btsq; + + btsq = zalloc(sizeof(struct intel_bts_queue)); + if (!btsq) + return NULL; + + btsq->bts = bts; + btsq->queue_nr = queue_nr; + btsq->pid = -1; + btsq->tid = -1; + btsq->cpu = -1; + + return btsq; +} + +static int intel_bts_setup_queue(struct intel_bts *bts, + struct auxtrace_queue *queue, + unsigned int queue_nr) +{ + struct intel_bts_queue *btsq = queue->priv; + + if (list_empty(&queue->head)) + return 0; + + if (!btsq) { + btsq = intel_bts_alloc_queue(bts, queue_nr); + if (!btsq) + return -ENOMEM; + queue->priv = btsq; + + if (queue->cpu != -1) + btsq->cpu = queue->cpu; + btsq->tid = queue->tid; + } + + if (bts->sampling_mode) + return 0; + + if (!btsq->on_heap && !btsq->buffer) { + int ret; + + btsq->buffer = auxtrace_buffer__next(queue, NULL); + if (!btsq->buffer) + return 0; + + ret = auxtrace_heap__add(&bts->heap, queue_nr, + btsq->buffer->reference); + if (ret) + return ret; + btsq->on_heap = true; + } + + return 0; +} + +static int intel_bts_setup_queues(struct intel_bts *bts) +{ + unsigned int i; + int ret; + + for (i = 0; i < bts->queues.nr_queues; i++) { + ret = intel_bts_setup_queue(bts, 
&bts->queues.queue_array[i], + i); + if (ret) + return ret; + } + return 0; +} + +static inline int intel_bts_update_queues(struct intel_bts *bts) +{ + if (bts->queues.new_data) { + bts->queues.new_data = false; + return intel_bts_setup_queues(bts); + } + return 0; +} + +static unsigned char *intel_bts_find_overlap(unsigned char *buf_a, size_t len_a, + unsigned char *buf_b, size_t len_b) +{ + size_t offs, len; + + if (len_a > len_b) + offs = len_a - len_b; + else + offs = 0; + + for (; offs < len_a; offs += sizeof(struct branch)) { + len = len_a - offs; + if (!memcmp(buf_a + offs, buf_b, len)) + return buf_b + len; + } + + return buf_b; +} + +static int intel_bts_do_fix_overlap(struct auxtrace_queue *queue, + struct auxtrace_buffer *b) +{ + struct auxtrace_buffer *a; + void *start; + + if (b->list.prev == &queue->head) + return 0; + a = list_entry(b->list.prev, struct auxtrace_buffer, list); + start = intel_bts_find_overlap(a->data, a->size, b->data, b->size); + if (!start) + return -EINVAL; + b->use_size = b->data + b->size - start; + b->use_data = start; + return 0; +} + +static int intel_bts_synth_branch_sample(struct intel_bts_queue *btsq, + struct branch *branch) +{ + int ret; + struct intel_bts *bts = btsq->bts; + union perf_event event; + struct perf_sample sample = { .ip = 0, }; + + event.sample.header.type = PERF_RECORD_SAMPLE; + event.sample.header.misc = PERF_RECORD_MISC_USER; + event.sample.header.size = sizeof(struct perf_event_header); + + sample.ip = le64_to_cpu(branch->from); + sample.pid = btsq->pid; + sample.tid = btsq->tid; + sample.addr = le64_to_cpu(branch->to); + sample.id = btsq->bts->branches_id; + sample.stream_id = btsq->bts->branches_id; + sample.period = 1; + sample.cpu = btsq->cpu; + sample.flags = btsq->sample_flags; + sample.insn_len = btsq->intel_pt_insn.length; + + if (bts->synth_opts.inject) { + event.sample.header.size = bts->branches_event_size; + ret = perf_event__synthesize_sample(&event, + bts->branches_sample_type, + 0, 
&sample, + bts->synth_needs_swap); + if (ret) + return ret; + } + + ret = perf_session__deliver_synth_event(bts->session, &event, &sample); + if (ret) + pr_err("Intel BTS: failed to deliver branch event, error %d\n", + ret); + + return ret; +} + +static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip) +{ + struct machine *machine = btsq->bts->machine; + struct thread *thread; + struct addr_location al; + unsigned char buf[1024]; + size_t bufsz; + ssize_t len; + int x86_64; + uint8_t cpumode; + int err = -1; + + bufsz = intel_pt_insn_max_size(); + + if (machine__kernel_ip(machine, ip)) + cpumode = PERF_RECORD_MISC_KERNEL; + else + cpumode = PERF_RECORD_MISC_USER; + + thread = machine__find_thread(machine, -1, btsq->tid); + if (!thread) + return -1; + + thread__find_addr_map(thread, cpumode, MAP__FUNCTION, ip, &al); + if (!al.map || !al.map->dso) + goto out_put; + + len = dso__data_read_addr(al.map->dso, al.map, machine, ip, buf, bufsz); + if (len <= 0) + goto out_put; + + /* Load maps to ensure dso->is_64_bit has been updated */ + map__load(al.map, machine->symbol_filter); + + x86_64 = al.map->dso->is_64_bit; + + if (intel_pt_get_insn(buf, len, x86_64, &btsq->intel_pt_insn)) + goto out_put; + + err = 0; +out_put: + thread__put(thread); + return err; +} + +static int intel_bts_synth_error(struct intel_bts *bts, int cpu, pid_t pid, + pid_t tid, u64 ip) +{ + union perf_event event; + int err; + + auxtrace_synth_error(&event.auxtrace_error, PERF_AUXTRACE_ERROR_ITRACE, + INTEL_BTS_ERR_NOINSN, cpu, pid, tid, ip, + "Failed to get instruction"); + + err = perf_session__deliver_synth_event(bts->session, &event, NULL); + if (err) + pr_err("Intel BTS: failed to deliver error event, error %d\n", + err); + + return err; +} + +static int intel_bts_get_branch_type(struct intel_bts_queue *btsq, + struct branch *branch) +{ + int err; + + if (!branch->from) { + if (branch->to) + btsq->sample_flags = PERF_IP_FLAG_BRANCH | + PERF_IP_FLAG_TRACE_BEGIN; + else + 
btsq->sample_flags = 0; + btsq->intel_pt_insn.length = 0; + } else if (!branch->to) { + btsq->sample_flags = PERF_IP_FLAG_BRANCH | + PERF_IP_FLAG_TRACE_END; + btsq->intel_pt_insn.length = 0; + } else { + err = intel_bts_get_next_insn(btsq, branch->from); + if (err) { + btsq->sample_flags = 0; + btsq->intel_pt_insn.length = 0; + if (!btsq->bts->synth_opts.errors) + return 0; + err = intel_bts_synth_error(btsq->bts, btsq->cpu, + btsq->pid, btsq->tid, + branch->from); + return err; + } + btsq->sample_flags = intel_pt_insn_type(btsq->intel_pt_insn.op); + /* Check for an async branch into the kernel */ + if (!machine__kernel_ip(btsq->bts->machine, branch->from) && + machine__kernel_ip(btsq->bts->machine, branch->to) && + btsq->sample_flags != (PERF_IP_FLAG_BRANCH | + PERF_IP_FLAG_CALL | + PERF_IP_FLAG_SYSCALLRET)) + btsq->sample_flags = PERF_IP_FLAG_BRANCH | + PERF_IP_FLAG_CALL | + PERF_IP_FLAG_ASYNC | + PERF_IP_FLAG_INTERRUPT; + } + + return 0; +} + +static int intel_bts_process_buffer(struct intel_bts_queue *btsq, + struct auxtrace_buffer *buffer) +{ + struct branch *branch; + size_t sz, bsz = sizeof(struct branch); + u32 filter = btsq->bts->branches_filter; + int err = 0; + + if (buffer->use_data) { + sz = buffer->use_size; + branch = buffer->use_data; + } else { + sz = buffer->size; + branch = buffer->data; + } + + if (!btsq->bts->sample_branches) + return 0; + + for (; sz > bsz; branch += 1, sz -= bsz) { + if (!branch->from && !branch->to) + continue; + intel_bts_get_branch_type(btsq, branch); + if (filter && !(filter & btsq->sample_flags)) + continue; + err = intel_bts_synth_branch_sample(btsq, branch); + if (err) + break; + } + return err; +} + +static int intel_bts_process_queue(struct intel_bts_queue *btsq, u64 *timestamp) +{ + struct auxtrace_buffer *buffer = btsq->buffer, *old_buffer = buffer; + struct auxtrace_queue *queue; + struct thread *thread; + int err; + + if (btsq->done) + return 1; + + if (btsq->pid == -1) { + thread = 
machine__find_thread(btsq->bts->machine, -1, + btsq->tid); + if (thread) + btsq->pid = thread->pid_; + } else { + thread = machine__findnew_thread(btsq->bts->machine, btsq->pid, + btsq->tid); + } + + queue = &btsq->bts->queues.queue_array[btsq->queue_nr]; + + if (!buffer) + buffer = auxtrace_buffer__next(queue, NULL); + + if (!buffer) { + if (!btsq->bts->sampling_mode) + btsq->done = 1; + err = 1; + goto out_put; + } + + /* Currently there is no support for split buffers */ + if (buffer->consecutive) { + err = -EINVAL; + goto out_put; + } + + if (!buffer->data) { + int fd = perf_data_file__fd(btsq->bts->session->file); + + buffer->data = auxtrace_buffer__get_data(buffer, fd); + if (!buffer->data) { + err = -ENOMEM; + goto out_put; + } + } + + if (btsq->bts->snapshot_mode && !buffer->consecutive && + intel_bts_do_fix_overlap(queue, buffer)) { + err = -ENOMEM; + goto out_put; + } + + if (!btsq->bts->synth_opts.callchain && thread && + (!old_buffer || btsq->bts->sampling_mode || + (btsq->bts->snapshot_mode && !buffer->consecutive))) + thread_stack__set_trace_nr(thread, buffer->buffer_nr + 1); + + err = intel_bts_process_buffer(btsq, buffer); + + auxtrace_buffer__drop_data(buffer); + + btsq->buffer = auxtrace_buffer__next(queue, buffer); + if (btsq->buffer) { + if (timestamp) + *timestamp = btsq->buffer->reference; + } else { + if (!btsq->bts->sampling_mode) + btsq->done = 1; + } +out_put: + thread__put(thread); + return err; +} + +static int intel_bts_flush_queue(struct intel_bts_queue *btsq) +{ + u64 ts = 0; + int ret; + + while (1) { + ret = intel_bts_process_queue(btsq, &ts); + if (ret < 0) + return ret; + if (ret) + break; + } + return 0; +} + +static int intel_bts_process_tid_exit(struct intel_bts *bts, pid_t tid) +{ + struct auxtrace_queues *queues = &bts->queues; + unsigned int i; + + for (i = 0; i < queues->nr_queues; i++) { + struct auxtrace_queue *queue = &bts->queues.queue_array[i]; + struct intel_bts_queue *btsq = queue->priv; + + if (btsq && btsq->tid == 
tid) + return intel_bts_flush_queue(btsq); + } + return 0; +} + +static int intel_bts_process_queues(struct intel_bts *bts, u64 timestamp) +{ + while (1) { + unsigned int queue_nr; + struct auxtrace_queue *queue; + struct intel_bts_queue *btsq; + u64 ts = 0; + int ret; + + if (!bts->heap.heap_cnt) + return 0; + + if (bts->heap.heap_array[0].ordinal > timestamp) + return 0; + + queue_nr = bts->heap.heap_array[0].queue_nr; + queue = &bts->queues.queue_array[queue_nr]; + btsq = queue->priv; + + auxtrace_heap__pop(&bts->heap); + + ret = intel_bts_process_queue(btsq, &ts); + if (ret < 0) { + auxtrace_heap__add(&bts->heap, queue_nr, ts); + return ret; + } + + if (!ret) { + ret = auxtrace_heap__add(&bts->heap, queue_nr, ts); + if (ret < 0) + return ret; + } else { + btsq->on_heap = false; + } + } + + return 0; +} + +static int intel_bts_process_event(struct perf_session *session, + union perf_event *event, + struct perf_sample *sample, + struct perf_tool *tool) +{ + struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts, + auxtrace); + u64 timestamp; + int err; + + if (dump_trace) + return 0; + + if (!tool->ordered_events) { + pr_err("Intel BTS requires ordered events\n"); + return -EINVAL; + } + + if (sample->time && sample->time != (u64)-1) + timestamp = perf_time_to_tsc(sample->time, &bts->tc); + else + timestamp = 0; + + err = intel_bts_update_queues(bts); + if (err) + return err; + + err = intel_bts_process_queues(bts, timestamp); + if (err) + return err; + if (event->header.type == PERF_RECORD_EXIT) { + err = intel_bts_process_tid_exit(bts, event->comm.tid); + if (err) + return err; + } + + if (event->header.type == PERF_RECORD_AUX && + (event->aux.flags & PERF_AUX_FLAG_TRUNCATED) && + bts->synth_opts.errors) + err = intel_bts_lost(bts, sample); + + return err; +} + +static int intel_bts_process_auxtrace_event(struct perf_session *session, + union perf_event *event, + struct perf_tool *tool __maybe_unused) +{ + struct intel_bts *bts = 
container_of(session->auxtrace, struct intel_bts, + auxtrace); + + if (bts->sampling_mode) + return 0; + + if (!bts->data_queued) { + struct auxtrace_buffer *buffer; + off_t data_offset; + int fd = perf_data_file__fd(session->file); + int err; + + if (perf_data_file__is_pipe(session->file)) { + data_offset = 0; + } else { + data_offset = lseek(fd, 0, SEEK_CUR); + if (data_offset == -1) + return -errno; + } + + err = auxtrace_queues__add_event(&bts->queues, session, event, + data_offset, &buffer); + if (err) + return err; + + /* Dump here now we have copied a piped trace out of the pipe */ + if (dump_trace) { + if (auxtrace_buffer__get_data(buffer, fd)) { + intel_bts_dump_event(bts, buffer->data, + buffer->size); + auxtrace_buffer__put_data(buffer); + } + } + } + + return 0; +} + +static int intel_bts_flush(struct perf_session *session __maybe_unused, + struct perf_tool *tool __maybe_unused) +{ + struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts, + auxtrace); + int ret; + + if (dump_trace || bts->sampling_mode) + return 0; + + if (!tool->ordered_events) + return -EINVAL; + + ret = intel_bts_update_queues(bts); + if (ret < 0) + return ret; + + return intel_bts_process_queues(bts, MAX_TIMESTAMP); +} + +static void intel_bts_free_queue(void *priv) +{ + struct intel_bts_queue *btsq = priv; + + if (!btsq) + return; + free(btsq); +} + +static void intel_bts_free_events(struct perf_session *session) +{ + struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts, + auxtrace); + struct auxtrace_queues *queues = &bts->queues; + unsigned int i; + + for (i = 0; i < queues->nr_queues; i++) { + intel_bts_free_queue(queues->queue_array[i].priv); + queues->queue_array[i].priv = NULL; + } + auxtrace_queues__free(queues); +} + +static void intel_bts_free(struct perf_session *session) +{ + struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts, + auxtrace); + + auxtrace_heap__free(&bts->heap); + intel_bts_free_events(session); + 
session->auxtrace = NULL; + free(bts); +} + +struct intel_bts_synth { + struct perf_tool dummy_tool; + struct perf_session *session; +}; + +static int intel_bts_event_synth(struct perf_tool *tool, + union perf_event *event, + struct perf_sample *sample __maybe_unused, + struct machine *machine __maybe_unused) +{ + struct intel_bts_synth *intel_bts_synth = + container_of(tool, struct intel_bts_synth, dummy_tool); + + return perf_session__deliver_synth_event(intel_bts_synth->session, + event, NULL); +} + +static int intel_bts_synth_event(struct perf_session *session, + struct perf_event_attr *attr, u64 id) +{ + struct intel_bts_synth intel_bts_synth; + + memset(&intel_bts_synth, 0, sizeof(struct intel_bts_synth)); + intel_bts_synth.session = session; + + return perf_event__synthesize_attr(&intel_bts_synth.dummy_tool, attr, 1, + &id, intel_bts_event_synth); +} + +static int intel_bts_synth_events(struct intel_bts *bts, + struct perf_session *session) +{ + struct perf_evlist *evlist = session->evlist; + struct perf_evsel *evsel; + struct perf_event_attr attr; + bool found = false; + u64 id; + int err; + + evlist__for_each(evlist, evsel) { + if (evsel->attr.type == bts->pmu_type && evsel->ids) { + found = true; + break; + } + } + + if (!found) { + pr_debug("There are no selected events with Intel BTS data\n"); + return 0; + } + + memset(&attr, 0, sizeof(struct perf_event_attr)); + attr.size = sizeof(struct perf_event_attr); + attr.type = PERF_TYPE_HARDWARE; + attr.sample_type = evsel->attr.sample_type & PERF_SAMPLE_MASK; + attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID | + PERF_SAMPLE_PERIOD; + attr.sample_type &= ~(u64)PERF_SAMPLE_TIME; + attr.sample_type &= ~(u64)PERF_SAMPLE_CPU; + attr.exclude_user = evsel->attr.exclude_user; + attr.exclude_kernel = evsel->attr.exclude_kernel; + attr.exclude_hv = evsel->attr.exclude_hv; + attr.exclude_host = evsel->attr.exclude_host; + attr.exclude_guest = evsel->attr.exclude_guest; + attr.sample_id_all = 
evsel->attr.sample_id_all; + attr.read_format = evsel->attr.read_format; + + id = evsel->id[0] + 1000000000; + if (!id) + id = 1; + + if (bts->synth_opts.branches) { + attr.config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS; + attr.sample_period = 1; + attr.sample_type |= PERF_SAMPLE_ADDR; + pr_debug("Synthesizing 'branches' event with id %" PRIu64 " sample type %#" PRIx64 "\n", + id, (u64)attr.sample_type); + err = intel_bts_synth_event(session, &attr, id); + if (err) { + pr_err("%s: failed to synthesize 'branches' event type\n", + __func__); + return err; + } + bts->sample_branches = true; + bts->branches_sample_type = attr.sample_type; + bts->branches_id = id; + /* + * We only use sample types from PERF_SAMPLE_MASK so we can use + * __perf_evsel__sample_size() here. + */ + bts->branches_event_size = sizeof(struct sample_event) + + __perf_evsel__sample_size(attr.sample_type); + } + + bts->synth_needs_swap = evsel->needs_swap; + + return 0; +} + +static const char * const intel_bts_info_fmts[] = { + [INTEL_BTS_PMU_TYPE] = " PMU Type %"PRId64"\n", + [INTEL_BTS_TIME_SHIFT] = " Time Shift %"PRIu64"\n", + [INTEL_BTS_TIME_MULT] = " Time Muliplier %"PRIu64"\n", + [INTEL_BTS_TIME_ZERO] = " Time Zero %"PRIu64"\n", + [INTEL_BTS_CAP_USER_TIME_ZERO] = " Cap Time Zero %"PRId64"\n", + [INTEL_BTS_SNAPSHOT_MODE] = " Snapshot mode %"PRId64"\n", +}; + +static void intel_bts_print_info(u64 *arr, int start, int finish) +{ + int i; + + if (!dump_trace) + return; + + for (i = start; i <= finish; i++) + fprintf(stdout, intel_bts_info_fmts[i], arr[i]); +} + +u64 intel_bts_auxtrace_info_priv[INTEL_BTS_AUXTRACE_PRIV_SIZE]; + +int intel_bts_process_auxtrace_info(union perf_event *event, + struct perf_session *session) +{ + struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info; + size_t min_sz = sizeof(u64) * INTEL_BTS_SNAPSHOT_MODE; + struct intel_bts *bts; + int err; + + if (auxtrace_info->header.size < sizeof(struct auxtrace_info_event) + + min_sz) + return -EINVAL; + + bts = 
zalloc(sizeof(struct intel_bts)); + if (!bts) + return -ENOMEM; + + err = auxtrace_queues__init(&bts->queues); + if (err) + goto err_free; + + bts->session = session; + bts->machine = &session->machines.host; /* No kvm support */ + bts->auxtrace_type = auxtrace_info->type; + bts->pmu_type = auxtrace_info->priv[INTEL_BTS_PMU_TYPE]; + bts->tc.time_shift = auxtrace_info->priv[INTEL_BTS_TIME_SHIFT]; + bts->tc.time_mult = auxtrace_info->priv[INTEL_BTS_TIME_MULT]; + bts->tc.time_zero = auxtrace_info->priv[INTEL_BTS_TIME_ZERO]; + bts->cap_user_time_zero = + auxtrace_info->priv[INTEL_BTS_CAP_USER_TIME_ZERO]; + bts->snapshot_mode = auxtrace_info->priv[INTEL_BTS_SNAPSHOT_MODE]; + + bts->sampling_mode = false; + + bts->auxtrace.process_event = intel_bts_process_event; + bts->auxtrace.process_auxtrace_event = intel_bts_process_auxtrace_event; + bts->auxtrace.flush_events = intel_bts_flush; + bts->auxtrace.free_events = intel_bts_free_events; + bts->auxtrace.free = intel_bts_free; + session->auxtrace = &bts->auxtrace; + + intel_bts_print_info(&auxtrace_info->priv[0], INTEL_BTS_PMU_TYPE, + INTEL_BTS_SNAPSHOT_MODE); + + if (dump_trace) + return 0; + + if (session->itrace_synth_opts && session->itrace_synth_opts->set) + bts->synth_opts = *session->itrace_synth_opts; + else + itrace_synth_opts__set_default(&bts->synth_opts); + + if (bts->synth_opts.calls) + bts->branches_filter |= PERF_IP_FLAG_CALL | PERF_IP_FLAG_ASYNC | + PERF_IP_FLAG_TRACE_END; + if (bts->synth_opts.returns) + bts->branches_filter |= PERF_IP_FLAG_RETURN | + PERF_IP_FLAG_TRACE_BEGIN; + + err = intel_bts_synth_events(bts, session); + if (err) + goto err_free_queues; + + err = auxtrace_queues__process_index(&bts->queues, session); + if (err) + goto err_free_queues; + + if (bts->queues.populated) + bts->data_queued = true; + + return 0; + +err_free_queues: + auxtrace_queues__free(&bts->queues); + session->auxtrace = NULL; +err_free: + free(bts); + return err; +} diff --git a/tools/perf/util/intel-bts.h 
b/tools/perf/util/intel-bts.h
new file mode 100644
index 000000000000..ca65e21b3e83
--- /dev/null
+++ b/tools/perf/util/intel-bts.h
@@ -0,0 +1,43 @@
+/*
+ * intel-bts.h: Intel Processor Trace support
+ * Copyright (c) 2013-2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ */
+
+#ifndef INCLUDE__PERF_INTEL_BTS_H__
+#define INCLUDE__PERF_INTEL_BTS_H__
+
+#define INTEL_BTS_PMU_NAME "intel_bts"
+
+enum {
+	INTEL_BTS_PMU_TYPE,
+	INTEL_BTS_TIME_SHIFT,
+	INTEL_BTS_TIME_MULT,
+	INTEL_BTS_TIME_ZERO,
+	INTEL_BTS_CAP_USER_TIME_ZERO,
+	INTEL_BTS_SNAPSHOT_MODE,
+	INTEL_BTS_AUXTRACE_PRIV_MAX,
+};
+
+#define INTEL_BTS_AUXTRACE_PRIV_SIZE (INTEL_BTS_AUXTRACE_PRIV_MAX * sizeof(u64))
+
+struct auxtrace_record;
+struct perf_tool;
+union perf_event;
+struct perf_session;
+
+struct auxtrace_record *intel_bts_recording_init(int *err);
+
+int intel_bts_process_auxtrace_info(union perf_event *event,
+				    struct perf_session *session);
+
+#endif
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 3c71138e7672..89c91a1a67e7 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -462,10 +462,6 @@ static struct perf_pmu *pmu_lookup(const char *name)
 	LIST_HEAD(aliases);
 	__u32 type;
 
-	/* No support for intel_bts so disallow it */
-	if (!strcmp(name, "intel_bts"))
-		return NULL;
-
 	/*
 	 * The pmu data we store & need consists of the pmu
 	 * type value and format definitions. Load both right
-- 
2.1.0
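
A note on the snapshot-mode decode path in patch 5/9: intel_bts_find_overlap() looks for a tail of the previous buffer that matches the start of the new one, stepping in whole branch records, and intel_bts_do_fix_overlap() then decodes only the bytes after that match. What follows is a minimal standalone sketch of that idea, not the patch's code verbatim. The names struct branch_rec and find_overlap() are illustrative assumptions; the three-u64 record layout mirrors the patch's struct branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the patch's 'struct branch' (3 x u64, 24 bytes). */
struct branch_rec {
	uint64_t from;
	uint64_t to;
	uint64_t misc;
};

/*
 * Return a pointer to the first byte of buf_b that is not already present
 * at the end of buf_a. If no tail of buf_a matches the start of buf_b,
 * buf_b itself is returned, i.e. nothing is skipped.
 */
static const unsigned char *find_overlap(const unsigned char *buf_a, size_t len_a,
					 const unsigned char *buf_b, size_t len_b)
{
	const size_t rec_sz = sizeof(struct branch_rec);
	size_t offs, len;

	assert(buf_a && buf_b);

	/* A tail of buf_a longer than buf_b can never match, so skip it. */
	offs = len_a > len_b ? len_a - len_b : 0;

	/* Try the longest candidate tail first, shrinking one record at a time. */
	for (; offs < len_a; offs += rec_sz) {
		len = len_a - offs;
		if (!memcmp(buf_a + offs, buf_b, len))
			return buf_b + len;
	}

	return buf_b;
}
```

With snapshots A = {r1, r2, r3} and B = {r2, r3, r4}, the sketch returns a pointer to r4 in B, so only the new record would be decoded. Trying the longest tail first means the largest possible overlap is the one that gets dropped.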
* [PATCH 6/9] perf tools: Put itrace options into an asciidoc include 2015-08-21 16:10 [GIT PULL 0/9] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (4 preceding siblings ...) 2015-08-21 16:10 ` [PATCH 5/9] perf tools: Add Intel BTS support Arnaldo Carvalho de Melo @ 2015-08-21 16:10 ` Arnaldo Carvalho de Melo 2015-08-21 16:10 ` [PATCH 7/9] perf tools: Add example call-graph script Arnaldo Carvalho de Melo ` (3 subsequent siblings) 9 siblings, 0 replies; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Adrian Hunter, Jiri Olsa, Arnaldo Carvalho de Melo From: Adrian Hunter <adrian.hunter@intel.com> perf script, report and inject all have the same itrace options. Put them into an asciidoc include file. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Documentation/itrace.txt | 22 ++++++++++++++++++++++ tools/perf/Documentation/perf-inject.txt | 23 +---------------------- tools/perf/Documentation/perf-report.txt | 23 +---------------------- tools/perf/Documentation/perf-script.txt | 23 +---------------------- 4 files changed, 25 insertions(+), 66 deletions(-) create mode 100644 tools/perf/Documentation/itrace.txt diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt new file mode 100644 index 000000000000..2ff946677e3b --- /dev/null +++ b/tools/perf/Documentation/itrace.txt @@ -0,0 +1,22 @@ + i synthesize instructions events + b synthesize branches events + c synthesize branches events (calls only) + r synthesize branches events (returns only) + x synthesize transactions events + e synthesize error events + d create a debug log + g synthesize a call chain (use with i or x) + + The default is all events i.e. 
the same as --itrace=ibxe + + In addition, the period (default 100000) for instructions events + can be specified in units of: + + i instructions + t ticks + ms milliseconds + us microseconds + ns nanoseconds (default) + + Also the call chain size (default 16, max. 1024) for instructions or + transactions events can be specified. diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt index b876ae312699..0c721c3e37e1 100644 --- a/tools/perf/Documentation/perf-inject.txt +++ b/tools/perf/Documentation/perf-inject.txt @@ -48,28 +48,7 @@ OPTIONS Decode Instruction Tracing data, replacing it with synthesized events. Options are: - i synthesize instructions events - b synthesize branches events - c synthesize branches events (calls only) - r synthesize branches events (returns only) - x synthesize transactions events - e synthesize error events - d create a debug log - g synthesize a call chain (use with i or x) - - The default is all events i.e. the same as --itrace=ibxe - - In addition, the period (default 100000) for instructions events - can be specified in units of: - - i instructions - t ticks - ms milliseconds - us microseconds - ns nanoseconds (default) - - Also the call chain size (default 16, max. 1024) for instructions or - transactions events can be specified. +include::itrace.txt[] SEE ALSO -------- diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index a18ba757a0ed..9c7981bfddad 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -331,28 +331,7 @@ OPTIONS --itrace:: Options for decoding instruction tracing data. 
The options are: - i synthesize instructions events - b synthesize branches events - c synthesize branches events (calls only) - r synthesize branches events (returns only) - x synthesize transactions events - e synthesize error events - d create a debug log - g synthesize a call chain (use with i or x) - - The default is all events i.e. the same as --itrace=ibxe - - In addition, the period (default 100000) for instructions events - can be specified in units of: - - i instructions - t ticks - ms milliseconds - us microseconds - ns nanoseconds (default) - - Also the call chain size (default 16, max. 1024) for instructions or - transactions events can be specified. +include::itrace.txt[] To disable decoding entirely, use --no-itrace. diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index 8e9be1f9c1dd..c0d24791a7f3 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -235,28 +235,7 @@ OPTIONS --itrace:: Options for decoding instruction tracing data. The options are: - i synthesize instructions events - b synthesize branches events - c synthesize branches events (calls only) - r synthesize branches events (returns only) - x synthesize transactions events - e synthesize error events - d create a debug log - g synthesize a call chain (use with i or x) - - The default is all events i.e. the same as --itrace=ibxe - - In addition, the period (default 100000) for instructions events - can be specified in units of: - - i instructions - t ticks - ms milliseconds - us microseconds - ns nanoseconds (default) - - Also the call chain size (default 16, max. 1024) for instructions or - transactions events can be specified. +include::itrace.txt[] To disable decoding entirely, use --no-itrace. -- 2.1.0 ^ permalink raw reply related [flat|nested] 42+ messages in thread
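Illustrative note, not part of the patch: the option letters listed in itrace.txt, and the documented default of --itrace=ibxe, can be modelled with a small lookup. The sketch below is an assumption-laden illustration, not perf's parser: parse_itrace_letters() is a hypothetical name, and it handles only the event letters. The period, unit and call chain size suffixes are deliberately not modelled because the text above does not spell out their syntax.

```python
# Letters and meanings as listed in itrace.txt.
ITRACE_FLAGS = {
    "i": "instructions",
    "b": "branches",
    "c": "branches (calls only)",
    "r": "branches (returns only)",
    "x": "transactions",
    "e": "errors",
    "d": "debug log",
    "g": "call chain",
}

# "The default is all events i.e. the same as --itrace=ibxe"
DEFAULT_ITRACE = "ibxe"


def parse_itrace_letters(opts=""):
    """Hypothetical helper: expand option letters into readable names."""
    letters = opts or DEFAULT_ITRACE
    selected = []
    for ch in letters:
        if ch not in ITRACE_FLAGS:
            raise ValueError("unsupported itrace option: %r" % ch)
        if ch not in selected:  # ignore repeated letters
            selected.append(ch)
    return [ITRACE_FLAGS[ch] for ch in selected]
```

Calling parse_itrace_letters() with no argument yields the four default event kinds; parse_itrace_letters("cr") selects calls-only and returns-only branches.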
* [PATCH 7/9] perf tools: Add example call-graph script 2015-08-21 16:10 [GIT PULL 0/9] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (5 preceding siblings ...) 2015-08-21 16:10 ` [PATCH 6/9] perf tools: Put itrace options into an asciidoc include Arnaldo Carvalho de Melo @ 2015-08-21 16:10 ` Arnaldo Carvalho de Melo 2015-08-21 16:10 ` [PATCH 8/9] perf tools: Initialize reference counts in map__clone() Arnaldo Carvalho de Melo ` (2 subsequent siblings) 9 siblings, 0 replies; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Adrian Hunter, Jiri Olsa, Arnaldo Carvalho de Melo From: Adrian Hunter <adrian.hunter@intel.com> Add a script to produce a call-graph from data exported to a postgresql database and derived from a processor trace event like intel_pt or intel_bts. Refer to comments in the scripts call-graph-from-postgresql.py and export-to-postgresql.py for more details on how to set up the environment, install the required packages, etc. Committer note: From the scripts, for convenience while reading 'git log': An example of using this script with Intel PT: $ perf record -e intel_pt//u ls $ perf script -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py pt_example branches calls 2015-05-29 12:49:23.464364 Creating database... 2015-05-29 12:49:26.281717 Writing to intermediate files... 2015-05-29 12:49:27.190383 Copying to database... 2015-05-29 12:49:28.140451 Removing intermediate files... 2015-05-29 12:49:28.147451 Adding primary keys 2015-05-29 12:49:28.655683 Adding foreign keys 2015-05-29 12:49:29.365350 Done $ python tools/perf/scripts/python/call-graph-from-postgresql.py pt_example # The result is a GUI window with a tree representing a context-sensitive # call-graph.
Expanding a couple of levels of the tree and adjusting column # widths to suit will display something like: Call Graph: pt_example Call Path |Object |Count|Time(ns)|Time(%)|Branch Count|Branch Count(%) v- ls v- 2638:2638 v- _start ld-2.19.so 1 10074071 100.0 211135 100.0 |- unknown unknown 1 13198 0.1 1 0.0 >- _dl_start ld-2.19.so 1 1400980 13.9 19637 9.3 >- _d_linit_internal ld-2.19.so 1 448152 4.4 11094 5.3 v-__libc_start_main@plt ls 1 8211741 81.5 180397 85.4 >- _dl_fixup ld-2.19.so 1 7607 0.1 108 0.1 >- __cxa_atexit libc-2.19.so 1 11737 0.1 10 0.0 >- __libc_csu_init ls 1 10354 0.1 10 0.0 |- _setjmp libc-2.19.so 1 0 0.0 4 0.0 v- main ls 1 8182043 99.6 180254 99.9 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-11-git-send-email-adrian.hunter@intel.com [ Added 'python-pyside qt-postgresql' to the yum cmdline installing required packages ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- .../scripts/python/call-graph-from-postgresql.py | 327 +++++++++++++++++++++ tools/perf/scripts/python/export-to-postgresql.py | 47 +++ 2 files changed, 374 insertions(+) create mode 100644 tools/perf/scripts/python/call-graph-from-postgresql.py diff --git a/tools/perf/scripts/python/call-graph-from-postgresql.py b/tools/perf/scripts/python/call-graph-from-postgresql.py new file mode 100644 index 000000000000..e78fdc2a5a9d --- /dev/null +++ b/tools/perf/scripts/python/call-graph-from-postgresql.py @@ -0,0 +1,327 @@ +#!/usr/bin/python2 +# call-graph-from-postgresql.py: create call-graph from postgresql database +# Copyright (c) 2014, Intel Corporation. +# +# This program is free software; you can redistribute it and/or modify it +# under the terms and conditions of the GNU General Public License, +# version 2, as published by the Free Software Foundation. 
+# +# This program is distributed in the hope it will be useful, but WITHOUT +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +# more details. + +# To use this script you will need to have exported data using the +# export-to-postgresql.py script. Refer to that script for details. +# +# Following on from the example in the export-to-postgresql.py script, a +# call-graph can be displayed for the pt_example database like this: +# +# python tools/perf/scripts/python/call-graph-from-postgresql.py pt_example +# +# Note this script supports connecting to remote databases by setting hostname, +# port, username, password, and dbname e.g. +# +# python tools/perf/scripts/python/call-graph-from-postgresql.py "hostname=myhost username=myuser password=mypassword dbname=pt_example" +# +# The result is a GUI window with a tree representing a context-sensitive +# call-graph. Expanding a couple of levels of the tree and adjusting column +# widths to suit will display something like: +# +# Call Graph: pt_example +# Call Path Object Count Time(ns) Time(%) Branch Count Branch Count(%) +# v- ls +# v- 2638:2638 +# v- _start ld-2.19.so 1 10074071 100.0 211135 100.0 +# |- unknown unknown 1 13198 0.1 1 0.0 +# >- _dl_start ld-2.19.so 1 1400980 13.9 19637 9.3 +# >- _d_linit_internal ld-2.19.so 1 448152 4.4 11094 5.3 +# v-__libc_start_main@plt ls 1 8211741 81.5 180397 85.4 +# >- _dl_fixup ld-2.19.so 1 7607 0.1 108 0.1 +# >- __cxa_atexit libc-2.19.so 1 11737 0.1 10 0.0 +# >- __libc_csu_init ls 1 10354 0.1 10 0.0 +# |- _setjmp libc-2.19.so 1 0 0.0 4 0.0 +# v- main ls 1 8182043 99.6 180254 99.9 +# +# Points to note: +# The top level is a command name (comm) +# The next level is a thread (pid:tid) +# Subsequent levels are functions +# 'Count' is the number of calls +# 'Time' is the elapsed time until the function returns +# Percentages are relative to the level above +# 'Branch Count' is the total 
number of branches for that function and all +# functions that it calls + +import sys +from PySide.QtCore import * +from PySide.QtGui import * +from PySide.QtSql import * +from decimal import * + +class TreeItem(): + + def __init__(self, db, row, parent_item): + self.db = db + self.row = row + self.parent_item = parent_item + self.query_done = False; + self.child_count = 0 + self.child_items = [] + self.data = ["", "", "", "", "", "", ""] + self.comm_id = 0 + self.thread_id = 0 + self.call_path_id = 1 + self.branch_count = 0 + self.time = 0 + if not parent_item: + self.setUpRoot() + + def setUpRoot(self): + self.query_done = True + query = QSqlQuery(self.db) + ret = query.exec_('SELECT id, comm FROM comms') + if not ret: + raise Exception("Query failed: " + query.lastError().text()) + while query.next(): + if not query.value(0): + continue + child_item = TreeItem(self.db, self.child_count, self) + self.child_items.append(child_item) + self.child_count += 1 + child_item.setUpLevel1(query.value(0), query.value(1)) + + def setUpLevel1(self, comm_id, comm): + self.query_done = True; + self.comm_id = comm_id + self.data[0] = comm + self.child_items = [] + self.child_count = 0 + query = QSqlQuery(self.db) + ret = query.exec_('SELECT thread_id, ( SELECT pid FROM threads WHERE id = thread_id ), ( SELECT tid FROM threads WHERE id = thread_id ) FROM comm_threads WHERE comm_id = ' + str(comm_id)) + if not ret: + raise Exception("Query failed: " + query.lastError().text()) + while query.next(): + child_item = TreeItem(self.db, self.child_count, self) + self.child_items.append(child_item) + self.child_count += 1 + child_item.setUpLevel2(comm_id, query.value(0), query.value(1), query.value(2)) + + def setUpLevel2(self, comm_id, thread_id, pid, tid): + self.comm_id = comm_id + self.thread_id = thread_id + self.data[0] = str(pid) + ":" + str(tid) + + def getChildItem(self, row): + return self.child_items[row] + + def getParentItem(self): + return self.parent_item + + def 
getRow(self): + return self.row + + def timePercent(self, b): + if not self.time: + return "0.0" + x = (b * Decimal(100)) / self.time + return str(x.quantize(Decimal('.1'), rounding=ROUND_HALF_UP)) + + def branchPercent(self, b): + if not self.branch_count: + return "0.0" + x = (b * Decimal(100)) / self.branch_count + return str(x.quantize(Decimal('.1'), rounding=ROUND_HALF_UP)) + + def addChild(self, call_path_id, name, dso, count, time, branch_count): + child_item = TreeItem(self.db, self.child_count, self) + child_item.comm_id = self.comm_id + child_item.thread_id = self.thread_id + child_item.call_path_id = call_path_id + child_item.branch_count = branch_count + child_item.time = time + child_item.data[0] = name + if dso == "[kernel.kallsyms]": + dso = "[kernel]" + child_item.data[1] = dso + child_item.data[2] = str(count) + child_item.data[3] = str(time) + child_item.data[4] = self.timePercent(time) + child_item.data[5] = str(branch_count) + child_item.data[6] = self.branchPercent(branch_count) + self.child_items.append(child_item) + self.child_count += 1 + + def selectCalls(self): + self.query_done = True; + query = QSqlQuery(self.db) + ret = query.exec_('SELECT id, call_path_id, branch_count, call_time, return_time, ' + '( SELECT name FROM symbols WHERE id = ( SELECT symbol_id FROM call_paths WHERE id = call_path_id ) ), ' + '( SELECT short_name FROM dsos WHERE id = ( SELECT dso_id FROM symbols WHERE id = ( SELECT symbol_id FROM call_paths WHERE id = call_path_id ) ) ), ' + '( SELECT ip FROM call_paths where id = call_path_id ) ' + 'FROM calls WHERE parent_call_path_id = ' + str(self.call_path_id) + ' AND comm_id = ' + str(self.comm_id) + ' AND thread_id = ' + str(self.thread_id) + + 'ORDER BY call_path_id') + if not ret: + raise Exception("Query failed: " + query.lastError().text()) + last_call_path_id = 0 + name = "" + dso = "" + count = 0 + branch_count = 0 + total_branch_count = 0 + time = 0 + total_time = 0 + while query.next(): + if query.value(1) == 
last_call_path_id: + count += 1 + branch_count += query.value(2) + time += query.value(4) - query.value(3) + else: + if count: + self.addChild(last_call_path_id, name, dso, count, time, branch_count) + last_call_path_id = query.value(1) + name = query.value(5) + dso = query.value(6) + count = 1 + total_branch_count += branch_count + total_time += time + branch_count = query.value(2) + time = query.value(4) - query.value(3) + if count: + self.addChild(last_call_path_id, name, dso, count, time, branch_count) + total_branch_count += branch_count + total_time += time + # Top level does not have time or branch count, so fix that here + if total_branch_count > self.branch_count: + self.branch_count = total_branch_count + if self.branch_count: + for child_item in self.child_items: + child_item.data[6] = self.branchPercent(child_item.branch_count) + if total_time > self.time: + self.time = total_time + if self.time: + for child_item in self.child_items: + child_item.data[4] = self.timePercent(child_item.time) + + def childCount(self): + if not self.query_done: + self.selectCalls() + return self.child_count + + def columnCount(self): + return 7 + + def columnHeader(self, column): + headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "] + return headers[column] + + def getData(self, column): + return self.data[column] + +class TreeModel(QAbstractItemModel): + + def __init__(self, db, parent=None): + super(TreeModel, self).__init__(parent) + self.db = db + self.root = TreeItem(db, 0, None) + + def columnCount(self, parent): + return self.root.columnCount() + + def rowCount(self, parent): + if parent.isValid(): + parent_item = parent.internalPointer() + else: + parent_item = self.root + return parent_item.childCount() + + def headerData(self, section, orientation, role): + if role == Qt.TextAlignmentRole: + if section > 1: + return Qt.AlignRight + if role != Qt.DisplayRole: + return None + if orientation != Qt.Horizontal: + 
return None + return self.root.columnHeader(section) + + def parent(self, child): + child_item = child.internalPointer() + if child_item is self.root: + return QModelIndex() + parent_item = child_item.getParentItem() + return self.createIndex(parent_item.getRow(), 0, parent_item) + + def index(self, row, column, parent): + if parent.isValid(): + parent_item = parent.internalPointer() + else: + parent_item = self.root + child_item = parent_item.getChildItem(row) + return self.createIndex(row, column, child_item) + + def data(self, index, role): + if role == Qt.TextAlignmentRole: + if index.column() > 1: + return Qt.AlignRight + if role != Qt.DisplayRole: + return None + index_item = index.internalPointer() + return index_item.getData(index.column()) + +class MainWindow(QMainWindow): + + def __init__(self, db, dbname, parent=None): + super(MainWindow, self).__init__(parent) + + self.setObjectName("MainWindow") + self.setWindowTitle("Call Graph: " + dbname) + self.move(100, 100) + self.resize(800, 600) + style = self.style() + icon = style.standardIcon(QStyle.SP_MessageBoxInformation) + self.setWindowIcon(icon); + + self.model = TreeModel(db) + + self.view = QTreeView() + self.view.setModel(self.model) + + self.setCentralWidget(self.view) + +if __name__ == '__main__': + if (len(sys.argv) < 2): + print >> sys.stderr, "Usage is: call-graph-from-postgresql.py <database name>" + raise Exception("Too few arguments") + + dbname = sys.argv[1] + + db = QSqlDatabase.addDatabase('QPSQL') + + opts = dbname.split() + for opt in opts: + if '=' in opt: + opt = opt.split('=') + if opt[0] == 'hostname': + db.setHostName(opt[1]) + elif opt[0] == 'port': + db.setPort(int(opt[1])) + elif opt[0] == 'username': + db.setUserName(opt[1]) + elif opt[0] == 'password': + db.setPassword(opt[1]) + elif opt[0] == 'dbname': + dbname = opt[1] + else: + dbname = opt + + db.setDatabaseName(dbname) + if not db.open(): + raise Exception("Failed to open database " + dbname + " error: " + 
db.lastError().text()) + + app = QApplication(sys.argv) + window = MainWindow(db, dbname) + window.show() + err = app.exec_() + db.close() + sys.exit(err) diff --git a/tools/perf/scripts/python/export-to-postgresql.py b/tools/perf/scripts/python/export-to-postgresql.py index 4cdafd880074..84a32037a80f 100644 --- a/tools/perf/scripts/python/export-to-postgresql.py +++ b/tools/perf/scripts/python/export-to-postgresql.py @@ -15,6 +15,53 @@ import sys import struct import datetime +# To use this script you will need to have installed package python-pyside which +# provides LGPL-licensed Python bindings for Qt. You will also need the package +# libqt4-sql-psql for Qt postgresql support. +# +# The script assumes postgresql is running on the local machine and that the +# user has postgresql permissions to create databases. Examples of installing +# postgresql and adding such a user are: +# +# fedora: +# +# $ sudo yum install postgresql postgresql-server python-pyside qt-postgresql +# $ sudo su - postgres -c initdb +# $ sudo service postgresql start +# $ sudo su - postgres +# $ createuser <your user id here> +# Shall the new role be a superuser? (y/n) y +# +# ubuntu: +# +# $ sudo apt-get install postgresql +# $ sudo su - postgres +# $ createuser <your user id here> +# Shall the new role be a superuser? (y/n) y +# +# An example of using this script with Intel PT: +# +# $ perf record -e intel_pt//u ls +# $ perf script -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py pt_example branches calls +# 2015-05-29 12:49:23.464364 Creating database... +# 2015-05-29 12:49:26.281717 Writing to intermediate files... +# 2015-05-29 12:49:27.190383 Copying to database... +# 2015-05-29 12:49:28.140451 Removing intermediate files... +# 2015-05-29 12:49:28.147451 Adding primary keys +# 2015-05-29 12:49:28.655683 Adding foreign keys +# 2015-05-29 12:49:29.365350 Done +# +# To browse the database, psql can be used e.g. 
+# +# $ psql pt_example +# pt_example=# select * from samples_view where id < 100; +# pt_example=# \d+ +# pt_example=# \d+ samples_view +# pt_example=# \q +# +# An example of using the database is provided by the script +# call-graph-from-postgresql.py. Refer to that script for details. + from PySide.QtSql import * # Need to access PostgreSQL C library directly to use COPY FROM STDIN -- 2.1.0 ^ permalink raw reply related [flat|nested] 42+ messages in thread
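Illustrative note, not part of the patch: the Time(%) and Branch Count(%) columns in the call-graph script come from timePercent() and branchPercent(), which scale a child's value against its parent's total and round to one decimal place with ROUND_HALF_UP. A minimal standalone sketch of that arithmetic follows. The function name percent_of() is hypothetical, and the sketch is written to run on current Python rather than the python2 interpreter the script targets.

```python
from decimal import Decimal, ROUND_HALF_UP


def percent_of(part, total):
    """Return part/total as a percentage string with one decimal place.

    Mirrors the script's behaviour of reporting "0.0" when the parent
    total is zero, so the division is never attempted in that case.
    """
    if not total:
        return "0.0"
    x = (Decimal(part) * Decimal(100)) / Decimal(total)
    return str(x.quantize(Decimal(".1"), rounding=ROUND_HALF_UP))
```

Using the figures from the example output, percent_of(1400980, 10074071) gives "13.9" for _dl_start under _start, and percent_of(8182043, 8211741) gives "99.6" for main under __libc_start_main@plt, which illustrates the note that percentages are relative to the level above.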
* [PATCH 8/9] perf tools: Initialize reference counts in map__clone() 2015-08-21 16:10 [GIT PULL 0/9] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (6 preceding siblings ...) 2015-08-21 16:10 ` [PATCH 7/9] perf tools: Add example call-graph script Arnaldo Carvalho de Melo @ 2015-08-21 16:10 ` Arnaldo Carvalho de Melo 2015-08-21 16:10 ` [PATCH 9/9] perf probe: Try to use symbol table if searching debug info failed Arnaldo Carvalho de Melo 2015-08-22 6:47 ` [GIT PULL 0/9] perf/core improvements and fixes Ingo Molnar 9 siblings, 0 replies; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Borislav Petkov, David Ahern, Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Stephane Eranian From: Arnaldo Carvalho de Melo <acme@redhat.com> Map clone was written before we introduced reference counts for maps and dsos, so all that was needed was just a copy and then we would insert it into the new map_groups instance. Fix it by, after copying, initializing the map->refcnt, grabbing a struct dso refcount and resetting pointers that may be used to determine if a map, when deleted, is in a rb_tree. 
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pd4mr80o5b9gvk50iineacec@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/map.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c index ce37e95bc513..b1c475d9b240 100644 --- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -348,9 +348,18 @@ struct symbol *map__find_symbol_by_name(struct map *map, const char *name, return dso__find_symbol_by_name(map->dso, map->type, name); } -struct map *map__clone(struct map *map) +struct map *map__clone(struct map *from) { - return memdup(map, sizeof(*map)); + struct map *map = memdup(from, sizeof(*map)); + + if (map != NULL) { + atomic_set(&map->refcnt, 1); + RB_CLEAR_NODE(&map->rb_node); + dso__get(map->dso); + map->groups = NULL; + } + + return map; } int map__overlap(struct map *l, struct map *r) -- 2.1.0 ^ permalink raw reply related [flat|nested] 42+ messages in thread
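Illustrative note, not part of the patch: the problem with a bare memdup() is that the copy inherits the source map's reference count, tree linkage and groups pointer, and shares the dso without taking a reference on it. The Python model below sketches the fixed sequence under stated assumptions: the Dso and Map classes and the in_tree flag are hypothetical stand-ins for struct dso, struct map and the rb_node state, not the perf data structures themselves.

```python
import copy


class Dso:
    """Hypothetical stand-in for struct dso with a reference count."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1

    def get(self):
        # Counterpart of dso__get(): take one more reference.
        self.refcnt += 1
        return self


class Map:
    """Hypothetical stand-in for struct map."""

    def __init__(self, dso):
        self.dso = dso.get()
        self.refcnt = 1
        self.groups = None
        self.in_tree = False  # models whether rb_node is linked in a tree


def map_clone(src):
    new = copy.copy(src)  # shallow copy, like memdup(): shares src.dso
    new.refcnt = 1        # atomic_set(&map->refcnt, 1)
    new.in_tree = False   # RB_CLEAR_NODE(&map->rb_node)
    new.dso.get()         # dso__get(map->dso)
    new.groups = None     # map->groups = NULL
    return new
```

Without the four resets, a clone of a map that is held three times and linked into a tree would itself claim three references and appear to be linked, which is the state the patch description says must be reset.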
* [PATCH 9/9] perf probe: Try to use symbol table if searching debug info failed 2015-08-21 16:10 [GIT PULL 0/9] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (7 preceding siblings ...) 2015-08-21 16:10 ` [PATCH 8/9] perf tools: Initialize reference counts in map__clone() Arnaldo Carvalho de Melo @ 2015-08-21 16:10 ` Arnaldo Carvalho de Melo 2015-08-22 6:47 ` [GIT PULL 0/9] perf/core improvements and fixes Ingo Molnar 9 siblings, 0 replies; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-08-21 16:10 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Wang Nan, Alexei Starovoitov, Brendan Gregg, Daniel Borkmann, David Ahern, He Kuang, Jiri Olsa, Kaixu Xia, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, Zefan Li, pi3orama, Arnaldo Carvalho de Melo From: Wang Nan <wangnan0@huawei.com> A problem can occur in a statically linked perf when vmlinux can be found: # perf probe --add sys_epoll_pwait probe-definition(0): sys_epoll_pwait symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (7 entries long) Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_epoll_pwait address found : ffffffff8122bd40 Matched function: SyS_epoll_pwait Failed to get call frame on 0xffffffff8122bd40 An error occurred in debuginfo analysis (-2). Error: Failed to add events. Reason: No such file or directory (Code: -2) The cause is libdw: when it is statically linked, it can't reliably load libebl_{arch}.so. In this case it is still possible to get the address from /proc/kallsyms. However, perf tries that only when libdw returns -EBADF. This patch gives it another chance to use the symbol table, even if libdw returns an error code other than -EBADF.
After applying this patch: # perf probe -nv --add sys_epoll_pwait probe-definition(0): sys_epoll_pwait symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (7 entries long) Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_epoll_pwait address found : ffffffff8122bd40 Matched function: SyS_epoll_pwait Failed to get call frame on 0xffffffff8122bd40 An error occurred in debuginfo analysis (-2). Trying to use symbols. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Added new event: Writing event: p:probe/sys_epoll_pwait _text+2276672 probe:sys_epoll_pwait (on sys_epoll_pwait) You can now use it in all perf tools, such as: perf record -e probe:sys_epoll_pwait -aR sleep 1 Although libdw returns an error (Failed to get call frame), perf falls back to the symbol table and gets the correct address. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440151770-129878-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/probe-event.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c index fe4941a94a25..f07374bc9c5a 100644 --- a/tools/perf/util/probe-event.c +++ b/tools/perf/util/probe-event.c @@ -705,9 +705,10 @@ static int try_to_find_probe_trace_events(struct perf_probe_event
*pev, } /* Error path : ntevs < 0 */ pr_debug("An error occurred in debuginfo analysis (%d).\n", ntevs); - if (ntevs == -EBADF) { - pr_warning("Warning: No dwarf info found in the vmlinux - " - "please rebuild kernel with CONFIG_DEBUG_INFO=y.\n"); + if (ntevs < 0) { + if (ntevs == -EBADF) + pr_warning("Warning: No dwarf info found in the vmlinux - " + "please rebuild kernel with CONFIG_DEBUG_INFO=y.\n"); if (!need_dwarf) { pr_debug("Trying to use symbols.\n"); return 0; -- 2.1.0 ^ permalink raw reply related [flat|nested] 42+ messages in thread
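Illustrative note, not part of the patch: the behavioural change in the hunk is small. Before it, the "Trying to use symbols" path was reachable only when the debuginfo search returned -EBADF; after it, any negative return reaches that path as long as dwarf is not required. The Python sketch below models only the part of the error path visible in the hunk. The function name falls_back_to_symbols() and its patched parameter are hypothetical, and whatever the C function does after this point when need_dwarf is set is not modelled.

```python
import errno


def falls_back_to_symbols(ntevs, need_dwarf, patched=True):
    """Model of the visible error path: does perf try the symbol table?"""
    if ntevs >= 0:
        return False  # not the error path at all
    if need_dwarf:
        return False  # the probe definition cannot be served by symbols
    if patched:
        return True   # any debuginfo error now allows the fallback
    return ntevs == -errno.EBADF  # old behaviour: only "no dwarf info"
```

The failing case in the commit message is ntevs == -2 (ENOENT, from "Failed to get call frame"): falls_back_to_symbols(-errno.ENOENT, False, patched=False) is False, while the same call with patched=True is True, matching the "Trying to use symbols." line in the second log.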
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2015-08-21 16:10 [GIT PULL 0/9] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (8 preceding siblings ...) 2015-08-21 16:10 ` [PATCH 9/9] perf probe: Try to use symbol table if searching debug info failed Arnaldo Carvalho de Melo @ 2015-08-22 6:47 ` Ingo Molnar 9 siblings, 0 replies; 42+ messages in thread From: Ingo Molnar @ 2015-08-22 6:47 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Adrian Hunter, Alexei Starovoitov, Borislav Petkov, Brendan Gregg, Daniel Borkmann, David Ahern, Dean Nelson, Frederic Weisbecker, He Kuang, Jiri Olsa, Kaixu Xia, Li Zhang, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama, Stephane Eranian, Sukadev Bhattiprolu, Wang Nan, Zefan Li, Arnaldo Carvalho de Melo * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > Hi Ingo, > > Please consider pulling, > > - Arnaldo > > The following changes since commit 82819ffb42fb45197bacf3223191deca31d3eb91: > > perf/x86/msr: Fix the MSR driver build (2015-08-21 08:17:01 +0200) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo > > for you to fetch changes up to 1c0bd0e891aaed0219010bfe79b32e1b0b82d662: > > perf probe: Try to use symbol table if searching debug info failed (2015-08-21 12:57:20 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes: > > User visible: > > - Fix segfault using 'perf script --show-mmap-events', affects > only current perf/core (Adrian Hunter). 
>
> - /proc/kcore requires CAP_SYS_RAWIO message too noisy, make it
>   debug only (Adrian Hunter)
>
> - Fix Intel PT timestamp handling (Adrian Hunter)
>
> - Add Intel BTS support, with a call-graph script to show it and
>   PT in use in a GUI using 'perf script' python scripting with
>   postgresql and Qt (Adrian Hunter)
>
> - Add checks for returned EVENT_ERROR type in libtraceevent, fixing
>   a bug that surfaced on arm64 systems (Dean Nelson)
>
> - Fallback to using kallsyms when libdw fails to handle a vmlinux file,
>   that can happen, for instance, when perf is statically linked and
>   then libdw fails to load libebl_{arch}.so (Wang Nan)
>
> Infrastructure:
>
> - Initialize reference counts in map__clone() (Arnaldo Carvalho de Melo)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Adrian Hunter (6):
>       perf script: Fix segfault using --show-mmap-events
>       perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisy
>       perf tools: Fix Intel PT timestamp handling
>       perf tools: Add Intel BTS support
>       perf tools: Put itrace options into an asciidoc include
>       perf tools: Add example call-graph script
>
> Arnaldo Carvalho de Melo (1):
>       perf tools: Initialize reference counts in map__clone()
>
> Dean Nelson (1):
>       tools lib traceevent: Add checks for returned EVENT_ERROR type
>
> Wang Nan (1):
>       perf probe: Try to use symbol table if searching debug info failed
>
>  tools/lib/traceevent/event-parse.c                |   9 +
>  tools/perf/Documentation/intel-bts.txt            |  86 ++
>  tools/perf/Documentation/itrace.txt               |  22 +
>  tools/perf/Documentation/perf-inject.txt          |  23 +-
>  tools/perf/Documentation/perf-report.txt          |  23 +-
>  tools/perf/Documentation/perf-script.txt          |  23 +-
>  tools/perf/arch/x86/util/Build                    |   1 +
>  tools/perf/arch/x86/util/auxtrace.c               |  49 +-
>  tools/perf/arch/x86/util/intel-bts.c              | 458 ++++++++++
>  tools/perf/arch/x86/util/pmu.c                    |   3 +
>  .../scripts/python/call-graph-from-postgresql.py  | 327 ++++++++
>  tools/perf/scripts/python/export-to-postgresql.py |  47 ++
>  tools/perf/util/Build                             |   1 +
>  tools/perf/util/annotate.c                        |   1 +
>  tools/perf/util/auxtrace.c                        |   3 +
>  tools/perf/util/auxtrace.h                        |   1 +
>  tools/perf/util/evlist.c                          |   2 +-
>  tools/perf/util/intel-bts.c                       | 933 +++++++++++++++++++++
>  tools/perf/util/intel-bts.h                       |  43 +
>  tools/perf/util/intel-pt.c                        |   2 +-
>  tools/perf/util/map.c                             |  13 +-
>  tools/perf/util/pmu.c                             |   4 -
>  tools/perf/util/probe-event.c                     |   7 +-
>  tools/perf/util/symbol.c                          |   4 +-
>  24 files changed, 2004 insertions(+), 81 deletions(-)
>  create mode 100644 tools/perf/Documentation/intel-bts.txt
>  create mode 100644 tools/perf/Documentation/itrace.txt
>  create mode 100644 tools/perf/arch/x86/util/intel-bts.c
>  create mode 100644 tools/perf/scripts/python/call-graph-from-postgresql.py
>  create mode 100644 tools/perf/util/intel-bts.c
>  create mode 100644 tools/perf/util/intel-bts.h

Pulled, thanks a lot Arnaldo!

        Ingo
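The "Initialize reference counts in map__clone()" item in the pull request quoted above is easier to picture with a toy model. The sketch below is Python, not the C from tools/perf/util/map.c, and the class and function names (Map, map_clone) are made up for illustration. It only shows the general hazard such a fix addresses: a clone made by copying an object byte for byte inherits the original's reference count unless the count is reset.

```python
import copy


class Map:
    """Toy stand-in for a reference-counted object (hypothetical, not perf's struct map)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end
        self.refcnt = 1  # the creator holds the first reference

    def get(self):
        """Take an extra reference."""
        self.refcnt += 1
        return self

    def put(self):
        """Drop a reference; return True when the last one is gone."""
        self.refcnt -= 1
        return self.refcnt == 0


def map_clone(original):
    """Copy a map, giving the copy its own fresh reference count."""
    clone = copy.copy(original)  # shallow copy: refcnt is copied along with everything else
    clone.refcnt = 1             # the clone starts with exactly one owner
    return clone
```

Without the reset, a clone of a map that had two holders would need two put() calls before it could be released, even though only one caller owns it.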
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2017-04-01 2:10 Arnaldo Carvalho de Melo
  2017-04-01 10:44 ` Ingo Molnar
From: Arnaldo Carvalho de Melo @ 2017-04-01 2:10 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
    Alexander Shishkin, Alexis Berlemont, Al Viro, Colin King,
    David Ahern, David Howells, Hemant Kumar, Jan Stancek, Jiri Olsa,
    Kan Liang, kernel-janitors, Krister Johansen,
    Luis Claudio Gonçalves, Masami Hiramatsu, Michael Ellerman,
    Namhyung Kim, Naveen N . Rao, Peter Zijlstra, Ravi Bangoria,
    Wang Nan, Yao Jin, Arnaldo Carvalho de Melo

Hi Ingo,

        Please consider pulling,

- Arnaldo

Test results at the end of this message, as usual.

The following changes since commit 3906a13a6b4e78fbc0def03a808f091f0dff1b44:

  Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-28 07:44:43 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170331

for you to fetch changes up to fd5cead23f54697310bd565aa2a23ae5128080a0:

  perf trace: Beautify statx syscall 'flag' and 'mask' arguments (2017-03-31 14:42:31 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- Beautify the statx syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo)

  e.g.:

  System wide strace like session:

  # trace -e statx
   16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
   36050.891 ( 0.007 ms): statx/4576 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
  ^C#

User visible:

- Handle unpaired raw_syscalls:sys_exit events in 'perf trace', i.e. we
  shouldn't try to calculate duration or print the timestamp for a missing
  matching raw_syscalls:sys_enter (Arnaldo Carvalho de Melo)

- Do not print "cycles: 0" in perf report LBR lines in platforms not
  supporting 'cycles', such as Intel's Broadwell (Jin Yao)

- Handle missing $HOME env var (Jiri Olsa)

- Map 8-bit registers (al, bl, etc), not supported in uprobes_events, to
  the next best thing (ax, bx, etc) supported (Ravi Bangoria)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (4):
      perf tools: Remove support for command aliases
      perf trace: Handle unpaired raw_syscalls:sys_exit event
      tools include uapi: Grab copies of stat.h and fcntl.h
      perf trace: Beautify statx syscall 'flag' and 'mask' arguments

Colin Ian King (1):
      perf utils: Fix spelling mistake: "Invalud" -> "Invalid"

Jin Yao (1):
      perf report: Drop cycles 0 for LBR print

Jiri Olsa (1):
      perf tools: Do not fail in case of empty HOME env variable

Ravi Bangoria (2):
      perf/sdt/x86: Add renaming logic for (missing) 8 bit registers
      perf/sdt/x86: Move OP parser to tools/perf/arch/x86/

 tools/include/linux/types.h                       |   1 +
 tools/include/uapi/linux/fcntl.h                  |  72 +++++++++
 tools/include/uapi/linux/stat.h                   | 176 ++++++++++++++++++++
 tools/perf/Build                                  |   1 +
 tools/perf/MANIFEST                               |   2 +
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl |   1 +
 tools/perf/arch/x86/util/perf_regs.c              | 187 ++++++++++++++++++----
 tools/perf/builtin-help.c                         |  13 --
 tools/perf/builtin-trace.c                        |  57 ++++---
 tools/perf/check-headers.sh                       |   2 +
 tools/perf/perf.c                                 |  97 +----------
 tools/perf/trace/beauty/Build                     |   1 +
 tools/perf/trace/beauty/beauty.h                  |  24 +++
 tools/perf/trace/beauty/statx.c                   |  72 +++++++++
 tools/perf/util/Build                             |   1 -
 tools/perf/util/alias.c                           |  78 ---------
 tools/perf/util/cache.h                           |   1 -
 tools/perf/util/callchain.c                       | 111 ++++++++-----
 tools/perf/util/config.c                          |  54 ++++---
 tools/perf/util/help-unknown-cmd.c                |   8 +-
 tools/perf/util/hist.c                            |   2 +-
 tools/perf/util/perf_regs.c                       |   6 +-
 tools/perf/util/perf_regs.h                       |  11 +-
 tools/perf/util/probe-file.c                      | 132 +++++----------
 24 files changed, 707 insertions(+), 403 deletions(-)
 create mode 100644 tools/include/uapi/linux/fcntl.h
 create mode 100644 tools/include/uapi/linux/stat.h
 create mode 100644 tools/perf/trace/beauty/Build
 create mode 100644 tools/perf/trace/beauty/beauty.h
 create mode 100644 tools/perf/trace/beauty/statx.c
 delete mode 100644 tools/perf/util/alias.c

Test results:

The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/,
ditto. Where clang is available, it is also used to build perf
with/without libelf.

For this specific pull request the samples/bpf/ was disabled, as 'make
headers_install' is failing with the following error, in this case in
fedora:rawhide:

    INSTALL usr/include/uapi/ (0 file)
  /git/linux/scripts/Makefile.headersinst:62: *** Missing generated UAPI file ./arch/x86/include/generated/uapi/asm/unistd_32.h. Stop.
  make[1]: *** [/git/linux/Makefile:1151: headers_install] Error 2
  make[1]: Leaving directory '/tmp/build/linux'
  make: *** [Makefile:152: sub-make] Error 2
  make: Leaving directory '/git/linux'

I'll investigate later, perf and objtool builds just fine, with clang and
gcc.

Several are cross builds, the ones with -x-ARCH, and the android one, and
those may not have all the features built, due to lack of multi-arch devel
packages, available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.

The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf
commands with a variety of command line event specifications to then
intercept the sys_perf_event syscall to check that the perf_event_attr
fields are set up as expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build
tools/perf/ with a variety of feature sets, exercising the build with an
incomplete set of features as well as with a complete one. It is planned
to have it run on each of the containers mentioned above, using some
container orchestration infrastructure. Get in contact if interested in
helping having this in place.

  # dm
   1 alpine:3.4: Ok
   2 alpine:3.5: Ok
   3 alpine:edge: Ok
   4 android-ndk:r12b-arm: Ok
   5 archlinux:latest: Ok
   6 centos:5: Ok
   7 centos:6: Ok
   8 centos:7: Ok
   9 debian:7: Ok
  10 debian:8: Ok
  11 debian:9: Ok
  12 debian:experimental: Ok
  13 debian:experimental-x-arm64: Ok
  14 debian:experimental-x-mips: Ok
  15 debian:experimental-x-mips64: Ok
  16 debian:experimental-x-mipsel: Ok
  17 fedora:20: Ok
  18 fedora:21: Ok
  19 fedora:22: Ok
  20 fedora:23: Ok
  21 fedora:24: Ok
  22 fedora:24-x-ARC-uClibc: Ok
  23 fedora:25: Ok
  24 fedora:rawhide: Ok
  25 mageia:5: Ok
  26 opensuse:13.2: Ok
  27 opensuse:42.1: Ok
  28 opensuse:tumbleweed: Ok
  29 ubuntu:12.04.5: Ok
  30 ubuntu:14.04.4: Ok
  31 ubuntu:14.04.4-x-linaro-arm64: Ok
  32 ubuntu:15.10: Ok
  33 ubuntu:16.04: Ok
  34 ubuntu:16.04-x-arm: Ok
  35 ubuntu:16.04-x-arm64: Ok
  36 ubuntu:16.04-x-powerpc: Ok
  37 ubuntu:16.04-x-powerpc64: Ok
  38 ubuntu:16.04-x-s390: Ok
  39 ubuntu:16.10: Ok
  40 ubuntu:17.04: Ok
  #
  # uname -a
  Linux jouet 4.11.0-rc2+ #5 SMP Mon Mar 20 18:12:29 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
  # 'perf test tsc' already fixed by peterz in tip
  # perf test
   1: vmlinux symtab matches kallsyms : Ok
   2: Detect openat syscall event : Ok
   3: Detect openat syscall event on all cpus : Ok
   4: Read samples using the mmap interface : Ok
   5: Parse event definition strings : Ok
   6: Simple expression parser : Ok
   7: PERF_RECORD_* events & perf_sample fields : Ok
   8: Parse perf pmu format : Ok
   9: DSO data read : Ok
  10: DSO data cache : Ok
  11: DSO data reopen : Ok
  12: Roundtrip evsel->name : Ok
  13: Parse sched tracepoints fields : Ok
  14: syscalls:sys_enter_openat event fields : Ok
  15: Setup struct perf_event_attr : Ok
  16: Match and link multiple hists : Ok
  17: 'import perf' in python : Ok
  18: Breakpoint overflow signal handler : Ok
  19: Breakpoint overflow sampling : Ok
  20: Number of exit events of a simple workload : Ok
  21: Software clock events period values : Ok
  22: Object code reading : Ok
  23: Sample parsing : Ok
  24: Use a dummy software event to keep tracking: Ok
  25: Parse with no sample_id_all bit set : Ok
  26: Filter hist entries : Ok
  27: Lookup mmap thread : Ok
  28: Share thread mg : Ok
  29: Sort output of hist entries : Ok
  30: Cumulate child hist entries : Ok
  31: Track with sched_switch : Ok
  32: Filter fds with revents mask in a fdarray : Ok
  33: Add fd to a fdarray, making it autogrow : Ok
  34: kmod_path__parse : Ok
  35: Thread map : Ok
  36: LLVM search and compile :
  36.1: Basic BPF llvm compile : Ok
  36.2: kbuild searching : Ok
  36.3: Compile source for BPF prologue generation: Ok
  36.4: Compile source for BPF relocation : Ok
  37: Session topology : Ok
  38: BPF filter :
  38.1: Basic BPF filtering : Ok
  38.2: BPF pinning : Ok
  38.3: BPF prologue generation : Ok
  38.4: BPF relocation checker : Ok
  39: Synthesize thread map : Ok
  40: Remove thread map : Ok
  41: Synthesize cpu map : Ok
  42: Synthesize stat config : Ok
  43: Synthesize stat : Ok
  44: Synthesize stat round : Ok
  45: Synthesize attr update : Ok
  46: Event times : Ok
  47: Read backward ring buffer : Ok
  48: Print cpu map : Ok
  49: Probe SDT events : Ok
  50: is_printable_array : Ok
  51: Print bitmap : Ok
  52: perf hooks : Ok
  53: builtin clang support : Skip (not compiled in)
  54: unit_number__scnprintf : Ok
  55: x86 rdpmc : Ok
  56: Convert perf time to TSC : FAILED!
  57: DWARF unwind : Ok
  58: x86 instruction decoder - new instructions : Ok
  59: Intel cqm nmi context read : Skip

  $ make -C tools/perf build-test
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
  make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
  make_no_gtk2_O: make NO_GTK2=1
  make_no_newt_O: make NO_NEWT=1
  make_debug_O: make DEBUG=1
  make_no_backtrace_O: make NO_BACKTRACE=1
  make_with_clangllvm_O: make LIBCLANGLLVM=1
  make_tags_O: make tags
  make_no_demangle_O: make NO_DEMANGLE=1
  make_install_bin_O: make install-bin
  make_no_libpython_O: make NO_LIBPYTHON=1
  make_no_slang_O: make NO_SLANG=1
  make_no_auxtrace_O: make NO_AUXTRACE=1
  make_util_pmu_bison_o_O: make util/pmu-bison.o
  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
  make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
  make_no_libbpf_O: make NO_LIBBPF=1
  make_with_babeltrace_O: make LIBBABELTRACE=1
  make_util_map_o_O: make util/map.o
  make_static_O: make LDFLAGS=-static
  make_help_O: make help
  make_pure_O: make
  make_perf_o_O: make perf.o
  make_no_libnuma_O: make NO_LIBNUMA=1
  make_no_libperl_O: make NO_LIBPERL=1
  make_no_libunwind_O: make NO_LIBUNWIND=1
  make_no_libbionic_O: make NO_LIBBIONIC=1
  make_doc_O: make doc
  make_no_libelf_O: make NO_LIBELF=1
  make_clean_all_O: make clean all
  make_install_prefix_O: make install prefix=/tmp/krava
  make_no_libaudit_O: make NO_LIBAUDIT=1
  make_install_O: make install
  make_install_prefix_slash_O: make install prefix=/tmp/krava/
  OK
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $
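As a rough illustration of the statx 'flags' and 'mask' beautification listed under "New features" in the pull request above, the sketch below turns raw integers into the OR-ed names seen in the sample 'trace -e statx' output. It is Python rather than the C of tools/perf/trace/beauty/statx.c, and the helper name beautify_bits and the table names are made up for this note. The bit values are the STATX_* and AT_* constants from the uapi stat.h and fcntl.h headers; how the real patch walks them is not shown in the thread, so treat this as an assumption about the general approach.

```python
# (bit, printed name) pairs; values follow the uapi STATX_* mask constants.
STATX_MASK_BITS = [
    (0x001, "TYPE"), (0x002, "MODE"), (0x004, "NLINK"), (0x008, "UID"),
    (0x010, "GID"), (0x020, "ATIME"), (0x040, "MTIME"), (0x080, "CTIME"),
    (0x100, "INO"), (0x200, "SIZE"), (0x400, "BLOCKS"), (0x800, "BTIME"),
]

# Values follow the uapi AT_* flags accepted by statx(), with the AT_ prefix dropped.
STATX_FLAG_BITS = [
    (0x0100, "SYMLINK_NOFOLLOW"), (0x0800, "NO_AUTOMOUNT"),
    (0x1000, "EMPTY_PATH"), (0x2000, "STATX_FORCE_SYNC"),
    (0x4000, "STATX_DONT_SYNC"),
]


def beautify_bits(value, table):
    """Render an integer bitmask as 'NAME|NAME', keeping unknown bits as hex."""
    names = []
    for bit, name in table:
        if value & bit:
            names.append(name)
            value &= ~bit  # clear the bit so leftovers can be reported
    if value:
        names.append(hex(value))
    return "|".join(names) if names else "0"
```

For example, beautify_bits(0x4100, STATX_FLAG_BITS) yields the same "SYMLINK_NOFOLLOW|STATX_DONT_SYNC" string that appears in the second trace line.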
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2017-04-01 2:10 Arnaldo Carvalho de Melo
@ 2017-04-01 10:44 ` Ingo Molnar
From: Ingo Molnar @ 2017-04-01 10:44 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexander Shishkin, Alexis Berlemont,
    Al Viro, Colin King, David Ahern, David Howells, Hemant Kumar,
    Jan Stancek, Jiri Olsa, Kan Liang, kernel-janitors,
    Krister Johansen, Luis Claudio Gonçalves, Masami Hiramatsu,
    Michael Ellerman, Namhyung Kim, Naveen N . Rao, Peter Zijlstra,
    Ravi Bangoria, Wang Nan, Yao Jin, Arnaldo Carvalho de Melo

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
>       Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 3906a13a6b4e78fbc0def03a808f091f0dff1b44:
>
>   Merge tag 'perf-core-for-mingo-4.12-20170327' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-03-28 07:44:43 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170331
>
> for you to fetch changes up to fd5cead23f54697310bd565aa2a23ae5128080a0:
>
>   perf trace: Beautify statx syscall 'flag' and 'mask' arguments (2017-03-31 14:42:31 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> New features:
>
> - Beautify the statx syscall arguments in 'perf trace' (Arnaldo Carvalho de Melo)
>
>   e.g.:
>
>   System wide strace like session:
>
>   # trace -e statx
>    16612.967 ( 0.028 ms): statx/4562 statx(dfd: CWD, filename: /tmp/statx, flags: SYMLINK_NOFOLLOW, mask: TYPE|MODE|NLINK|UID|GID|ATIME|MTIME|CTIME|INO|SIZE|BLOCKS|BTIME, buffer: 0x7ffef195d660) = 0
>    36050.891 ( 0.007 ms): statx/4576 statx(dfd: CWD, filename: /etc/passwd, flags: SYMLINK_NOFOLLOW|STATX_DONT_SYNC, mask: BTIME, buffer: 0x7ffda9bf50f0) = 0
>   ^C#
>
> User visible:
>
> - Handle unpaired raw_syscalls:sys_exit events in 'perf trace', i.e. we
>   shouldn't try to calculate duration or print the timestamp for a missing
>   matching raw_syscalls:sys_enter (Arnaldo Carvalho de Melo)
>
> - Do not print "cycles: 0" in perf report LBR lines in platforms not
>   supporting 'cycles', such as Intel's Broadwell (Jin Yao)
>
> - Handle missing $HOME env var (Jiri Olsa)
>
> - Map 8-bit registers (al, bl, etc), not supported in uprobes_events, to
>   the next best thing (ax, bx, etc) supported (Ravi Bangoria)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (4):
>       perf tools: Remove support for command aliases
>       perf trace: Handle unpaired raw_syscalls:sys_exit event
>       tools include uapi: Grab copies of stat.h and fcntl.h
>       perf trace: Beautify statx syscall 'flag' and 'mask' arguments
>
> Colin Ian King (1):
>       perf utils: Fix spelling mistake: "Invalud" -> "Invalid"
>
> Jin Yao (1):
>       perf report: Drop cycles 0 for LBR print
>
> Jiri Olsa (1):
>       perf tools: Do not fail in case of empty HOME env variable
>
> Ravi Bangoria (2):
>       perf/sdt/x86: Add renaming logic for (missing) 8 bit registers
>       perf/sdt/x86: Move OP parser to tools/perf/arch/x86/
>
>  tools/include/linux/types.h                       |   1 +
>  tools/include/uapi/linux/fcntl.h                  |  72 +++++++++
>  tools/include/uapi/linux/stat.h                   | 176 ++++++++++++++++++++
>  tools/perf/Build                                  |   1 +
>  tools/perf/MANIFEST                               |   2 +
>  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl |   1 +
>  tools/perf/arch/x86/util/perf_regs.c              | 187 ++++++++++++++++++----
>  tools/perf/builtin-help.c                         |  13 --
>  tools/perf/builtin-trace.c                        |  57 ++++---
>  tools/perf/check-headers.sh                       |   2 +
>  tools/perf/perf.c                                 |  97 +----------
>  tools/perf/trace/beauty/Build                     |   1 +
>  tools/perf/trace/beauty/beauty.h                  |  24 +++
>  tools/perf/trace/beauty/statx.c                   |  72 +++++++++
>  tools/perf/util/Build                             |   1 -
>  tools/perf/util/alias.c                           |  78 ---------
>  tools/perf/util/cache.h                           |   1 -
>  tools/perf/util/callchain.c                       | 111 ++++++++-----
>  tools/perf/util/config.c                          |  54 ++++---
>  tools/perf/util/help-unknown-cmd.c                |   8 +-
>  tools/perf/util/hist.c                            |   2 +-
>  tools/perf/util/perf_regs.c                       |   6 +-
>  tools/perf/util/perf_regs.h                       |  11 +-
>  tools/perf/util/probe-file.c                      | 132 +++++----------
>  24 files changed, 707 insertions(+), 403 deletions(-)
>  create mode 100644 tools/include/uapi/linux/fcntl.h
>  create mode 100644 tools/include/uapi/linux/stat.h
>  create mode 100644 tools/perf/trace/beauty/Build
>  create mode 100644 tools/perf/trace/beauty/beauty.h
>  create mode 100644 tools/perf/trace/beauty/statx.c
>  delete mode 100644 tools/perf/util/alias.c

Pulled, thanks a lot Arnaldo!

        Ingo
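The "Map 8-bit registers (al, bl, etc) ... to the next best thing (ax, bx, etc)" item quoted above amounts to a name substitution on SDT operands. A minimal sketch of that idea follows. It is Python, not the C from tools/perf/arch/x86/util/perf_regs.c; the table contents beyond the al/bl to ax/bx pairs named in the message, and the function name rename_sdt_register, are assumptions made for illustration.

```python
# Hypothetical renaming table: 8-bit x86 register -> wider register name.
# Only al->ax and bl->bx are named in the pull request; the rest are assumed.
EIGHT_BIT_TO_WIDER = {
    "al": "ax", "bl": "bx", "cl": "cx", "dl": "dx",
    "sil": "si", "dil": "di", "bpl": "bp", "spl": "sp",
}


def rename_sdt_register(operand):
    """Swap an 8-bit register name for its wider counterpart, keeping any '%' prefix."""
    prefix = "%" if operand.startswith("%") else ""
    name = operand[len(prefix):]
    # Registers that are not in the table are passed through untouched.
    return prefix + EIGHT_BIT_TO_WIDER.get(name, name)
```

The substitution is lossy by design: the wider register holds more than the original 8 bits, which is why the message calls it "the next best thing".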
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2015-12-11 14:22 Arnaldo Carvalho de Melo
  2015-12-14 8:32 ` Ingo Molnar
From: Arnaldo Carvalho de Melo @ 2015-12-11 14:22 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
    Alexei Starovoitov, Brendan Gregg, David Ahern, David S . Miller,
    Jiri Olsa, Masami Hiramatsu, Namhyung Kim, Peter Zijlstra,
    pi3orama, Wang Nan, Zefan Li, Arnaldo Carvalho de Melo

Hi Ingo,

        Please consider pulling,

- Arnaldo

The following changes since commit e7a7865cc0da306542db0b9205cb0a467f59e33d:

  perf symbols: Fix dso__load_sym to put dso (2015-12-10 16:29:32 -0300)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to 93b0ba3c60da89043ce2b9f601cd2b3da408903b:

  perf tools: Clear struct machine during machine__init() (2015-12-11 09:32:41 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Fix 'perf top' annotation in --stdio (Namhyung Kim)

- Support hw breakpoint events (mem:0xAddress) in the default output mode in
  'perf script' (Wang Nan)

Infrastructure:

- Do not hold the hists lock while emitting one specific warning (Namhyung Kim)

- Fetch map names from correct strtab, worked so far because llvm/clang
  uses just one string table (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Namhyung Kim (4):
      perf top: Do not convert address for perf_top__record_precise_ip()
      perf top: Access hists->lock only if needed
      perf top: Fix annotation on --stdio
      perf top: Cleanup condition in perf_top__record_precise_ip()

Wang Nan (5):
      tools lib bpf: Check return value of strdup when reading map names
      tools lib bpf: Fetch map names from correct strtab
      perf data: Add u32_hex data type
      perf script: Add support for PERF_TYPE_BREAKPOINT
      perf tools: Clear struct machine during machine__init()

 tools/lib/bpf/libbpf.c            | 24 +++++++++++++-----
 tools/perf/builtin-script.c       | 14 +++++++++++
 tools/perf/builtin-top.c          | 52 +++++++++++++++++----------------------
 tools/perf/util/data-convert-bt.c |  2 ++
 tools/perf/util/machine.c         |  1 +
 5 files changed, 57 insertions(+), 36 deletions(-)
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2015-12-11 14:22 Arnaldo Carvalho de Melo
@ 2015-12-14 8:32 ` Ingo Molnar
From: Ingo Molnar @ 2015-12-14 8:32 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexei Starovoitov, Brendan Gregg,
    David Ahern, David S . Miller, Jiri Olsa, Masami Hiramatsu,
    Namhyung Kim, Peter Zijlstra, pi3orama, Wang Nan, Zefan Li,
    Arnaldo Carvalho de Melo

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
>       Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit e7a7865cc0da306542db0b9205cb0a467f59e33d:
>
>   perf symbols: Fix dso__load_sym to put dso (2015-12-10 16:29:32 -0300)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
>
> for you to fetch changes up to 93b0ba3c60da89043ce2b9f601cd2b3da408903b:
>
>   perf tools: Clear struct machine during machine__init() (2015-12-11 09:32:41 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Fix 'perf top' annotation in --stdio (Namhyung Kim)
>
> - Support hw breakpoint events (mem:0xAddress) in the default output mode in
>   'perf script' (Wang Nan)
>
> Infrastructure:
>
> - Do not hold the hists lock while emitting one specific warning (Namhyung Kim)
>
> - Fetch map names from correct strtab, worked so far because llvm/clang
>   uses just one string table (Wang Nan)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Namhyung Kim (4):
>       perf top: Do not convert address for perf_top__record_precise_ip()
>       perf top: Access hists->lock only if needed
>       perf top: Fix annotation on --stdio
>       perf top: Cleanup condition in perf_top__record_precise_ip()
>
> Wang Nan (5):
>       tools lib bpf: Check return value of strdup when reading map names
>       tools lib bpf: Fetch map names from correct strtab
>       perf data: Add u32_hex data type
>       perf script: Add support for PERF_TYPE_BREAKPOINT
>       perf tools: Clear struct machine during machine__init()
>
>  tools/lib/bpf/libbpf.c            | 24 +++++++++++++-----
>  tools/perf/builtin-script.c       | 14 +++++++++++
>  tools/perf/builtin-top.c          | 52 +++++++++++++++++----------------------
>  tools/perf/util/data-convert-bt.c |  2 ++
>  tools/perf/util/machine.c         |  1 +
>  5 files changed, 57 insertions(+), 36 deletions(-)

Pulled, thanks a lot Arnaldo!

        Ingo
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2015-09-23 1:57 Arnaldo Carvalho de Melo
  2015-09-23 7:45 ` Ingo Molnar
From: Arnaldo Carvalho de Melo @ 2015-09-23 1:57 UTC
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
    Alexei Starovoitov, Borislav Petkov, Brendan Gregg,
    Daniel Borkmann, David Ahern, Frederic Weisbecker, He Kuang,
    H . Peter Anvin, Jiri Olsa, Kaixu Xia, Masami Hiramatsu,
    Matt Fleming, Milian Wolff, Namhyung Kim, Paul Mackerras,
    Peter Zijlstra, pi3orama, Raphael Beamonte, Stephane Eranian,
    Steven Rostedt, Thomas Gleixner, Vinson Lee, Wang Nan, Zefan Li,
    Arnaldo Carvalho de Melo

Hi Ingo,

        Please consider pulling,

- Arnaldo

The following changes since commit 96f3eda67fcf2598e9d2794398e0e7ab35138ea6:

  perf/x86/intel: Fix static checker warning in lbr enable (2015-09-18 09:24:57 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to e803cf97a4f90d31bcc2c9a1ea20fe9cdc12b2f9:

  perf record: Synthesize COMM event for a command line workload (2015-09-22 22:43:12 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Fix a segfault in 'perf probe' when removing uprobe events (Masami Hiramatsu)

- Synthesize COMM event for workloads started from the command line in 'perf
  record' so that we can have the pid->comm mapping before we get the real
  PERF_RECORD_COMM switching from perf to the workload (Namhyung Kim)

- Fix build tools/vm/ due to removal of tools/lib/api/fs/debugfs.h
  (Arnaldo Carvalho de Melo)

Developer stuff:

- Fix the make tarball targets by including the recently added err.h header in
  the perf MANIFEST file (Jiri Olsa)

- Don't assume that the event parser returns a non empty evlist (Wang Nan)

- Add way to disambiguate feature detection state files, needed to use
  tools/build feature detection for multiple components in a single O= output
  dir, which will be the case with tools/perf/ and tools/lib/bpf/
  (Arnaldo Carvalho de Melo)

- Fixup FEATURE_{TESTS,DISPLAY} inversion in tools/lib/bpf/ (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (5):
      tools build: Fixup feature detection display function name
      tools lib bpf: Fix up FEATURE_{TESTS,DISPLAY} usage
      tools build: Allow setting the feature detection user
      tools lib bpf: Use FEATURE_USER to allow building in the same dir as perf
      tools vm: Fix build due to removal of tools/lib/api/fs/debugfs.h

Jiri Olsa (1):
      perf tools: Add include/err.h into MANIFEST

Masami Hiramatsu (1):
      perf probe: Fix a segfault when removing uprobe events

Namhyung Kim (1):
      perf record: Synthesize COMM event for a command line workload

Wang Nan (1):
      perf tools: Don't assume that the parser returns non empty evsel list

 tools/build/Makefile.feature   |  9 +++++----
 tools/lib/bpf/Makefile         |  5 +++--
 tools/perf/MANIFEST            |  1 +
 tools/perf/builtin-probe.c     |  7 +++++--
 tools/perf/builtin-record.c    | 15 ++++++++++++++-
 tools/perf/util/event.c        |  2 +-
 tools/perf/util/event.h        |  5 +++++
 tools/perf/util/parse-events.c | 16 ++++++++++++++++
 tools/vm/page-types.c          |  6 +++---
 9 files changed, 53 insertions(+), 13 deletions(-)
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2015-09-23 1:57 Arnaldo Carvalho de Melo
@ 2015-09-23 7:45 ` Ingo Molnar
From: Ingo Molnar @ 2015-09-23 7:45 UTC
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexei Starovoitov, Borislav Petkov,
    Brendan Gregg, Daniel Borkmann, David Ahern, Frederic Weisbecker,
    He Kuang, H . Peter Anvin, Jiri Olsa, Kaixu Xia, Masami Hiramatsu,
    Matt Fleming, Milian Wolff, Namhyung Kim, Paul Mackerras,
    Peter Zijlstra, pi3orama, Raphael Beamonte, Stephane Eranian,
    Steven Rostedt, Thomas Gleixner, Vinson Lee, Wang Nan, Zefan Li,
    Arnaldo Carvalho de Melo

* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
>       Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 96f3eda67fcf2598e9d2794398e0e7ab35138ea6:
>
>   perf/x86/intel: Fix static checker warning in lbr enable (2015-09-18 09:24:57 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
>
> for you to fetch changes up to e803cf97a4f90d31bcc2c9a1ea20fe9cdc12b2f9:
>
>   perf record: Synthesize COMM event for a command line workload (2015-09-22 22:43:12 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Fix a segfault in 'perf probe' when removing uprobe events (Masami Hiramatsu)
>
> - Synthesize COMM event for workloads started from the command line in 'perf
>   record' so that we can have the pid->comm mapping before we get the real
>   PERF_RECORD_COMM switching from perf to the workload (Namhyung Kim)
>
> - Fix build tools/vm/ due to removal of tools/lib/api/fs/debugfs.h
>   (Arnaldo Carvalho de Melo)
>
> Developer stuff:
>
> - Fix the make tarball targets by including the recently added err.h header in
>   the perf MANIFEST file (Jiri Olsa)
>
> - Don't assume that the event parser returns a non empty evlist (Wang Nan)
>
> - Add way to disambiguate feature detection state files, needed to use
>   tools/build feature detection for multiple components in a single O= output
>   dir, which will be the case with tools/perf/ and tools/lib/bpf/
>   (Arnaldo Carvalho de Melo)
>
> - Fixup FEATURE_{TESTS,DISPLAY} inversion in tools/lib/bpf/ (Arnaldo Carvalho de Melo)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (5):
>       tools build: Fixup feature detection display function name
>       tools lib bpf: Fix up FEATURE_{TESTS,DISPLAY} usage
>       tools build: Allow setting the feature detection user
>       tools lib bpf: Use FEATURE_USER to allow building in the same dir as perf
>       tools vm: Fix build due to removal of tools/lib/api/fs/debugfs.h
>
> Jiri Olsa (1):
>       perf tools: Add include/err.h into MANIFEST
>
> Masami Hiramatsu (1):
>       perf probe: Fix a segfault when removing uprobe events
>
> Namhyung Kim (1):
>       perf record: Synthesize COMM event for a command line workload
>
> Wang Nan (1):
>       perf tools: Don't assume that the parser returns non empty evsel list
>
>  tools/build/Makefile.feature   |  9 +++++----
>  tools/lib/bpf/Makefile         |  5 +++--
>  tools/perf/MANIFEST            |  1 +
>  tools/perf/builtin-probe.c     |  7 +++++--
>  tools/perf/builtin-record.c    | 15 ++++++++++++++-
>  tools/perf/util/event.c        |  2 +-
>  tools/perf/util/event.h        |  5 +++++
>  tools/perf/util/parse-events.c | 16 ++++++++++++++++
>  tools/vm/page-types.c          |  6 +++---
>  9 files changed, 53 insertions(+), 13 deletions(-)

Pulled, thanks a lot Arnaldo!

        Ingo
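The reason for synthesizing a COMM event up front, as described in the "User visible" list of the pull request above, can be shown with a small model of how a consumer resolves pid to comm while reading events in order. This is a sketch only: the event tuples, the function names and the ":pid" placeholder used for an unknown comm are assumptions for illustration, not the perf.data format or the code in builtin-record.c.

```python
def resolve_comms(events):
    """Label each sample with the comm known for its pid at that point in the stream.

    events: iterable of ("comm", pid, name) or ("sample", pid, symbol) tuples.
    """
    comm_by_pid = {}
    labelled = []
    for kind, pid, payload in events:
        if kind == "comm":
            comm_by_pid[pid] = payload  # a later COMM event overrides an earlier one
        elif kind == "sample":
            # Assumed placeholder when no COMM has been seen yet for this pid.
            labelled.append((payload, comm_by_pid.get(pid, ":%d" % pid)))
    return labelled


def with_synthesized_comm(events, pid, comm):
    """Prepend a synthesized COMM record so early samples already have a name."""
    return [("comm", pid, comm)] + list(events)
```

With a stream where a sample precedes the real COMM record, resolve_comms() labels that first sample with the placeholder, while the same stream passed through with_synthesized_comm() gets a name for every sample.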
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2015-08-10 20:56 Arnaldo Carvalho de Melo
  2015-08-12 10:18 ` Ingo Molnar
  0 siblings, 1 reply; 42+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-10 20:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen,
      Borislav Petkov, David Ahern, Frederic Weisbecker, Jiri Olsa,
      Kan Liang, Namhyung Kim, Peter Zijlstra, Stephane Eranian,
      Arnaldo Carvalho de Melo

Hi Ingo,

        Please consider pulling,

- Arnaldo

The following changes since commit f1d800bf615b84ca253af372d2dac8cdef743a20:

  Merge tag 'perf-ebpf-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2015-08-08 10:05:17 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo

for you to fetch changes up to 4605bb55b91449a1a953a51f0334d3bc02351adb:

  perf evlist: Be more specific on -F/--freq (2015-08-10 17:20:26 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Introduce 'srcfile' sort key: (Andi Kleen)

  # perf record -F 10000 usleep 1
  # perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
  <SNIP>
  # Overhead  Source File
      26.49%  copy_page_64.S
       5.49%  signal.c
       0.51%  msr.h
  #

  It can be combined with other fields, for instance, experiment with
  '-s srcfile,symbol'.

  There are some oddities in some distros and with some specific DSOs, being
  investigated, so your mileage may vary.

- Update the column width for the "srcline" sort key (Arnaldo Carvalho de Melo)

- Support per-event 'freq' term: (Namhyung Kim)

  $ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
  $ perf evlist -F
  cpu/instructions,freq=1234/: sample_freq=1234
  cycles: sample_period=1000
  $

Infrastructure:

- Move perf_counts struct and functions into separate object (Jiri Olsa)

- Unset perf_event_attr::freq when period term is set (Jiri Olsa)

- Move callchain option parsing code to util.c (Kan Liang)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andi Kleen (2):
      perf tools: Support full source file paths for srcline
      perf report: Add support for srcfile sort key

Arnaldo Carvalho de Melo (2):
      perf hists: hist_entry__cmp() may use he_tmp.hists, initialize it
      perf hists: Update the column width for the "srcline" sort key

Jiri Olsa (2):
      perf stat: Move perf_counts struct and functions into separate object
      perf tools: Unset perf_event_attr::freq when period term is set

Kan Liang (1):
      perf callchain: Move option parsing code to util.c

Namhyung Kim (2):
      perf record: Support per-event freq term
      perf evlist: Be more specific on -F/--freq

 tools/perf/Documentation/perf-record.txt |  1 +
 tools/perf/Documentation/perf-report.txt |  4 ++
 tools/perf/Documentation/perf-script.txt |  3 ++
 tools/perf/builtin-report.c              |  2 +
 tools/perf/builtin-script.c              |  2 +
 tools/perf/builtin-stat.c                |  1 +
 tools/perf/util/Build                    |  1 +
 tools/perf/util/callchain.c              | 89 +------------------------------
 tools/perf/util/callchain.h              |  1 +
 tools/perf/util/counts.c                 | 52 ++++++++++++++++++
 tools/perf/util/counts.h                 | 37 +++++++++++++
 tools/perf/util/evsel.c                  | 14 ++++-
 tools/perf/util/evsel.h                  |  4 +-
 tools/perf/util/hist.c                   |  9 ++++
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/parse-events.c           |  6 +++
 tools/perf/util/parse-events.h           |  1 +
 tools/perf/util/parse-events.l           |  1 +
 tools/perf/util/pmu.c                    |  2 +-
 tools/perf/util/python-ext-sources       |  2 +-
 tools/perf/util/sort.c                   | 52 ++++++++++++++++++
 tools/perf/util/sort.h                   |  2 +
 tools/perf/util/srcline.c                |  6 ++-
 tools/perf/util/stat.c                   | 49 -----------------
 tools/perf/util/stat.h                   | 30 -----------
 tools/perf/util/util.c                   | 90 ++++++++++++++++++++++++++++++++
 tools/perf/util/util.h                   |  3 ++
 27 files changed, 292 insertions(+), 173 deletions(-)
 create mode 100644 tools/perf/util/counts.c
 create mode 100644 tools/perf/util/counts.h

^ permalink raw reply	[flat|nested] 42+ messages in thread
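The interplay between the per-event 'freq' term and the global '-c 1000' period in the pull request above can be modelled in a few lines of Python. This is only an illustrative sketch, not perf's parser: `parse_event` and `resolve_sampling` are hypothetical names, the term syntax is reduced to `key=number` pairs after the event name, and the rule that an explicit per-event term wins over the command-line default is inferred from the 'perf evlist -F' output quoted in the message.

```python
def parse_event(spec):
    """Split 'pmu/name,key=val,.../' (or a bare event name) into (name, terms)."""
    terms = {}
    if "/" in spec:
        _pmu, body = spec.split("/", 1)
        parts = body.rstrip("/").split(",")
        name = parts[0]
        for part in parts[1:]:
            key, _, value = part.partition("=")
            terms[key] = int(value)
    else:
        name = spec
    return name, terms


def resolve_sampling(spec, default_period=None, default_freq=None):
    """Return ('sample_freq' | 'sample_period', value) for one event."""
    _, terms = parse_event(spec)
    # A per-event term takes priority over the command-line default.
    if "freq" in terms:
        return "sample_freq", terms["freq"]
    if "period" in terms:
        return "sample_period", terms["period"]
    if default_period is not None:
        return "sample_period", default_period
    return "sample_freq", default_freq


# Mirrors: perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000
for spec in ["cpu/instructions,freq=1234/", "cycles"]:
    kind, value = resolve_sampling(spec, default_period=1000)
    print(f"{spec}: {kind}={value}")
```

An event is sampled either by frequency or by period, never both, which is why the sketch returns a single tagged value per event.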
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2015-08-10 20:56 Arnaldo Carvalho de Melo
@ 2015-08-12 10:18 ` Ingo Molnar
  0 siblings, 0 replies; 42+ messages in thread
From: Ingo Molnar @ 2015-08-12 10:18 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Adrian Hunter, Andi Kleen, Borislav Petkov,
      David Ahern, Frederic Weisbecker, Jiri Olsa, Kan Liang,
      Namhyung Kim, Peter Zijlstra, Stephane Eranian,
      Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> Hi Ingo,
>
>         Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit f1d800bf615b84ca253af372d2dac8cdef743a20:
>
>   Merge tag 'perf-ebpf-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2015-08-08 10:05:17 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
>
> for you to fetch changes up to 4605bb55b91449a1a953a51f0334d3bc02351adb:
>
>   perf evlist: Be more specific on -F/--freq (2015-08-10 17:20:26 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Introduce 'srcfile' sort key: (Andi Kleen)
>
>   # perf record -F 10000 usleep 1
>   # perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
>   <SNIP>
>   # Overhead  Source File
>       26.49%  copy_page_64.S
>        5.49%  signal.c
>        0.51%  msr.h
>   #
>
>   It can be combined with other fields, for instance, experiment with
>   '-s srcfile,symbol'.
>
>   There are some oddities in some distros and with some specific DSOs, being
>   investigated, so your mileage may vary.
>
> - Update the column width for the "srcline" sort key (Arnaldo Carvalho de Melo)
>
> - Support per-event 'freq' term: (Namhyung Kim)
>
>   $ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
>   $ perf evlist -F
>   cpu/instructions,freq=1234/: sample_freq=1234
>   cycles: sample_period=1000
>   $
>
> Infrastructure:
>
> - Move perf_counts struct and functions into separate object (Jiri Olsa)
>
> - Unset perf_event_attr::freq when period term is set (Jiri Olsa)
>
> - Move callchain option parsing code to util.c (Kan Liang)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Andi Kleen (2):
>       perf tools: Support full source file paths for srcline
>       perf report: Add support for srcfile sort key
>
> Arnaldo Carvalho de Melo (2):
>       perf hists: hist_entry__cmp() may use he_tmp.hists, initialize it
>       perf hists: Update the column width for the "srcline" sort key
>
> Jiri Olsa (2):
>       perf stat: Move perf_counts struct and functions into separate object
>       perf tools: Unset perf_event_attr::freq when period term is set
>
> Kan Liang (1):
>       perf callchain: Move option parsing code to util.c
>
> Namhyung Kim (2):
>       perf record: Support per-event freq term
>       perf evlist: Be more specific on -F/--freq
>
>  tools/perf/Documentation/perf-record.txt |  1 +
>  tools/perf/Documentation/perf-report.txt |  4 ++
>  tools/perf/Documentation/perf-script.txt |  3 ++
>  tools/perf/builtin-report.c              |  2 +
>  tools/perf/builtin-script.c              |  2 +
>  tools/perf/builtin-stat.c                |  1 +
>  tools/perf/util/Build                    |  1 +
>  tools/perf/util/callchain.c              | 89 +------------------------------
>  tools/perf/util/callchain.h              |  1 +
>  tools/perf/util/counts.c                 | 52 ++++++++++++++++++
>  tools/perf/util/counts.h                 | 37 +++++++++++++
>  tools/perf/util/evsel.c                  | 14 ++++-
>  tools/perf/util/evsel.h                  |  4 +-
>  tools/perf/util/hist.c                   |  9 ++++
>  tools/perf/util/hist.h                   |  1 +
>  tools/perf/util/parse-events.c           |  6 +++
>  tools/perf/util/parse-events.h           |  1 +
>  tools/perf/util/parse-events.l           |  1 +
>  tools/perf/util/pmu.c                    |  2 +-
>  tools/perf/util/python-ext-sources       |  2 +-
>  tools/perf/util/sort.c                   | 52 ++++++++++++++++++
>  tools/perf/util/sort.h                   |  2 +
>  tools/perf/util/srcline.c                |  6 ++-
>  tools/perf/util/stat.c                   | 49 -----------------
>  tools/perf/util/stat.h                   | 30 -----------
>  tools/perf/util/util.c                   | 90 ++++++++++++++++++++++++++++++++
>  tools/perf/util/util.h                   |  3 ++
>  27 files changed, 292 insertions(+), 173 deletions(-)
>  create mode 100644 tools/perf/util/counts.c
>  create mode 100644 tools/perf/util/counts.h

Pulled, thanks a lot Arnaldo!

        Ingo

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2014-06-19 16:38 Jiri Olsa
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Olsa @ 2014-06-19 16:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Corey Ashford, David Ahern,
      Frederic Weisbecker, Jan Kiszka, Jiri Olsa, Namhyung Kim,
      Paul Mackerras, Peter Zijlstra, Simon Que, Steven Rostedt

hi Ingo,
please consider pulling

thanks,
jirka


The following changes since commit 4ba96195051be30160af6d5f5f83f9a055ab1f23:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-06-13 08:19:06 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo

for you to fetch changes up to 26d664a30d08002ac0a4dbd09217ea084da07bc9:

  perf symbols: Get kernel start address by symbol name (2014-06-19 18:18:38 +0200)

----------------------------------------------------------------
perf/core improvements and fixes:

  . Updates from trace-cmd for traceevent plugin_kvm plus args cleanup (Steven Rostedt)

  . Fix kernel start address lookup in report code (Simon Que)

  . Fix segfault in cumulative.callchain report (Jiri Olsa)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>

----------------------------------------------------------------
Jan Kiszka (3):
      tools lib traceevent: Report unknown VMX exit reasons with code
      tools lib traceevent: Factor out print_exit_reason in kvm plugin
      tools lib traceevent: Fix and cleanup kvm_nested_vmexit tracepoints

Jiri Olsa (1):
      perf tools: Fix segfault in cumulative.callchain report

Simon Que (1):
      perf symbols: Get kernel start address by symbol name

Steven Rostedt (3):
      tools lib traceevent: Fix format in plugin_kvm
      tools lib traceevent: Clean up format of args in cfg80211 plugin
      tools lib traceevent: Clean up format of args in jbd2 plugin

Steven Rostedt (Red Hat) (1):
      tools lib traceevent: Add back in kvm plugins nested_vmexit events

 tools/lib/traceevent/plugin_cfg80211.c |  3 +-
 tools/lib/traceevent/plugin_jbd2.c     |  6 ++--
 tools/lib/traceevent/plugin_kvm.c      | 64 +++++++++++++++++++++++++++++-----
 tools/perf/ui/browsers/hists.c         | 21 ++++++++---
 tools/perf/util/machine.c              | 54 ++++++++++++----------------
 5 files changed, 97 insertions(+), 51 deletions(-)

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2014-06-03 20:36 Jiri Olsa
  2014-06-05  8:45 ` Ingo Molnar
  0 siblings, 1 reply; 42+ messages in thread
From: Jiri Olsa @ 2014-06-03 20:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Arnaldo Carvalho de Melo,
      Arnaldo Carvalho de Melo, Cody P Schafer, Corey Ashford,
      David Ahern, David Binderman, Dianfang Zhang, Don Zickus,
      Frederic Weisbecker, Jean Delvare, Jianyu Zhan, Jiri Olsa,
      Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
      Peter Zijlstra, Sebastian Andrzej Siewior, Stephane Eranian,
      Sukadev Bhattiprolu

hi Ingo,
please consider pulling

thanks,
jirka


The following changes since commit 9b261365dd73a5014b49033327ad881708e81f33:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-06-03 20:22:40 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo

for you to fetch changes up to fc9cabeabf42d76854059e7bce81a02645e7e5ca:

  perf tools: Fix 'make help' message error (2014-06-03 21:35:12 +0200)

----------------------------------------------------------------
perf/core improvements and fixes:

  . Warn the user when trace command is not available (Arnaldo Carvalho de Melo)

  . Add warning when disabling perl scripting support due to missing devel files (Arnaldo Carvalho de Melo)

  . Consider header files outside perf directory in tags target (Sebastian Andrzej Siewior)

  . Allow overriding sysfs and proc finding with env var (Cody P Schafer)

  . Fix "==" into "=" in ui_browser__warning assignment (zhangdianfang)

  . Factor elide bool handling in sort code (Jiri Olsa)

  . Fix poll return value propagation (Jiri Olsa)

  . Fix 'make help' message error (Jianyu Zhan)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (2):
      perf trace: Warn the user when not available
      perf tools: Add warning when disabling perl scripting support due to missing devel files

Cody P Schafer (1):
      perf tools: Allow overriding sysfs and proc finding with env var

Jianyu Zhan (1):
      perf tools: Fix 'make help' message error

Jiri Olsa (3):
      perf tools: Remove elide setup for SORT_MODE__MEMORY mode
      perf tools: Move elide bool into perf_hpp_fmt struct
      perf record: Fix poll return value propagation

Sebastian Andrzej Siewior (1):
      perf tools: Consider header files outside perf directory in tags target

zhangdianfang (1):
      perf tools: Fix "==" into "=" in ui_browser__warning assignment

 tools/lib/api/fs/fs.c          |  43 ++++++++++++++++-
 tools/perf/Makefile.perf       |  13 ++++--
 tools/perf/builtin-record.c    |   6 ++-
 tools/perf/config/Makefile     |   3 +-
 tools/perf/perf.c              |   8 +++-
 tools/perf/ui/browser.c        |   2 +-
 tools/perf/ui/browsers/hists.c |   8 ++--
 tools/perf/util/hist.h         |   8 +++-
 tools/perf/util/sort.c         | 103 ++++++++++++++++++++++-------------------
 tools/perf/util/sort.h         |   2 +-
 10 files changed, 132 insertions(+), 64 deletions(-)

^ permalink raw reply	[flat|nested] 42+ messages in thread
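The "Allow overriding sysfs and proc finding with env var" item in the message above can be pictured as a lookup that consults the environment before falling back to the usual mount point. Below is a minimal Python sketch of that idea. The variable naming (filesystem name in upper case plus a `_PATH` suffix, e.g. `SYSFS_PATH`), the `find_mountpoint` helper, and the default table are all assumptions made for illustration; none of them is stated in this thread, and the real code in tools/lib/api/fs/fs.c presumably does more work (such as probing known mount points) when no override is given.

```python
import os

# Conventional mount points, used only when no override is present.
DEFAULT_MOUNTPOINTS = {"sysfs": "/sys", "proc": "/proc"}


def find_mountpoint(fs_name, environ=None):
    """Return the path to use for fs_name, honouring an environment override.

    The override variable name (<FS_NAME>_PATH) is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    override = environ.get(fs_name.upper() + "_PATH")
    if override:
        return override
    return DEFAULT_MOUNTPOINTS[fs_name]
```

Passing the environment as a parameter keeps the lookup easy to exercise against a fake tree, for example `find_mountpoint("proc", {"PROC_PATH": "/tmp/fakeproc"})`.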
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2014-06-03 20:36 Jiri Olsa
@ 2014-06-05  8:45 ` Ingo Molnar
  0 siblings, 0 replies; 42+ messages in thread
From: Ingo Molnar @ 2014-06-05 8:45 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Adrian Hunter, Arnaldo Carvalho de Melo,
      Arnaldo Carvalho de Melo, Cody P Schafer, Corey Ashford,
      David Ahern, David Binderman, Dianfang Zhang, Don Zickus,
      Frederic Weisbecker, Jean Delvare, Jianyu Zhan, Mike Galbraith,
      Namhyung Kim, Paul Mackerras, Peter Zijlstra, Peter Zijlstra,
      Sebastian Andrzej Siewior, Stephane Eranian, Sukadev Bhattiprolu


* Jiri Olsa <jolsa@kernel.org> wrote:

> hi Ingo,
> please consider pulling
>
> thanks,
> jirka
>
>
> The following changes since commit 9b261365dd73a5014b49033327ad881708e81f33:
>
>   Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-06-03 20:22:40 +0200)
>
> are available in the git repository at:
>
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo
>
> for you to fetch changes up to fc9cabeabf42d76854059e7bce81a02645e7e5ca:
>
>   perf tools: Fix 'make help' message error (2014-06-03 21:35:12 +0200)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
>   . Warn the user when trace command is not available (Arnaldo Carvalho de Melo)
>
>   . Add warning when disabling perl scripting support due to missing devel files (Arnaldo Carvalho de Melo)
>
>   . Consider header files outside perf directory in tags target (Sebastian Andrzej Siewior)
>
>   . Allow overriding sysfs and proc finding with env var (Cody P Schafer)
>
>   . Fix "==" into "=" in ui_browser__warning assignment (zhangdianfang)
>
>   . Factor elide bool handling in sort code (Jiri Olsa)
>
>   . Fix poll return value propagation (Jiri Olsa)
>
>   . Fix 'make help' message error (Jianyu Zhan)
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
>
> ----------------------------------------------------------------
> Arnaldo Carvalho de Melo (2):
>       perf trace: Warn the user when not available
>       perf tools: Add warning when disabling perl scripting support due to missing devel files
>
> Cody P Schafer (1):
>       perf tools: Allow overriding sysfs and proc finding with env var
>
> Jianyu Zhan (1):
>       perf tools: Fix 'make help' message error
>
> Jiri Olsa (3):
>       perf tools: Remove elide setup for SORT_MODE__MEMORY mode
>       perf tools: Move elide bool into perf_hpp_fmt struct
>       perf record: Fix poll return value propagation
>
> Sebastian Andrzej Siewior (1):
>       perf tools: Consider header files outside perf directory in tags target
>
> zhangdianfang (1):
>       perf tools: Fix "==" into "=" in ui_browser__warning assignment
>
>  tools/lib/api/fs/fs.c          |  43 ++++++++++++++++-
>  tools/perf/Makefile.perf       |  13 ++++--
>  tools/perf/builtin-record.c    |   6 ++-
>  tools/perf/config/Makefile     |   3 +-
>  tools/perf/perf.c              |   8 +++-
>  tools/perf/ui/browser.c        |   2 +-
>  tools/perf/ui/browsers/hists.c |   8 ++--
>  tools/perf/util/hist.h         |   8 +++-
>  tools/perf/util/sort.c         | 103 ++++++++++++++++++++++-------------------
>  tools/perf/util/sort.h         |   2 +-
>  10 files changed, 132 insertions(+), 64 deletions(-)

Pulled, thanks a lot Jiri!

        Ingo

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2014-05-28 13:20 Jiri Olsa
  0 siblings, 0 replies; 42+ messages in thread
From: Jiri Olsa @ 2014-05-28 13:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Adrian Hunter, Arnaldo Carvalho de Melo,
      Cody P Schafer, David Ahern, Don Zickus, Frederic Weisbecker,
      Javi Merino, Jiri Olsa, Mike Galbraith, Namhyung Kim,
      Paul Mackerras, Peter Zijlstra, Sebastian Andrzej Siewior,
      Stephane Eranian, Steven Rostedt, Sukadev Bhattiprolu

hi Ingo,
please consider pulling

thanks,
jirka


The following changes since commit e450f90e8c7d0bf70519223c1b848446ae63f313:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-05-22 11:37:40 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo

for you to fetch changes up to 34cfec19bd8496279d283498b97069d9a0f7e130:

  tools lib traceevent: Added support for __get_bitmask() macro (2014-05-28 15:08:26 +0200)

----------------------------------------------------------------
perf/core improvements and fixes:

  . Added support for __get_bitmask() macro to traceevent library (Steven Rostedt)

  . Allow overriding sysfs and proc finding with env var (Cody P Schafer)

  . Consider header files outside perf directory in tags target (Sebastian Andrzej Siewior)

  . Add warning when disabling perl scripting support due to missing devel files (Arnaldo Carvalho de Melo)

  . Warn the user when trace command is not available (Arnaldo Carvalho de Melo)

  . Pass protection and flags bits through mmap2 interface (Peter Zijlstra)

  . Update perf tool mmap2 interface with protection and flag bits (Don Zickus)

  . Re-enable mmap interface (Don Zickus)

  . Add mem-mode documentation to report command (Don Zickus)

Signed-off-by: Jiri Olsa <jolsa@kernel.org>

----------------------------------------------------------------
Arnaldo Carvalho de Melo (2):
      perf trace: Warn the user when not available
      perf tools: Add warning when disabling perl scripting support due to missing devel files

Cody P Schafer (1):
      perf tools: Allow overriding sysfs and proc finding with env var

Don Zickus (3):
      Revert "perf: Disable PERF_RECORD_MMAP2 support"
      perf tools: Update mmap2 interface with protection and flag bits
      perf report: Add mem-mode documentation to report command

Peter Zijlstra (1):
      perf: Pass protection and flags bits through mmap2 interface

Sebastian Andrzej Siewior (1):
      perf tools: Consider header files outside perf directory in tags target

Steven Rostedt (Red Hat) (1):
      tools lib traceevent: Added support for __get_bitmask() macro

 include/uapi/linux/perf_event.h                    |   1 +
 kernel/events/core.c                               |  37 ++++++-
 tools/lib/api/fs/fs.c                              |  43 +++++++-
 tools/lib/traceevent/event-parse.c                 | 113 +++++++++++++++++++++
 tools/lib/traceevent/event-parse.h                 |   7 ++
 tools/perf/Documentation/perf-report.txt           |  22 ++++
 tools/perf/Makefile.perf                           |   9 +-
 tools/perf/config/Makefile                         |   1 +
 tools/perf/perf.c                                  |   8 +-
 tools/perf/tests/dwarf-unwind.c                    |   2 +-
 tools/perf/util/event.c                            |  59 +++++++----
 tools/perf/util/event.h                            |   2 +
 tools/perf/util/evsel.c                            |   1 +
 tools/perf/util/machine.c                          |   4 +-
 tools/perf/util/map.c                              |   4 +-
 tools/perf/util/map.h                              |   4 +-
 .../perf/util/scripting-engines/trace-event-perl.c |   1 +
 .../util/scripting-engines/trace-event-python.c    |   1 +
 18 files changed, 286 insertions(+), 33 deletions(-)

^ permalink raw reply	[flat|nested] 42+ messages in thread
* [GIT PULL 0/9] perf/core improvements and fixes
@ 2012-10-26 14:31 Arnaldo Carvalho de Melo
  2012-10-26 14:54 ` Ingo Molnar
  0 siblings, 1 reply; 42+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-10-26 14:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Arnaldo Carvalho de Melo, Andrew Vagin,
      Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa,
      Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
      Steven Rostedt, arnaldo.melo, Arnaldo Carvalho de Melo

Hi Ingo,

        Please consider pulling,

- Arnaldo

The following changes since commit 8f7c1d07ade50dcdea7ec779b277e891f5c8292a:

  Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2012-10-26 10:30:49 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo

for you to fetch changes up to 1f16c5754d3a4008c29f3bf67b4f1271313ba385:

  perf stat: Add --pre and --post command (2012-10-26 11:22:25 -0200)

----------------------------------------------------------------
perf/core improvements:

 . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.

 . Makefile improvements from Namhyung Kim.

 . Add --pre and --post command hooks in 'stat', from Peter Zijlstra.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

----------------------------------------------------------------
Andrew Vagin (3):
      perf inject: Work with files
      perf inject: Merge sched_stat_* and sched_switch events
      perf inject: Mark a dso if it's used

Namhyung Kim (5):
      tools lib traceevent: Do not generate dependency for system header files
      perf tools: Cleanup doc related targets
      perf tools: Convert invocation of MAKE into SUBDIR
      perf tools: Always show CHK message when doing try-cc
      perf tools: Fix LIBELF_MMAP checking

Peter Zijlstra (1):
      perf stat: Add --pre and --post command

 tools/lib/traceevent/Makefile            |   2 +-
 tools/perf/Documentation/perf-inject.txt |  11 ++
 tools/perf/Documentation/perf-stat.txt   |   5 +
 tools/perf/Makefile                      |  51 ++------
 tools/perf/builtin-inject.c              | 189 ++++++++++++++++++++++++++++--
 tools/perf/builtin-stat.c                |  42 ++++++-
 tools/perf/config/utilities.mak          |   3 +-
 tools/perf/util/build-id.c               |  10 +-
 tools/perf/util/build-id.h               |   4 +
 9 files changed, 256 insertions(+), 61 deletions(-)

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 14:31 Arnaldo Carvalho de Melo
@ 2012-10-26 14:54 ` Ingo Molnar
  2012-10-26 15:06   ` David Ahern
  2012-10-26 17:05   ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 42+ messages in thread
From: Ingo Molnar @ 2012-10-26 14:54 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Andrew Vagin, Borislav Petkov, David Howells,
      Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras,
      Peter Zijlstra, Stephane Eranian, Steven Rostedt, arnaldo.melo,
      Arnaldo Carvalho de Melo


* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:

> Hi Ingo,
>
>         Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 8f7c1d07ade50dcdea7ec779b277e891f5c8292a:
>
>   Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2012-10-26 10:30:49 +0200)
>
> are available in the git repository at:
>
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo
>
> for you to fetch changes up to 1f16c5754d3a4008c29f3bf67b4f1271313ba385:
>
>   perf stat: Add --pre and --post command (2012-10-26 11:22:25 -0200)
>
> ----------------------------------------------------------------
> perf/core improvements:
>
>  . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
>
>  . Makefile improvements from Namhyung Kim.

These are really useful: there used to be a couple of seconds of
wait time at the beginning of every perf build - these are now
nicely explained with the various CHK entries.

>
>  . Add --pre and --post command hooks in 'stat', from Peter Zijlstra.
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Andrew Vagin (3):
>       perf inject: Work with files
>       perf inject: Merge sched_stat_* and sched_switch events
>       perf inject: Mark a dso if it's used
>
> Namhyung Kim (5):
>       tools lib traceevent: Do not generate dependency for system header files
>       perf tools: Cleanup doc related targets
>       perf tools: Convert invocation of MAKE into SUBDIR
>       perf tools: Always show CHK message when doing try-cc
>       perf tools: Fix LIBELF_MMAP checking
>
> Peter Zijlstra (1):
>       perf stat: Add --pre and --post command
>
>  tools/lib/traceevent/Makefile            |   2 +-
>  tools/perf/Documentation/perf-inject.txt |  11 ++
>  tools/perf/Documentation/perf-stat.txt   |   5 +
>  tools/perf/Makefile                      |  51 ++------
>  tools/perf/builtin-inject.c              | 189 ++++++++++++++++++++++++++++--
>  tools/perf/builtin-stat.c                |  42 ++++++-
>  tools/perf/config/utilities.mak          |   3 +-
>  tools/perf/util/build-id.c               |  10 +-
>  tools/perf/util/build-id.h               |   4 +
>  9 files changed, 256 insertions(+), 61 deletions(-)

Pulled, thanks Arnaldo!

        Ingo

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 14:54 ` Ingo Molnar
@ 2012-10-26 15:06   ` David Ahern
  2012-10-26 15:31     ` Namhyung Kim
  2012-10-26 17:05   ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 42+ messages in thread
From: David Ahern @ 2012-10-26 15:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Andrew Vagin,
      Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa,
      Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
      Steven Rostedt, arnaldo.melo, Arnaldo Carvalho de Melo

On 10/26/12 8:54 AM, Ingo Molnar wrote:
>> perf/core improvements:
>>
>>   . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
>>
>>   . Makefile improvements from Namhyung Kim.
>
> These are really useful: there used to be a couple of seconds of
> wait time at the beginning of every perf build - these are now
> nicely explained with the various CHK entries.

PERF-VERSION-GEN and specifically the git commands are the cause of more
delay than the config checks, especially when doing the build in a VM
with the kernel source on an NFS mount.

David

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 15:06   ` David Ahern
@ 2012-10-26 15:31     ` Namhyung Kim
  2012-10-26 15:34       ` Borislav Petkov
  2012-10-27 17:12       ` stephane eranian
  0 siblings, 2 replies; 42+ messages in thread
From: Namhyung Kim @ 2012-10-26 15:31 UTC (permalink / raw)
  To: David Ahern
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel, Andrew Vagin,
      Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa,
      Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt,
      arnaldo.melo, Arnaldo Carvalho de Melo

2012-10-26 (Fri), 09:06 -0600, David Ahern:
> On 10/26/12 8:54 AM, Ingo Molnar wrote:
> >> perf/core improvements:
> >>
> >>   . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
> >>
> >>   . Makefile improvements from Namhyung Kim.
> >
> > These are really useful: there used to be a couple of seconds of
> > wait time at the beginning of every perf build - these are now
> > nicely explained with the various CHK entries.

Kudos to Jiri who did the real work!

>
> PERF-VERSION-GEN and specifically the git commands are the cause of more
> delay than the config checks, especially when doing the build in a VM
> with the kernel source on an NFS mount.

And I see a strange delay when compiling builtin-sched.o.  After
building perf tools, I deleted builtin-{sched,test,script}.o to rebuild
only those, since they are the largest ones.

  namhyung@leonhard:perf$ ls -lS *.c | head -3
  -rw-r--r-- 1 namhyung namhyung 45522 2012-10-27 00:20 builtin-sched.c
  -rw-r--r-- 1 namhyung namhyung 36372 2012-10-27 00:20 builtin-test.c
  -rw-r--r-- 1 namhyung namhyung 35555 2012-10-27 00:20 builtin-script.c

  namhyung@leonhard:perf$ rm builtin-{sched,test,script}.o


And then building each file with time command shows this:

  namhyung@leonhard:perf$ time make builtin-script.o &> /dev/null

  real	0m4.577s
  user	0m2.755s
  sys	0m1.655s

  namhyung@leonhard:perf$ time make builtin-test.o &> /dev/null

  real	0m4.486s
  user	0m2.707s
  sys	0m1.658s

  namhyung@leonhard:perf$ time make builtin-sched.o &> /dev/null

  real	0m16.936s
  user	0m15.157s
  sys	0m1.635s

You can see it easily when building perf without -j option.  But I have
no idea why it takes so long..

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 42+ messages in thread
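The measurement in the message above (delete a few objects, then time `make <object>` for each one) is easy to automate when hunting for a single slow compilation unit. The following is a rough Python sketch of that procedure, not something from the thread: `time_command` and `rank_targets` are hypothetical helper names, and it assumes a Unix-like system where the `resource` module is available and `make` is run from the perf source directory.

```python
import resource
import subprocess
import time


def time_command(argv):
    """Run argv with output discarded; return (real, user, sys) in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU times of waited-for children are cumulative, so take the difference.
    return (real,
            after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def rank_targets(targets, run=time_command):
    """Time 'make <target>' for each target; slowest first by wall-clock time."""
    timings = [(target, run(["make", target])) for target in targets]
    return sorted(timings, key=lambda item: item[1][0], reverse=True)
```

After removing the object files, a call such as `rank_targets(["builtin-sched.o", "builtin-test.o", "builtin-script.o"])` would list the targets in the order a reader would want to investigate them. A large `user` value with a small `sys` value, as in the numbers quoted above, points at time spent in the compiler itself rather than in I/O.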
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 15:31     ` Namhyung Kim
@ 2012-10-26 15:34       ` Borislav Petkov
  2012-10-26 16:31         ` Arnaldo Carvalho de Melo
  2012-10-27 17:12       ` stephane eranian
  1 sibling, 1 reply; 42+ messages in thread
From: Borislav Petkov @ 2012-10-26 15:34 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: David Ahern, Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel,
      Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker,
      Jiri Olsa, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
      Steven Rostedt, arnaldo.melo, Arnaldo Carvalho de Melo

On Sat, Oct 27, 2012 at 12:31:42AM +0900, Namhyung Kim wrote:
> And I see a strange delay when compiling builtin-sched.o.  After
> building perf tools, I deleted builtin-{sched,test,script}.o to rebuild
> only those, since they are the largest ones.
>
>   namhyung@leonhard:perf$ ls -lS *.c | head -3
>   -rw-r--r-- 1 namhyung namhyung 45522 2012-10-27 00:20 builtin-sched.c
>   -rw-r--r-- 1 namhyung namhyung 36372 2012-10-27 00:20 builtin-test.c
>   -rw-r--r-- 1 namhyung namhyung 35555 2012-10-27 00:20 builtin-script.c
>
>   namhyung@leonhard:perf$ rm builtin-{sched,test,script}.o
>
>
> And then building each file with time command shows this:
>
>   namhyung@leonhard:perf$ time make builtin-script.o &> /dev/null
>
>   real	0m4.577s
>   user	0m2.755s
>   sys	0m1.655s
>
>   namhyung@leonhard:perf$ time make builtin-test.o &> /dev/null
>
>   real	0m4.486s
>   user	0m2.707s
>   sys	0m1.658s
>
>   namhyung@leonhard:perf$ time make builtin-sched.o &> /dev/null
>
>   real	0m16.936s
>   user	0m15.157s
>   sys	0m1.635s
>
> You can see it easily when building perf without -j option.  But I have
> no idea why it takes so long..

Well, you can trace that workload with perf itself, no, and see the
hotspots.

:-)

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 15:34       ` Borislav Petkov
@ 2012-10-26 16:31         ` Arnaldo Carvalho de Melo
  2012-10-26 17:20           ` Borislav Petkov
  0 siblings, 1 reply; 42+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-10-26 16:31 UTC (permalink / raw)
  To: Borislav Petkov, Namhyung Kim, David Ahern, Ingo Molnar,
      linux-kernel, Andrew Vagin, Borislav Petkov, David Howells,
      Frederic Weisbecker, Jiri Olsa, Paul Mackerras, Peter Zijlstra,
      Stephane Eranian, Steven Rostedt

On Fri, Oct 26, 2012 at 05:34:32PM +0200, Borislav Petkov wrote:
> On Sat, Oct 27, 2012 at 12:31:42AM +0900, Namhyung Kim wrote:
> > You can see it easily when building perf without -j option.  But I have
> > no idea why it takes so long..

> Well, you can trace that workload with perf itself, no, and see the
> hotspots.

Right, perf'ing perf is a favourite pastime, right?

- Arnaldo

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 16:31         ` Arnaldo Carvalho de Melo
@ 2012-10-26 17:20           ` Borislav Petkov
  2012-10-27  9:16             ` Namhyung Kim
  0 siblings, 1 reply; 42+ messages in thread
From: Borislav Petkov @ 2012-10-26 17:20 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, David Ahern, Ingo Molnar, linux-kernel, Andrew Vagin,
      Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa,
      Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt

On Fri, Oct 26, 2012 at 09:31:15AM -0700, Arnaldo Carvalho de Melo wrote:
> Right, perf'ing perf is a favourite pastime, right?

Sure, can I get "perfing perf" on a T-shirt please?

-- 
Regards/Gruss,
Boris.

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 17:20           ` Borislav Petkov
@ 2012-10-27  9:16             ` Namhyung Kim
  2012-10-27 14:29               ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 42+ messages in thread
From: Namhyung Kim @ 2012-10-27 9:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Arnaldo Carvalho de Melo, David Ahern, Ingo Molnar, linux-kernel,
      Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker,
      Jiri Olsa, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
      Steven Rostedt

2012-10-26 (Fri), 19:20 +0200, Borislav Petkov:
> On Fri, Oct 26, 2012 at 09:31:15AM -0700, Arnaldo Carvalho de Melo wrote:
> > Right, perf'ing perf is a favourite pastime, right?
>
> Sure, can I get "perfing perf" on a T-shirt please?

Well, guys, this is not perfing perf.  It's about perfing make and/or
gcc.  Anyway I'd also like to get a "perfing perf" T-shirt. ;)

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-27  9:16             ` Namhyung Kim
@ 2012-10-27 14:29               ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 42+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-10-27 14:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Borislav Petkov, David Ahern, Ingo Molnar, linux-kernel,
      Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker,
      Jiri Olsa, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
      Steven Rostedt

On Sat, Oct 27, 2012 at 06:16:31PM +0900, Namhyung Kim wrote:
> 2012-10-26 (Fri), 19:20 +0200, Borislav Petkov:
> > On Fri, Oct 26, 2012 at 09:31:15AM -0700, Arnaldo Carvalho de Melo wrote:
> > > Right, perf'ing perf is a favourite pastime, right?
> >
> > Sure, can I get "perfing perf" on a T-shirt please?
>
> Well, guys, this is not perfing perf.  It's about perfing make and/or
> gcc.  Anyway I'd also like to get a "perfing perf" T-shirt. ;)

Well, building perf faster will allow us to perf perf faster. ;-)

- Arnaldo

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes
  2012-10-26 15:31     ` Namhyung Kim
  2012-10-26 15:34       ` Borislav Petkov
@ 2012-10-27 17:12       ` stephane eranian
  1 sibling, 0 replies; 42+ messages in thread
From: stephane eranian @ 2012-10-27 17:12 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: David Ahern, Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel,
      Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker,
      Jiri Olsa, Paul Mackerras, Peter Zijlstra, Steven Rostedt,
      arnaldo.melo, Arnaldo Carvalho de Melo

On Fri, Oct 26, 2012 at 5:31 PM, Namhyung Kim <namhyung@kernel.org> wrote:
> 2012-10-26 (Fri), 09:06 -0600, David Ahern:
>> On 10/26/12 8:54 AM, Ingo Molnar wrote:
>> >> perf/core improvements:
>> >>
>> >>   . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
>> >>
>> >>   . Makefile improvements from Namhyung Kim.
>> >
>> > These are really useful: there used to be a couple of seconds of
>> > wait time at the beginning of every perf build - these are now
>> > nicely explained with the various CHK entries.
>
> Kudos to Jiri who did the real work!
>
>>
>> PERF-VERSION-GEN and specifically the git commands are the cause of more
>> delay than the config checks, especially when doing the build in a VM
>> with the kernel source on an NFS mount.
>
> And I see a strange delay when compiling builtin-sched.o.  After
> building perf tools, I deleted builtin-{sched,test,script}.o to rebuild
> only those, since they are the largest ones.
>
Yes, I see that delay when compiling builtin-sched.c on my IVB system.
Don't know why it takes a significant number of seconds to compile
this file. It did not use to be like that a few revisions back. It takes
about 8 seconds on my OC'd IVB (> 4GHz). I don't see much code
in that file.

>   namhyung@leonhard:perf$ ls -lS *.c | head -3
>   -rw-r--r-- 1 namhyung namhyung 45522 2012-10-27 00:20 builtin-sched.c
>   -rw-r--r-- 1 namhyung namhyung 36372 2012-10-27 00:20 builtin-test.c
>   -rw-r--r-- 1 namhyung namhyung 35555 2012-10-27 00:20 builtin-script.c
>
>   namhyung@leonhard:perf$ rm builtin-{sched,test,script}.o
>
>
> And then building each file with time command shows this:
>
>   namhyung@leonhard:perf$ time make builtin-script.o &> /dev/null
>
>   real  0m4.577s
>   user  0m2.755s
>   sys   0m1.655s
>
>   namhyung@leonhard:perf$ time make builtin-test.o &> /dev/null
>
>   real  0m4.486s
>   user  0m2.707s
>   sys   0m1.658s
>
>   namhyung@leonhard:perf$ time make builtin-sched.o &> /dev/null
>
>   real  0m16.936s
>   user  0m15.157s
>   sys   0m1.635s
>
> You can see it easily when building perf without -j option.  But I have
> no idea why it takes so long..
>
> Thanks,
> Namhyung
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-10-26 14:54 ` Ingo Molnar 2012-10-26 15:06 ` David Ahern @ 2012-10-26 17:05 ` Arnaldo Carvalho de Melo 2012-10-27 13:19 ` Ingo Molnar 1 sibling, 1 reply; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-10-26 17:05 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt On Fri, Oct 26, 2012 at 04:54:51PM +0200, Ingo Molnar wrote: > * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > > . Makefile improvements from Namhyung Kim. > > These are really useful: there used to be a couple of seconds of > wait time at the beginning of every perf build - these are now > nicely explained with the various CHK entries. The optimal way, I guess, would be to have some cache file with the results of such feature tests, that would be created and then used till the build fails using its findings, which would trigger a new feature check round, followed by an automatic rebuild. That would be tricky because we would have to have an automated way of discovering if the build failed due to missing packages or if it failed due to some ordinary coding mistake. - Arnaldo ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-10-26 17:05 ` Arnaldo Carvalho de Melo @ 2012-10-27 13:19 ` Ingo Molnar 2012-10-30 8:18 ` Ingo Molnar 0 siblings, 1 reply; 42+ messages in thread From: Ingo Molnar @ 2012-10-27 13:19 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > On Fri, Oct 26, 2012 at 04:54:51PM +0200, Ingo Molnar wrote: > > * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > > > . Makefile improvements from Namhyung Kim. > > > > These are really useful: there used to be a couple of > > seconds of wait time at the beginning of every perf build - > > these are now nicely explained with the various CHK entries. > > The optimal way, I guess, would be to have some cache file > with the results of such feature tests, that would be created > and then used till the build fails using its findings, which > would trigger a new feature check round, followed by an > automatic rebuild. > > That would be tricky because we would have to have an > automated way of discovering if the build failed due to > missing packages or if it failed due to some ordinary coding > mistake. The feature tests aren't a big problem right now - but making it *visible* is really useful. It also tells us which feature test fails, etc. Thanks, Ingo ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-10-27 13:19 ` Ingo Molnar @ 2012-10-30 8:18 ` Ingo Molnar 2012-10-30 8:21 ` Peter Zijlstra 0 siblings, 1 reply; 42+ messages in thread From: Ingo Molnar @ 2012-10-30 8:18 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian, Steven Rostedt * Ingo Molnar <mingo@kernel.org> wrote: > > * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > > > On Fri, Oct 26, 2012 at 04:54:51PM +0200, Ingo Molnar wrote: > > > * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > > > > . Makefile improvements from Namhyung Kim. > > > > > > These are really useful: there used to be a couple of > > > seconds of wait time at the beginning of every perf build - > > > these are now nicely explained with the various CHK entries. > > > > The optimal way, I guess, would be to have some cache file > > with the results of such feature tests, that would be created > > and then used till the build fails using its findings, which > > would trigger a new feature check round, followed by an > > automatic rebuild. > > > > That would be tricky because we would have to have an > > automated way of discovering if the build failed due to > > missing packages or if it failed due to some ordinary coding > > mistake. > > The feature tests aren't a big problem right now - but making > it *visible* is really useful. It also tells us which feature > test fails, etc. Btw., there's another thing that would be nice in addition to simplifying the PERF-VERSION-GEN script: to be able to run the CHK tests in parallel, like the object file rules. Right now the CHK tests are serialized and they take several seconds to build and run. A parallel make rule would reduce that to about a second I think. Thanks, Ingo ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-10-30 8:18 ` Ingo Molnar @ 2012-10-30 8:21 ` Peter Zijlstra 2012-10-30 9:14 ` Ingo Molnar 0 siblings, 1 reply; 42+ messages in thread From: Peter Zijlstra @ 2012-10-30 8:21 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, linux-kernel, Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras, Stephane Eranian, Steven Rostedt On Tue, 2012-10-30 at 09:18 +0100, Ingo Molnar wrote: > > > The optimal way, I guess, would be to have some cache file > > > with the results of such feature tests, that would be created > > > and then used till the build fails using its findings, which > > > would trigger a new feature check round, followed by an > > > automatic rebuild. autoconf!! ;-) /me runs ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-10-30 8:21 ` Peter Zijlstra @ 2012-10-30 9:14 ` Ingo Molnar 0 siblings, 0 replies; 42+ messages in thread From: Ingo Molnar @ 2012-10-30 9:14 UTC (permalink / raw) To: Peter Zijlstra Cc: Arnaldo Carvalho de Melo, linux-kernel, Andrew Vagin, Borislav Petkov, David Howells, Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras, Stephane Eranian, Steven Rostedt * Peter Zijlstra <a.p.zijlstra@chello.nl> wrote: > On Tue, 2012-10-30 at 09:18 +0100, Ingo Molnar wrote: > > > > The optimal way, I guess, would be to have some cache file > > > > with the results of such feature tests, that would be created > > > > and then used till the build fails using its findings, which > > > > would trigger a new feature check round, followed by an > > > > automatic rebuild. I did not write that. I think making the feature tests parallel would be enough to speed it all up - caching brings in a new set of problems. The tests are mostly independent and the feature test makefile rules could be parallelized like the object file rules. > autoconf!! ;-) > > /me runs hey, we build perf much faster than autoconf's 'configure' script finishes running ;-) Thanks, Ingo ^ permalink raw reply [flat|nested] 42+ messages in thread
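The idea discussed above, running the mostly independent feature tests concurrently instead of one after another, can be sketched in C. This is only an illustration under stated assumptions: `run_checks_parallel` and the shell commands passed to it are hypothetical and are not taken from the perf Makefile, where the same effect would more likely come from parallel make rules.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the perf tree): start every shell command
 * in cmds[] at once, then collect the results. ok[i] is set to 1 if
 * cmds[i] exited with status 0, else 0. Returns the number of commands
 * that succeeded, or -1 if memory for the bookkeeping cannot be allocated.
 */
int run_checks_parallel(const char *const cmds[], int n, int ok[])
{
	pid_t *pids;
	int passed = 0;
	int i;

	if (n <= 0)
		return 0;

	pids = calloc((size_t)n, sizeof(*pids));
	if (pids == NULL)
		return -1;

	/* Phase 1: launch all checks without waiting for any of them. */
	for (i = 0; i < n; i++) {
		pids[i] = fork();
		if (pids[i] == 0) {
			execl("/bin/sh", "sh", "-c", cmds[i], (char *)NULL);
			_exit(127);	/* only reached if exec itself failed */
		}
		/* pids[i] < 0 means fork() failed; counted as a failed check below. */
	}

	/* Phase 2: reap each child and record whether it exited with 0. */
	for (i = 0; i < n; i++) {
		int status = 0;

		ok[i] = 0;
		if (pids[i] > 0 && waitpid(pids[i], &status, 0) == pids[i])
			ok[i] = WIFEXITED(status) && WEXITSTATUS(status) == 0;
		passed += ok[i];
	}

	free(pids);
	return passed;
}
```

If each command compiles a small probe program, the wall-clock cost approaches that of the slowest probe rather than the sum of all of them, which is the saving the thread is after.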
* [GIT PULL 0/9] perf/core improvements and fixes @ 2012-09-08 20:36 Arnaldo Carvalho de Melo 2012-09-09 8:40 ` Ingo Molnar 0 siblings, 1 reply; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-09-08 20:36 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern, Irina Tirdea, Irina Tirdea, Namhyung Kim, Namhyung Kim, Paul Mackerras, Pekka Enberg, Peter Zijlstra, Srikar Dronamraju, Steven Rostedt From: Arnaldo Carvalho de Melo <acme@redhat.com> Hi Ingo, Please consider pulling, Thanks, - Arnaldo The following changes since commit ef34eb4da3eb62a1511592adf7c76d74faca0b14: Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2012-09-08 13:26:02 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo for you to fetch changes up to 6c7f631261064762a8ba1ee34fc2b76d117ef3fa: perf symbols: Remove BIONIC wrapper around libgen.h (2012-09-08 17:15:16 -0300) ---------------------------------------------------------------- perf/core improvements and fixes . Don't pass const char pointers to basename, so that we can unconditionally use libgen.h and thus avoid ifdef BIONIC lines, from David Ahern . Fix assert/BUG_ON when NDEBUG is defined, from Irina Tirdea. . 
Refactor hist formatting so that it can be reused with the GTK browser, From Namhyung Kim Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> ---------------------------------------------------------------- David Ahern (3): perf annotate: Make a copy of filename for passing to basename perf probe: Make a copy of exec path for passing to basename perf symbols: Remove BIONIC wrapper around libgen.h Irina Tirdea (1): perf bench: fix assert when NDEBUG is defined Namhyung Kim (5): perf hists: Introduce perf_hpp for hist period printing perf hists: Handle field separator properly perf hists: Use perf_hpp__format->width to calculate the column widths perf hists browser: Use perf_hpp__format functions perf gtk/browser: Use perf_hpp__format functions tools/perf/Makefile | 2 + tools/perf/bench/sched-pipe.c | 6 +- tools/perf/builtin-diff.c | 1 + tools/perf/ui/browsers/hists.c | 96 ++++++-- tools/perf/ui/gtk/browser.c | 101 +++++++-- tools/perf/ui/gtk/gtk.h | 1 + tools/perf/ui/gtk/setup.c | 1 + tools/perf/ui/hist.c | 389 ++++++++++++++++++++++++++++++++ tools/perf/ui/setup.c | 8 +- tools/perf/ui/stdio/hist.c | 239 ++++---------------- tools/perf/ui/tui/setup.c | 4 + tools/perf/util/annotate.c | 9 +- tools/perf/util/hist.c | 33 --- tools/perf/util/hist.h | 37 +++ tools/perf/util/include/linux/kernel.h | 4 + tools/perf/util/probe-event.c | 12 +- tools/perf/util/symbol.h | 2 - 17 files changed, 665 insertions(+), 280 deletions(-) create mode 100644 tools/perf/ui/hist.c ^ permalink raw reply [flat|nested] 42+ messages in thread
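The basename item in the summary above rests on the fact that POSIX permits basename() from libgen.h to modify the string it is given, so a const char pointer must not be passed to it directly. A minimal sketch of the copy-then-call pattern follows; `path_basename_dup` is a hypothetical helper for illustration and is not a function from the perf tree.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() under strict -std= modes */

#include <assert.h>
#include <libgen.h>	/* POSIX basename(); may modify its argument */
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the perf tree): return a newly allocated
 * string holding the last component of `path`, or NULL if allocation
 * fails. The caller frees the result.
 */
char *path_basename_dup(const char *path)
{
	char *copy;
	char *result;

	assert(path != NULL);

	copy = strdup(path);	/* writable scratch copy for basename() */
	if (copy == NULL)
		return NULL;

	/*
	 * basename() may return a pointer into `copy` or into static
	 * storage, so duplicate the result before freeing the scratch copy.
	 */
	result = strdup(basename(copy));
	free(copy);
	return result;
}
```

Working on a private copy leaves the caller's string untouched whichever basename() variant the C library provides, which is what allows libgen.h to be included unconditionally.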
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-09-08 20:36 Arnaldo Carvalho de Melo @ 2012-09-09 8:40 ` Ingo Molnar 0 siblings, 0 replies; 42+ messages in thread From: Ingo Molnar @ 2012-09-09 8:40 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern, Irina Tirdea, Irina Tirdea, Namhyung Kim, Namhyung Kim, Paul Mackerras, Pekka Enberg, Peter Zijlstra, Srikar Dronamraju, Steven Rostedt * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > From: Arnaldo Carvalho de Melo <acme@redhat.com> > > Hi Ingo, > > Please consider pulling, > > Thanks, > > - Arnaldo > > The following changes since commit ef34eb4da3eb62a1511592adf7c76d74faca0b14: > > Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2012-09-08 13:26:02 +0200) > > are available in the git repository at: > > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo > > for you to fetch changes up to 6c7f631261064762a8ba1ee34fc2b76d117ef3fa: > > perf symbols: Remove BIONIC wrapper around libgen.h (2012-09-08 17:15:16 -0300) > > ---------------------------------------------------------------- > perf/core improvements and fixes > > . Don't pass const char pointers to basename, so that we can unconditionally > use libgen.h and thus avoid ifdef BIONIC lines, from David Ahern > > . Fix assert/BUG_ON when NDEBUG is defined, from Irina Tirdea. > > . 
Refactor hist formatting so that it can be reused with the GTK browser, > From Namhyung Kim > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > ---------------------------------------------------------------- > David Ahern (3): > perf annotate: Make a copy of filename for passing to basename > perf probe: Make a copy of exec path for passing to basename > perf symbols: Remove BIONIC wrapper around libgen.h > > Irina Tirdea (1): > perf bench: fix assert when NDEBUG is defined > > Namhyung Kim (5): > perf hists: Introduce perf_hpp for hist period printing > perf hists: Handle field separator properly > perf hists: Use perf_hpp__format->width to calculate the column widths > perf hists browser: Use perf_hpp__format functions > perf gtk/browser: Use perf_hpp__format functions > > tools/perf/Makefile | 2 + > tools/perf/bench/sched-pipe.c | 6 +- > tools/perf/builtin-diff.c | 1 + > tools/perf/ui/browsers/hists.c | 96 ++++++-- > tools/perf/ui/gtk/browser.c | 101 +++++++-- > tools/perf/ui/gtk/gtk.h | 1 + > tools/perf/ui/gtk/setup.c | 1 + > tools/perf/ui/hist.c | 389 ++++++++++++++++++++++++++++++++ > tools/perf/ui/setup.c | 8 +- > tools/perf/ui/stdio/hist.c | 239 ++++---------------- > tools/perf/ui/tui/setup.c | 4 + > tools/perf/util/annotate.c | 9 +- > tools/perf/util/hist.c | 33 --- > tools/perf/util/hist.h | 37 +++ > tools/perf/util/include/linux/kernel.h | 4 + > tools/perf/util/probe-event.c | 12 +- > tools/perf/util/symbol.h | 2 - > 17 files changed, 665 insertions(+), 280 deletions(-) > create mode 100644 tools/perf/ui/hist.c Pulled, thanks Arnaldo! Ingo ^ permalink raw reply [flat|nested] 42+ messages in thread
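The assert/BUG_ON item quoted above concerns a general C pitfall: when NDEBUG is defined, assert() expands to nothing, so any call placed inside it never runs. The sketch below illustrates that pitfall and one way around it; it is not the actual patch, and `CHECK_ALWAYS`, `open_pipe_fragile` and `open_pipe_checked` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical macro for illustration: always evaluates `expr` and aborts
 * if it is false, whether or not NDEBUG is defined.
 */
#define CHECK_ALWAYS(expr)						\
	do {								\
		if (!(expr)) {						\
			fprintf(stderr, "check failed: %s\n", #expr);	\
			abort();					\
		}							\
	} while (0)

/*
 * Fragile: with -DNDEBUG the assert() disappears, pipe() is never called
 * and fds[] is left untouched.
 */
int open_pipe_fragile(int fds[2])
{
	assert(pipe(fds) == 0);
	return 0;
}

/* Robust: the pipe() call is evaluated in every build configuration. */
int open_pipe_checked(int fds[2])
{
	CHECK_ALWAYS(pipe(fds) == 0);
	return 0;
}
```

The design point is to keep side effects out of anything that can be compiled away, and to reserve assert() for conditions that only need checking in debug builds.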
* [GIT PULL 0/9] perf/core improvements and fixes @ 2012-01-24 23:07 Arnaldo Carvalho de Melo 2012-01-26 11:16 ` Ingo Molnar 0 siblings, 1 reply; 42+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-01-24 23:07 UTC (permalink / raw) To: Ingo Molnar Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern, David Daney, Frederic Weisbecker, Jan Beulich, Joerg Roedel, Masami Hiramatsu, Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Srikar Dronamraju, Stephane Eranian, arnaldo.melo Hi Ingo, This is a signed tag, please lemme know if everything went well. The --uid feature works for root, we still need to sort out that paranoia with some threads owned by a user that prevents 'perf --uid non-root-user' to work for 'non-root-user'. http://git.kernel.org/?p=linux/kernel/git/acme/linux.git;a=tag;h=ce9600c4c664ce7f97e8aa5e756b0b4ea5b017c7 looks ok to me, need just to improve on the commit log message, I'll get used to it :-) - Arnaldo The following changes since commit 172d1b0b73256551f100fc00c69e356d047103f5: perf tools: Fix compile error on x86_64 Ubuntu (2012-01-08 13:34:55 -0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux perf-core-for-mingo for you to fetch changes up to f8f4b2872295dca88339ec0c403b2217b1197353: perf tools: Fix strlen() bug in perf_event__synthesize_event_type() (2012-01-24 20:31:34 -0200) ---------------------------------------------------------------- perf/core improvements and fixes ---------------------------------------------------------------- Arnaldo Carvalho de Melo (2): perf tools: Add fprintf methods for thread_map and cpu_map classes perf tools: Introduce per user view David Daney (1): perf tools: Fix broken build by defining _GNU_SOURCE in Makefile Jan Beulich (4): perf bench: Make "default" memcpy() selection actually use glibc's implementation perf bench: Also allow measuring alternative memcpy implementations perf bench: Also allow measuring memset() perf bench: 
Allow passing an iteration count to "bench mem mem{cpy,set}" Srikar Dronamraju (1): perf probe: Usability fixes Stephane Eranian (1): perf tools: Fix strlen() bug in perf_event__synthesize_event_type() tools/perf/Documentation/perf-record.txt | 4 + tools/perf/Documentation/perf-top.txt | 4 + tools/perf/Makefile | 11 +- tools/perf/bench/bench.h | 1 + tools/perf/bench/mem-memcpy-x86-64-asm-def.h | 8 + tools/perf/bench/mem-memcpy-x86-64-asm.S | 6 +- tools/perf/bench/mem-memcpy.c | 11 +- tools/perf/bench/mem-memset-arch.h | 12 + tools/perf/bench/mem-memset-x86-64-asm-def.h | 12 + tools/perf/bench/mem-memset-x86-64-asm.S | 6 + tools/perf/bench/mem-memset.c | 298 ++++++++++++++++++++++++++ tools/perf/builtin-bench.c | 3 + tools/perf/builtin-probe.c | 2 - tools/perf/builtin-record.c | 12 +- tools/perf/builtin-stat.c | 2 +- tools/perf/builtin-test.c | 8 +- tools/perf/builtin-top.c | 22 ++- tools/perf/perf.h | 1 + tools/perf/util/cpumap.c | 11 + tools/perf/util/cpumap.h | 4 + tools/perf/util/evlist.c | 6 +- tools/perf/util/evlist.h | 2 +- tools/perf/util/header.c | 2 +- tools/perf/util/hist.h | 1 + tools/perf/util/include/asm/dwarf2.h | 4 +- tools/perf/util/probe-event.c | 8 +- tools/perf/util/python.c | 10 +- tools/perf/util/symbol.c | 1 - tools/perf/util/thread_map.c | 109 +++++++++- tools/perf/util/thread_map.h | 7 +- tools/perf/util/top.c | 3 + tools/perf/util/top.h | 2 + tools/perf/util/trace-event-parse.c | 3 +- tools/perf/util/ui/browsers/hists.c | 5 +- tools/perf/util/ui/helpline.c | 1 - tools/perf/util/usage.c | 39 ++++ tools/perf/util/util.h | 3 +- 37 files changed, 597 insertions(+), 47 deletions(-) create mode 100644 tools/perf/bench/mem-memset-arch.h create mode 100644 tools/perf/bench/mem-memset-x86-64-asm-def.h create mode 100644 tools/perf/bench/mem-memset-x86-64-asm.S create mode 100644 tools/perf/bench/mem-memset.c ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: [GIT PULL 0/9] perf/core improvements and fixes 2012-01-24 23:07 Arnaldo Carvalho de Melo @ 2012-01-26 11:16 ` Ingo Molnar 0 siblings, 0 replies; 42+ messages in thread From: Ingo Molnar @ 2012-01-26 11:16 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: linux-kernel, David Ahern, David Daney, Frederic Weisbecker, Jan Beulich, Joerg Roedel, Masami Hiramatsu, Mike Galbraith, Namhyung Kim, Paul Mackerras, Peter Zijlstra, Srikar Dronamraju, Stephane Eranian, arnaldo.melo * Arnaldo Carvalho de Melo <acme@infradead.org> wrote: > Hi Ingo, > > This is a signed tag, please lemme know if everything went well. > > The --uid feature works for root, we still need to > sort out that paranoia with some threads owned by a user that > prevents 'perf --uid non-root-user' to work for > 'non-root-user'. Just wondering what detail causes that failure - the whole point of --uid mingo would be to enable nonprivileged users to do 'session wide' profiling, *especially* if paranoia is high. So what does --uid do which perf record --pid 1234 wouldn't already do? By all means --uid ought to be a fancy way of doing a whole bunch of perf record --pid 1234 profiling sessions, at once. [ Btw, we should probably alias --user to --uid as well, as that might be the intuitive thing people would typically use? 
] > http://git.kernel.org/?p=linux/kernel/git/acme/linux.git;a=tag;h=ce9600c4c664ce7f97e8aa5e756b0b4ea5b017c7 > looks ok to me, need just to improve on the commit log message, I'll get > used to it :-) > > - Arnaldo > > The following changes since commit 172d1b0b73256551f100fc00c69e356d047103f5: > > perf tools: Fix compile error on x86_64 Ubuntu (2012-01-08 13:34:55 -0200) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux perf-core-for-mingo > > for you to fetch changes up to f8f4b2872295dca88339ec0c403b2217b1197353: > > perf tools: Fix strlen() bug in perf_event__synthesize_event_type() (2012-01-24 20:31:34 -0200) > > ---------------------------------------------------------------- > perf/core improvements and fixes > > ---------------------------------------------------------------- Anyway, pulled, thanks a lot Arnaldo! One detail: don't we want some of these fixes cherry-picked into perf/urgent as well? Thanks, Ingo ^ permalink raw reply [flat|nested] 42+ messages in thread