* Make perf script easier to use for itrace
@ 2018-09-18 12:32 Andi Kleen
2018-09-18 12:32 ` [PATCH v5 1/9] perf tools: Report itrace options in help Andi Kleen
` (8 more replies)
0 siblings, 9 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips
Implement a range of improvements to make it easier to look
at itrace traces with perf script. Nothing here couldn't be done
before with some additional scripting, but add simple high
level options to make it easier to use.
% perf record -e intel_pt//k -a sleep 1
Show function calls:
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
Show function calls and returns:
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
Show instruction traces (using XED):
% perf script --insn-trace --xed
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
Filter by a ftrace style graph function:
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-6
v1: Initial post
v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
v3: Merge wildcard+remove addr/sym patch for one logical review unit.
Drop typed remove patch
Fix build failure on ARM ETM builds.
Fix git branch number.
v4: Remove debug printout.
Really fix ETM build failure (Kim Phillips)
v5: Split out --xed into separate option to handle other architectures better,
and support brstackinsn too. Add total cycles printing for brstackinsn
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v5 1/9] perf tools: Report itrace options in help
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-19 18:13 ` Arnaldo Carvalho de Melo
2018-09-18 12:32 ` [PATCH v5 2/9] tools, pager: Support overwriting the pager Andi Kleen
` (7 subsequent siblings)
8 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
I often forget all the options that --itrace accepts. Instead of burying
them in the man page only report them in the normal command line help
too to make them easier accessible.
v2: Align
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-inject.c | 3 ++-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-script.c | 2 +-
tools/perf/util/auxtrace.h | 19 +++++++++++++++++++
4 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a3b346359ba0..b649a0511ed8 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -803,7 +803,8 @@ int cmd_inject(int argc, const char **argv)
"kallsyms pathname"),
OPT_BOOLEAN('f', "force", &data.force, "don't complain, do it"),
OPT_CALLBACK_OPTARG(0, "itrace", &inject.itrace_synth_opts,
- NULL, "opts", "Instruction Tracing options",
+ NULL, "opts", "Instruction Tracing options\n"
+ ITRACE_HELP,
itrace_parse_synth_opts),
OPT_BOOLEAN(0, "strip", &inject.strip,
"strip non-synthesized events (use with --itrace)"),
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 76e12bcd1765..191a389b45d1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1105,7 +1105,7 @@ int cmd_report(int argc, const char **argv)
OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
"how to display percentage of filtered entries", parse_filter_percentage),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
- "Instruction Tracing options",
+ "Instruction Tracing options\n" ITRACE_HELP,
itrace_parse_synth_opts),
OPT_BOOLEAN(0, "full-source-path", &srcline_full_filename,
"Show full source file name path for source lines"),
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6176bae177c2..49d84c85e846 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3195,7 +3195,7 @@ int cmd_script(int argc, const char **argv)
OPT_BOOLEAN(0, "ns", &nanosecs,
"Use 9 decimal places when displaying time"),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
- "Instruction Tracing options",
+ "Instruction Tracing options\n" ITRACE_HELP,
itrace_parse_synth_opts),
OPT_BOOLEAN(0, "full-source-path", &srcline_full_filename,
"Show full source file name path for source lines"),
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 71fc3bd74299..a516a891d7ea 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -577,6 +577,23 @@ static inline void auxtrace__free(struct perf_session *session)
return session->auxtrace->free(session);
}
+#define ITRACE_HELP \
+" i: synthesize instructions events\n" \
+" b: synthesize branches events\n" \
+" c: synthesize branches events (calls only)\n" \
+" r: synthesize branches events (returns only)\n" \
+" x: synthesize transactions events\n" \
+" w: synthesize ptwrite events\n" \
+" p: synthesize power events\n" \
+" e: synthesize error events\n" \
+" d: create a debug log\n" \
+" g[len]: synthesize a call chain (use with i or x)\n" \
+" l[len]: synthesize last branch entries (use with i or x)\n" \
+" sNUMBER: skip initial number of events\n" \
+" PERIOD[ns|us|ms|i|t]: specify period to sample stream\n" \
+" concatenate multiple options. Default is ibxwpe or cewp\n"
+
+
#else
static inline struct auxtrace_record *
@@ -717,6 +734,8 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
struct perf_evlist *evlist, int idx,
bool per_cpu);
+#define ITRACE_HELP ""
+
#endif
#endif
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 2/9] tools, pager: Support overwriting the pager
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
2018-09-18 12:32 ` [PATCH v5 1/9] perf tools: Report itrace options in help Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-19 18:17 ` Arnaldo Carvalho de Melo
2018-09-26 8:44 ` [tip:perf/core] tools lib subcmd: " tip-bot for Andi Kleen
2018-09-18 12:32 ` [PATCH v5 3/9] tools, perf, script: Add --insn-trace for instruction decoding Andi Kleen
` (6 subsequent siblings)
8 siblings, 2 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add an interface to the auto pager code that allows callers
to overwrite the pager.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/lib/subcmd/pager.c | 11 ++++++++++-
tools/lib/subcmd/pager.h | 1 +
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/tools/lib/subcmd/pager.c b/tools/lib/subcmd/pager.c
index 9997a8805a82..e3d47b59b14d 100644
--- a/tools/lib/subcmd/pager.c
+++ b/tools/lib/subcmd/pager.c
@@ -23,6 +23,13 @@ void pager_init(const char *pager_env)
subcmd_config.pager_env = pager_env;
}
+static const char *forced_pager;
+
+void force_pager(const char *pager)
+{
+ forced_pager = pager;
+}
+
static void pager_preexec(void)
{
/*
@@ -66,7 +73,9 @@ void setup_pager(void)
const char *pager = getenv(subcmd_config.pager_env);
struct winsize sz;
- if (!isatty(1))
+ if (forced_pager)
+ pager = forced_pager;
+ if (!isatty(1) && !forced_pager)
return;
if (ioctl(1, TIOCGWINSZ, &sz) == 0)
pager_columns = sz.ws_col;
diff --git a/tools/lib/subcmd/pager.h b/tools/lib/subcmd/pager.h
index f1a53cf29880..a818964693ab 100644
--- a/tools/lib/subcmd/pager.h
+++ b/tools/lib/subcmd/pager.h
@@ -7,5 +7,6 @@ extern void pager_init(const char *pager_env);
extern void setup_pager(void);
extern int pager_in_use(void);
extern int pager_get_columns(void);
+extern void force_pager(const char *);
#endif /* __SUBCMD_PAGER_H */
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 3/9] tools, perf, script: Add --insn-trace for instruction decoding
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
2018-09-18 12:32 ` [PATCH v5 1/9] perf tools: Report itrace options in help Andi Kleen
2018-09-18 12:32 ` [PATCH v5 2/9] tools, pager: Support overwriting the pager Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-18 12:32 ` [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr Andi Kleen
` (5 subsequent siblings)
8 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a --insn-trace short hand option for decoding and disassembling
instruction streams for intel_pt. This automatically pipes the
output into the xed disassembler to generate disassembled instructions.
This just makes this use model much nicer to use
Before
% perf record -e intel_pt// ...
% perf script --itrace=i0ns --ns -F +insn,-event,-period | xed -F insn: -A -64
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
swapper 0 [000] 117276.429606186: ffffffff8101063e pt_event_add ([kernel.kallsyms]) cmpl $0x1, 0x1b0(%rbx)
swapper 0 [000] 117276.429606186: ffffffff81010645 pt_event_add ([kernel.kallsyms]) mov $0xffffffea, %eax
swapper 0 [000] 117276.429606186: ffffffff8101064a pt_event_add ([kernel.kallsyms]) mov $0x0, %edx
swapper 0 [000] 117276.429606186: ffffffff8101064f pt_event_add ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010650 pt_event_add ([kernel.kallsyms]) cmovnz %edx, %eax
swapper 0 [000] 117276.429606186: ffffffff81010653 pt_event_add ([kernel.kallsyms]) jmp 0xffffffff81010635
swapper 0 [000] 117276.429606186: ffffffff81010635 pt_event_add ([kernel.kallsyms]) retq
swapper 0 [000] 117276.429606186: ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms]) test %eax, %eax
Now
% perf record -e intel_pt// ...
% perf script --insn-trace --xed
... same output ...
Signed-off-by: Andi Kleen <ak@linux.intel.com>
--
v2: Add separate --xed option
---
tools/perf/Documentation/perf-script.txt | 7 +++++++
tools/perf/builtin-script.c | 23 +++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index afdafe2110a1..00c655ab4968 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -383,6 +383,13 @@ include::itrace.txt[]
will be printed. Each entry has function name and file/line. Enabled by
default, disable with --no-inline.
+--insn-trace::
+ Show instruction stream for intel_pt traces. Combine with --xed to
+ show disassembly.
+
+--xed::
+ Run xed disassembler on output. Requires installing the xed disassembler.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 49d84c85e846..61d7ce8fe250 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -44,6 +44,7 @@
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
+#include <subcmd/pager.h>
#include "sane_ctype.h"
@@ -3074,6 +3075,24 @@ static int perf_script__process_auxtrace_info(struct perf_tool *tool,
#define perf_script__process_auxtrace_info 0
#endif
+static int parse_insn_trace(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ parse_output_fields(NULL, "+insn,-event,-period", 0);
+ itrace_parse_synth_opts(opt, "i0ns", 0);
+ nanosecs = true;
+ return 0;
+}
+
+static int parse_xed(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ force_pager("xed -F insn: -A -64 | less");
+ return 0;
+}
+
int cmd_script(int argc, const char **argv)
{
bool show_full_info = false;
@@ -3158,6 +3177,10 @@ int cmd_script(int argc, const char **argv)
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
"only consider these symbols"),
+ OPT_CALLBACK_OPTARG(0, "insn-trace", &itrace_synth_opts, NULL, NULL,
+ "Decode instructions from itrace", parse_insn_trace),
+ OPT_CALLBACK_OPTARG(0, "xed", NULL, NULL, NULL,
+ "Run xed disassembler on output", parse_xed),
OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
"Stop display of callgraph at these symbols"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
` (2 preceding siblings ...)
2018-09-18 12:32 ` [PATCH v5 3/9] tools, perf, script: Add --insn-trace for instruction decoding Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-19 18:24 ` Arnaldo Carvalho de Melo
2018-09-26 8:44 ` [tip:perf/core] perf " tip-bot for Andi Kleen
2018-09-18 12:32 ` [PATCH v5 5/9] perf, tools, script: Print DSO for callindent Andi Kleen
` (4 subsequent siblings)
8 siblings, 2 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Currently sym and dso require printing ip and addr because
the print function is tied to those outputs. With callindent
it makes sense to print the symbol or dso without numerical
IP or ADDR. So change the dependency check to only check the
underlying attribute.
Also the branch target output relies on the user_set flag to determine
if the branch target should be implicitely printed. When
modifying the fields with + or - also set user_set,
so that ADDR can be removed. We also need to set wildcard_set
to make the initial sanity check pass.
This allows to remove a lot of noise in callindent output by
dropping the numerical addresses, which are not all that useful.
Before
% perf script --itrace=cr -F +callindent
swapper 0 [000] 156546.354971: 1 branches: pt_config 0 [unknown] ([unknown]) => ffffffff81010486 pt_config ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: pt_config ffffffff81010499 pt_config ([kernel.kallsyms]) => ffffffff8101063e pt_event_add ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: pt_event_add ffffffff81010635 pt_event_add ([kernel.kallsyms]) => ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_enable ffffffff8115e726 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff811579b0 perf_pmu_enable ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_void ffffffff81151730 perf_pmu_nop_void ([kernel.kallsyms]) => ffffffff8115e72b event_sched_in.isra.107 ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: event_sched_in.isra.107 ffffffff8115e737 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff8115e7a5 group_sched_in ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: __x86_indirect_thunk_rax ffffffff8115e7f6 group_sched_in ([kernel.kallsyms]) => ffffffff81a03000 __x86_indirect_thunk_rax ([kernel.kallsyms])
After
% perf script --itrace=cr -F +callindent,-ip,-sym,-symoff
swapper 0 [000] 156546.354971: 1 branches: pt_config
swapper 0 [000] 156546.354971: 1 branches: pt_config
swapper 0 [000] 156546.354971: 1 branches: pt_event_add
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_enable
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_void
swapper 0 [000] 156546.354971: 1 branches: event_sched_in.isra.107
swapper 0 [000] 156546.354971: 1 branches: __x86_indirect_thunk_rax
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_int
swapper 0 [000] 156546.354971: 1 branches: group_sched_in
swapper 0 [000] 156546.354971: 1 branches: event_filter_match
swapper 0 [000] 156546.354971: 1 branches: event_filter_match
swapper 0 [000] 156546.354971: 1 branches: group_sched_in
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-script.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 61d7ce8fe250..1f9ed909ae3e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -407,9 +407,10 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
PERF_OUTPUT_WEIGHT))
return -EINVAL;
- if (PRINT_FIELD(SYM) && !PRINT_FIELD(IP) && !PRINT_FIELD(ADDR)) {
+ if (PRINT_FIELD(SYM) &&
+ !(evsel->attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
pr_err("Display of symbols requested but neither sample IP nor "
- "sample address\nis selected. Hence, no addresses to convert "
+ "sample address\navailable. Hence, no addresses to convert "
"to symbols.\n");
return -EINVAL;
}
@@ -418,10 +419,9 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
"selected.\n");
return -EINVAL;
}
- if (PRINT_FIELD(DSO) && !PRINT_FIELD(IP) && !PRINT_FIELD(ADDR) &&
- !PRINT_FIELD(BRSTACK) && !PRINT_FIELD(BRSTACKSYM) && !PRINT_FIELD(BRSTACKOFF)) {
- pr_err("Display of DSO requested but no address to convert. Select\n"
- "sample IP, sample address, brstack, brstacksym, or brstackoff.\n");
+ if (PRINT_FIELD(DSO) &&
+ !(evsel->attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
+ pr_err("Display of DSO requested but no address to convert.\n");
return -EINVAL;
}
if (PRINT_FIELD(SRCLINE) && !PRINT_FIELD(IP)) {
@@ -2492,6 +2492,8 @@ static int parse_output_fields(const struct option *opt __maybe_unused,
output[j].fields &= ~all_output_options[i].field;
else
output[j].fields |= all_output_options[i].field;
+ output[j].user_set = true;
+ output[j].wildcard_set = true;
}
}
} else {
@@ -2502,7 +2504,8 @@ static int parse_output_fields(const struct option *opt __maybe_unused,
rc = -EINVAL;
goto out;
}
- output[type].fields |= all_output_options[i].field;
+ output[type].user_set = true;
+ output[type].wildcard_set = true;
}
}
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 5/9] perf, tools, script: Print DSO for callindent
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
` (3 preceding siblings ...)
2018-09-18 12:32 ` [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-19 18:26 ` Arnaldo Carvalho de Melo
2018-09-26 8:45 ` [tip:perf/core] perf " tip-bot for Andi Kleen
2018-09-18 12:32 ` [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls Andi Kleen
` (3 subsequent siblings)
8 siblings, 2 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Now that we don't need to print the IP/ADDR for callindent the DSO
is also not printed. It's useful for some cases, so add an own DSO
printout for callindent for the case when IP/ADDR is not enabled.
Before:
% perf script --itrace=cr -F +callindent,-ip,-sym,-symoff,-addr
swapper 0 [000] 3377.917072: 1 branches: pt_config
swapper 0 [000] 3377.917072: 1 branches: pt_config
swapper 0 [000] 3377.917072: 1 branches: pt_event_add
swapper 0 [000] 3377.917072: 1 branches: perf_pmu_enable
swapper 0 [000] 3377.917072: 1 branches: perf_pmu_nop_void
swapper 0 [000] 3377.917072: 1 branches: event_sched_in.isra.107
swapper 0 [000] 3377.917072: 1 branches: __x86_indirect_thunk_rax
swapper 0 [000] 3377.917072: 1 branches: perf_pmu_nop_int
swapper 0 [000] 3377.917072: 1 branches: group_sched_in
swapper 0 [000] 3377.917072: 1 branches: event_filter_match
swapper 0 [000] 3377.917072: 1 branches: event_filter_match
swapper 0 [000] 3377.917072: 1 branches: group_sched_in
After:
swapper 0 [000] 3377.917072: 1 branches: ([unknown]) pt_config
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) pt_config
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) pt_event_add
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_enable
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_nop_void
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_sched_in.isra.107
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_nop_int
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) group_sched_in
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_filter_match
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_filter_match
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) group_sched_in
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_nop_txn
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_sched_in.isra.107
(in the kernel case of course it's not very useful, but it's important
with user programs where symbols are not unique)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-script.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 1f9ed909ae3e..57df6f7cff28 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1116,6 +1116,7 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
const char *name = NULL;
static int spacing;
int len = 0;
+ int dlen = 0;
u64 ip = 0;
/*
@@ -1142,6 +1143,12 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
ip = sample->ip;
}
+ if (PRINT_FIELD(DSO) && !(PRINT_FIELD(IP) || PRINT_FIELD(ADDR))) {
+ dlen += fprintf(fp, "(");
+ dlen += map__fprintf_dsoname(al->map, fp);
+ dlen += fprintf(fp, ")\t");
+ }
+
if (name)
len = fprintf(fp, "%*s%s", (int)depth * 4, "", name);
else if (ip)
@@ -1160,7 +1167,7 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
if (len < spacing)
len += fprintf(fp, "%*s", spacing - len, "");
- return len;
+ return len + dlen;
}
static int perf_sample__fprintf_insn(struct perf_sample *sample,
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
` (4 preceding siblings ...)
2018-09-18 12:32 ` [PATCH v5 5/9] perf, tools, script: Print DSO for callindent Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-18 13:24 ` Adrian Hunter
2018-09-18 12:32 ` [PATCH v5 7/9] tools, perf, script: Add --call-trace and --call-ret-trace Andi Kleen
` (2 subsequent siblings)
8 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
By default perf script for itrace outputs sampled instructions or
branches. In my experience this is confusing to users because it's
hard to correlate with real program behavior. The sampling makes
sense for tools like report that actually sample to reduce
the run time, but run time is normally not a problem for perf script.
It's better to give an accurate representation of the program flow.
Default perf script to output all calls for itrace. That's a much saner
default. The old behavior can be still requested with
perf script --itrace=ibxwpe100000
v2: Fix ETM build failure
v3: Really fix ETM build failure (Kim Phillips)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/Documentation/itrace.txt | 7 ++++---
tools/perf/builtin-script.c | 5 ++++-
tools/perf/util/auxtrace.c | 17 ++++++++++++-----
tools/perf/util/auxtrace.h | 5 ++++-
tools/perf/util/cs-etm.c | 3 ++-
tools/perf/util/intel-bts.c | 3 ++-
tools/perf/util/intel-pt.c | 3 ++-
7 files changed, 30 insertions(+), 13 deletions(-)
diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index a3abe04c779d..c2182cbabde3 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -11,10 +11,11 @@
l synthesize last branch entries (use with i or x)
s skip initial number of events
- The default is all events i.e. the same as --itrace=ibxwpe
+ The default is all events i.e. the same as --itrace=ibxwpe,
+ except for perf script where it is --itrace=ce
- In addition, the period (default 100000) for instructions events
- can be specified in units of:
+ In addition, the period (default 100000, except for perf script where it is 1)
+ for instructions events can be specified in units of:
i instructions
t ticks
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 57df6f7cff28..d3a463766e2d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3112,7 +3112,10 @@ int cmd_script(int argc, const char **argv)
char *rec_script_path = NULL;
char *rep_script_path = NULL;
struct perf_session *session;
- struct itrace_synth_opts itrace_synth_opts = { .set = false, };
+ struct itrace_synth_opts itrace_synth_opts = {
+ .set = false,
+ .default_no_sample = true,
+ };
char *script_path = NULL;
const char **__argv;
int i, j, err = 0;
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index db1511359c5e..2363a41f0e81 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -964,16 +964,23 @@ s64 perf_event__process_auxtrace(struct perf_tool *tool,
#define PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ 64
#define PERF_ITRACE_MAX_LAST_BRANCH_SZ 1024
-void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
+void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
+ bool no_sample)
{
- synth_opts->instructions = true;
synth_opts->branches = true;
synth_opts->transactions = true;
synth_opts->ptwrites = true;
synth_opts->pwr_events = true;
synth_opts->errors = true;
- synth_opts->period_type = PERF_ITRACE_DEFAULT_PERIOD_TYPE;
- synth_opts->period = PERF_ITRACE_DEFAULT_PERIOD;
+ if (no_sample) {
+ synth_opts->period_type = PERF_ITRACE_PERIOD_INSTRUCTIONS;
+ synth_opts->period = 1;
+ synth_opts->calls = true;
+ } else {
+ synth_opts->instructions = true;
+ synth_opts->period_type = PERF_ITRACE_DEFAULT_PERIOD_TYPE;
+ synth_opts->period = PERF_ITRACE_DEFAULT_PERIOD;
+ }
synth_opts->callchain_sz = PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
synth_opts->last_branch_sz = PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ;
synth_opts->initial_skip = 0;
@@ -1001,7 +1008,7 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
}
if (!str) {
- itrace_synth_opts__set_default(synth_opts);
+ itrace_synth_opts__set_default(synth_opts, false);
return 0;
}
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a516a891d7ea..ff611cc6fd75 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -56,6 +56,7 @@ enum itrace_period_type {
/**
* struct itrace_synth_opts - AUX area tracing synthesis options.
* @set: indicates whether or not options have been set
+ * @default_no_sample: Default to no sampling.
* @inject: indicates the event (not just the sample) must be fully synthesized
* because 'perf inject' will write it out
* @instructions: whether to synthesize 'instructions' events
@@ -80,6 +81,7 @@ enum itrace_period_type {
*/
struct itrace_synth_opts {
bool set;
+ bool default_no_sample;
bool inject;
bool instructions;
bool branches;
@@ -528,7 +530,8 @@ int perf_event__process_auxtrace_error(struct perf_tool *tool,
struct perf_session *session);
int itrace_parse_synth_opts(const struct option *opt, const char *str,
int unset);
-void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts);
+void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
+ bool no_sample);
size_t perf_event__fprintf_auxtrace_error(union perf_event *event, FILE *fp);
void perf_session__auxtrace_error_inc(struct perf_session *session,
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 2ae640257fdb..3b37d66dc533 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1432,7 +1432,8 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
etm->synth_opts = *session->itrace_synth_opts;
} else {
- itrace_synth_opts__set_default(&etm->synth_opts);
+ itrace_synth_opts__set_default(&etm->synth_opts,
+ session->itrace_synth_opts->default_no_sample);
etm->synth_opts.callchain = false;
}
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 7f0c83b6332b..3b3a3d55dca1 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -910,7 +910,8 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
bts->synth_opts = *session->itrace_synth_opts;
} else {
- itrace_synth_opts__set_default(&bts->synth_opts);
+ itrace_synth_opts__set_default(&bts->synth_opts,
+ session->itrace_synth_opts->default_no_sample);
if (session->itrace_synth_opts)
bts->synth_opts.thread_stack =
session->itrace_synth_opts->thread_stack;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..c6c934c8f615 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2554,7 +2554,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
pt->synth_opts = *session->itrace_synth_opts;
} else {
- itrace_synth_opts__set_default(&pt->synth_opts);
+ itrace_synth_opts__set_default(&pt->synth_opts,
+ session->itrace_synth_opts->default_no_sample);
if (use_browser != -1) {
pt->synth_opts.branches = false;
pt->synth_opts.callchain = true;
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 7/9] tools, perf, script: Add --call-trace and --call-ret-trace
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
` (5 preceding siblings ...)
2018-09-18 12:32 ` [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-18 12:32 ` [PATCH v5 8/9] tools, perf, script: Implement --graph-function Andi Kleen
2018-09-18 12:32 ` [PATCH v5 9/9] perf, tools, script: Support total cycles count Andi Kleen
8 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add short cut options to print PT call trace and call-ret-trace,
for calls and call and returns. Roughly corresponds to ftrace
function tracer and function graph tracer.
Just makes these common use cases nicer to use.
% perf record -a -e intel_pt// sleep 1
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_txn
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_event_set_state.part.71
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2: Print errors, power, ptwrite too
---
tools/perf/Documentation/perf-script.txt | 7 +++++++
tools/perf/builtin-script.c | 24 ++++++++++++++++++++++++
2 files changed, 31 insertions(+)
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 00c655ab4968..805baabd238e 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -390,6 +390,13 @@ include::itrace.txt[]
--xed::
Run xed disassembler on output. Requires installing the xed disassembler.
+--call-trace::
+ Show call stream for intel_pt traces. The CPUs are interleaved, but
+ can be filtered with -C.
+
+--call-ret-trace::
+ Show call and return stream for intel_pt traces.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d3a463766e2d..7e531f6b1d7a 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3103,6 +3103,26 @@ static int parse_xed(const struct option *opt __maybe_unused,
return 0;
}
+static int parse_call_trace(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent", 0);
+ itrace_parse_synth_opts(opt, "cewp", 0);
+ nanosecs = true;
+ return 0;
+}
+
+static int parse_callret_trace(const struct option *opt __maybe_unused,
+ const char *str __maybe_unused,
+ int unset __maybe_unused)
+{
+ parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent,+flags", 0);
+ itrace_parse_synth_opts(opt, "crewp", 0);
+ nanosecs = true;
+ return 0;
+}
+
int cmd_script(int argc, const char **argv)
{
bool show_full_info = false;
@@ -3194,6 +3214,10 @@ int cmd_script(int argc, const char **argv)
"Decode instructions from itrace", parse_insn_trace),
OPT_CALLBACK_OPTARG(0, "xed", NULL, NULL, NULL,
"Run xed disassembler on output", parse_xed),
+ OPT_CALLBACK_OPTARG(0, "call-trace", &itrace_synth_opts, NULL, NULL,
+ "Decode calls from from itrace", parse_call_trace),
+ OPT_CALLBACK_OPTARG(0, "call-ret-trace", &itrace_synth_opts, NULL, NULL,
+ "Decode calls and returns from itrace", parse_callret_trace),
OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
"Stop display of callgraph at these symbols"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 8/9] tools, perf, script: Implement --graph-function
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
` (6 preceding siblings ...)
2018-09-18 12:32 ` [PATCH v5 7/9] tools, perf, script: Add --call-trace and --call-ret-trace Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-18 12:32 ` [PATCH v5 9/9] perf, tools, script: Support total cycles count Andi Kleen
8 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a ftrace style --graph-function argument to perf script that allows
to print itrace function calls only below a given function. This
makes it easier to find the code of interest in a large trace.
% perf record -e intel_pt//k -a sleep 1
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) group_sched_in
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) event_sched_in.isra.107
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_set_state.part.71
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_update_time
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_pmu_disable
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_log_itrace_start
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) perf_event_update_userpage
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) calc_timer_values
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) sched_clock_cpu
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) arch_perf_update_userpage
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) __fentry__
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) using_native_sched_clock
swapper 0 [001] 194167.205660693: ([kernel.kallsyms]) sched_clock_stable
v2: Remove debug printout
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/Documentation/perf-script.txt | 4 +
tools/perf/builtin-script.c | 96 +++++++++++++++++++-----
tools/perf/util/symbol.h | 3 +-
tools/perf/util/thread.h | 2 +
4 files changed, 86 insertions(+), 19 deletions(-)
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 805baabd238e..a2b37ce48094 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -397,6 +397,10 @@ include::itrace.txt[]
--call-ret-trace::
Show call and return stream for intel_pt traces.
+--graph-function::
+ For itrace only show specified functions and their callees for
+ itrace. Multiple functions can be separated by comma.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7e531f6b1d7a..3b73c3226141 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1105,6 +1105,35 @@ static int perf_sample__fprintf_addr(struct perf_sample *sample,
return printed;
}
+static const char *resolve_branch_sym(struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct thread *thread,
+ struct addr_location *al,
+ u64 *ip)
+{
+ struct addr_location addr_al;
+ struct perf_event_attr *attr = &evsel->attr;
+ const char *name = NULL;
+
+ if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
+ if (sample_addr_correlates_sym(attr)) {
+ thread__resolve(thread, &addr_al, sample);
+ if (addr_al.sym)
+ name = addr_al.sym->name;
+ else
+ *ip = sample->addr;
+ } else {
+ *ip = sample->addr;
+ }
+ } else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
+ if (al->sym)
+ name = al->sym->name;
+ else
+ *ip = sample->ip;
+ }
+ return name;
+}
+
static int perf_sample__fprintf_callindent(struct perf_sample *sample,
struct perf_evsel *evsel,
struct thread *thread,
@@ -1112,7 +1141,6 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
{
struct perf_event_attr *attr = &evsel->attr;
size_t depth = thread_stack__depth(thread);
- struct addr_location addr_al;
const char *name = NULL;
static int spacing;
int len = 0;
@@ -1126,22 +1154,7 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
if (thread->ts && sample->flags & PERF_IP_FLAG_RETURN)
depth += 1;
- if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
- if (sample_addr_correlates_sym(attr)) {
- thread__resolve(thread, &addr_al, sample);
- if (addr_al.sym)
- name = addr_al.sym->name;
- else
- ip = sample->addr;
- } else {
- ip = sample->addr;
- }
- } else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
- if (al->sym)
- name = al->sym->name;
- else
- ip = sample->ip;
- }
+ name = resolve_branch_sym(sample, evsel, thread, al, &ip);
if (PRINT_FIELD(DSO) && !(PRINT_FIELD(IP) || PRINT_FIELD(ADDR))) {
dlen += fprintf(fp, "(");
@@ -1629,6 +1642,47 @@ static void perf_sample__fprint_metric(struct perf_script *script,
}
}
+static bool show_event(struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct thread *thread,
+ struct addr_location *al)
+{
+ int depth = thread_stack__depth(thread);
+
+ if (!symbol_conf.graph_function)
+ return true;
+
+ if (thread->filter) {
+ if (depth <= thread->filter_entry_depth) {
+ thread->filter = false;
+ return false;
+ }
+ return true;
+ } else {
+ const char *s = symbol_conf.graph_function;
+ u64 ip;
+ const char *name = resolve_branch_sym(sample, evsel, thread, al,
+ &ip);
+ unsigned nlen;
+
+ if (!name)
+ return false;
+ nlen = strlen(name);
+ while (*s) {
+ unsigned len = strcspn(s, ",");
+ if (nlen == len && !strncmp(name, s, len)) {
+ thread->filter = true;
+ thread->filter_entry_depth = depth;
+ return true;
+ }
+ s += len;
+ if (*s == ',')
+ s++;
+ }
+ return false;
+ }
+}
+
static void process_event(struct perf_script *script,
struct perf_sample *sample, struct perf_evsel *evsel,
struct addr_location *al,
@@ -1643,6 +1697,9 @@ static void process_event(struct perf_script *script,
if (output[type].fields == 0)
return;
+ if (!show_event(sample, evsel, thread, al))
+ return;
+
++es->samples;
perf_sample__fprintf_start(sample, thread, evsel,
@@ -3218,6 +3275,8 @@ int cmd_script(int argc, const char **argv)
"Decode calls from from itrace", parse_call_trace),
OPT_CALLBACK_OPTARG(0, "call-ret-trace", &itrace_synth_opts, NULL, NULL,
"Decode calls and returns from itrace", parse_callret_trace),
+ OPT_STRING(0, "graph-function", &symbol_conf.graph_function, "symbol[,symbol...]",
+ "Only print symbols and callees with --call-trace/--call-ret-trace"),
OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
"Stop display of callgraph at these symbols"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
@@ -3473,7 +3532,8 @@ int cmd_script(int argc, const char **argv)
script.session = session;
script__setup_sample_type(&script);
- if (output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT)
+ if ((output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT) ||
+ symbol_conf.graph_function)
itrace_synth_opts.thread_stack = true;
session->itrace_synth_opts = &itrace_synth_opts;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index f25fae4b5743..d726a8a7bb1b 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -123,7 +123,8 @@ struct symbol_conf {
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
- *field_sep;
+ *field_sep,
+ *graph_function;
const char *default_guest_vmlinux_name,
*default_guest_kallsyms,
*default_guest_modules;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 07606aa6998d..36c09a9904e6 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -42,6 +42,8 @@ struct thread {
void *addr_space;
struct unwind_libunwind_ops *unwind_libunwind_ops;
#endif
+ bool filter;
+ int filter_entry_depth;
};
struct machine;
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v5 9/9] perf, tools, script: Support total cycles count
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
` (7 preceding siblings ...)
2018-09-18 12:32 ` [PATCH v5 8/9] tools, perf, script: Implement --graph-function Andi Kleen
@ 2018-09-18 12:32 ` Andi Kleen
2018-09-19 18:31 ` Arnaldo Carvalho de Melo
8 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2018-09-18 12:32 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
For perf script brstackinsn also print a running cycles count.
This makes it easier to calculate cycle deltas for code sections
measured with LBRs.
% perf record -b -a sleep 1
% perf script -F +brstackinsn
...
_dl_sysdep_start+330:
00007eff9f20583a insn: 75 c4 # PRED 24 cycles [24]
00007eff9f205800 insn: 48 83 e8 03
00007eff9f205804 insn: 48 83 f8 1e
00007eff9f205808 insn: 77 26
00007eff9f20580a insn: 48 63 04 81
00007eff9f20580e insn: 48 01 c8
00007eff9f205811 insn: ff e0 # MISPRED 31 cycles [7] 0.71 IPC
00007eff9f2059c0 insn: 44 8b 62 08
00007eff9f2059c4 insn: e9 67 fe ff ff # PRED 55 cycles [24] 0.04 IPC
00007eff9f205830 insn: 48 83 c2 10
00007eff9f205834 insn: 48 8b 02
00007eff9f205837 insn: 48 85 c0
00007eff9f20583a insn: 75 c4 # PRED 68 cycles [13] 0.23 IPC
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-script.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 3b73c3226141..84895c861b84 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -913,7 +913,7 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
struct perf_insn *x, u8 *inbuf, int len,
- int insn, FILE *fp)
+ int insn, FILE *fp, int *total_cycles)
{
int printed = fprintf(fp, "\t%016" PRIx64 "\t%-30s\t#%s%s%s%s", ip,
dump_insn(x, ip, inbuf, len, NULL),
@@ -922,7 +922,8 @@ static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
en->flags.in_tx ? " INTX" : "",
en->flags.abort ? " ABORT" : "");
if (en->flags.cycles) {
- printed += fprintf(fp, " %d cycles", en->flags.cycles);
+ *total_cycles += en->flags.cycles;
+ printed += fprintf(fp, " %d cycles [%d]", *total_cycles, en->flags.cycles);
if (insn)
printed += fprintf(fp, " %.2f IPC", (float)insn / en->flags.cycles);
}
@@ -979,6 +980,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
u8 buffer[MAXBB];
unsigned off;
struct symbol *lastsym = NULL;
+ int total_cycles = 0;
if (!(br && br->nr))
return 0;
@@ -999,7 +1001,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
printed += ip__fprintf_sym(br->entries[nr - 1].from, thread,
x.cpumode, x.cpu, &lastsym, attr, fp);
printed += ip__fprintf_jump(br->entries[nr - 1].from, &br->entries[nr - 1],
- &x, buffer, len, 0, fp);
+ &x, buffer, len, 0, fp, &total_cycles);
}
/* Print all blocks */
@@ -1027,7 +1029,9 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
printed += ip__fprintf_sym(ip, thread, x.cpumode, x.cpu, &lastsym, attr, fp);
if (ip == end) {
- printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn, fp);
+ printed += ip__fprintf_jump(ip, &br->entries[i],
+ &x, buffer + off, len - off,
+ insn, fp, &total_cycles);
break;
} else {
printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", ip,
--
2.17.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls
2018-09-18 12:32 ` [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls Andi Kleen
@ 2018-09-18 13:24 ` Adrian Hunter
2018-09-19 18:28 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 30+ messages in thread
From: Adrian Hunter @ 2018-09-18 13:24 UTC (permalink / raw)
To: Andi Kleen, acme; +Cc: jolsa, linux-kernel, kim.phillips, Andi Kleen
On 18/09/18 15:32, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> By default perf script for itrace outputs sampled instructions or
> branches. In my experience this is confusing to users because it's
> hard to correlate with real program behavior. The sampling makes
> sense for tools like report that actually sample to reduce
> the run time, but run time is normally not a problem for perf script.
> It's better to give an accurate representation of the program flow.
>
> Default perf script to output all calls for itrace. That's a much saner
> default. The old behavior can be still requested with
> perf script --itrace=ibxwpe100000
>
> v2: Fix ETM build failure
> v3: Really fix ETM build failure (Kim Phillips)
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> tools/perf/Documentation/itrace.txt | 7 ++++---
> tools/perf/builtin-script.c | 5 ++++-
> tools/perf/util/auxtrace.c | 17 ++++++++++++-----
> tools/perf/util/auxtrace.h | 5 ++++-
> tools/perf/util/cs-etm.c | 3 ++-
> tools/perf/util/intel-bts.c | 3 ++-
> tools/perf/util/intel-pt.c | 3 ++-
> 7 files changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
> index a3abe04c779d..c2182cbabde3 100644
> --- a/tools/perf/Documentation/itrace.txt
> +++ b/tools/perf/Documentation/itrace.txt
> @@ -11,10 +11,11 @@
> l synthesize last branch entries (use with i or x)
> s skip initial number of events
>
> - The default is all events i.e. the same as --itrace=ibxwpe
> + The default is all events i.e. the same as --itrace=ibxwpe,
> + except for perf script where it is --itrace=ce
>
> - In addition, the period (default 100000) for instructions events
> - can be specified in units of:
> + In addition, the period (default 100000, except for perf script where it is 1)
> + for instructions events can be specified in units of:
>
> i instructions
> t ticks
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 57df6f7cff28..d3a463766e2d 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -3112,7 +3112,10 @@ int cmd_script(int argc, const char **argv)
> char *rec_script_path = NULL;
> char *rep_script_path = NULL;
> struct perf_session *session;
> - struct itrace_synth_opts itrace_synth_opts = { .set = false, };
> + struct itrace_synth_opts itrace_synth_opts = {
> + .set = false,
> + .default_no_sample = true,
> + };
> char *script_path = NULL;
> const char **__argv;
> int i, j, err = 0;
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index db1511359c5e..2363a41f0e81 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -964,16 +964,23 @@ s64 perf_event__process_auxtrace(struct perf_tool *tool,
> #define PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ 64
> #define PERF_ITRACE_MAX_LAST_BRANCH_SZ 1024
>
> -void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
> +void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
> + bool no_sample)
Rather than pass no_sample, what about passing the whole of the tool's
options i.e.
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index db1511359c5e..d4db61626a5f 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -964,8 +964,14 @@ s64 perf_event__process_auxtrace(struct perf_tool *tool,
#define PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ 64
#define PERF_ITRACE_MAX_LAST_BRANCH_SZ 1024
-void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
+bool itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
+ const struct itrace_synth_opts *tool_synth_opts)
{
+ if (tool_synth_opts && tool_synth_opts->set) {
+ *synth_opts = *tool_synth_opts;
+ return false;
+ }
+
synth_opts->instructions = true;
synth_opts->branches = true;
synth_opts->transactions = true;
@@ -977,6 +983,8 @@ void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
synth_opts->callchain_sz = PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
synth_opts->last_branch_sz = PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ;
synth_opts->initial_skip = 0;
+
+ return true;
}
/*
@@ -1001,7 +1009,7 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
}
if (!str) {
- itrace_synth_opts__set_default(synth_opts);
+ itrace_synth_opts__set_default(synth_opts, NULL);
return 0;
}
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a516a891d7ea..172c92a8c596 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -528,7 +528,8 @@ int perf_event__process_auxtrace_error(struct perf_tool *tool,
struct perf_session *session);
int itrace_parse_synth_opts(const struct option *opt, const char *str,
int unset);
-void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts);
+bool itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
+ const struct itrace_synth_opts *tool_synth_opts);
size_t perf_event__fprintf_auxtrace_error(union perf_event *event, FILE *fp);
void perf_session__auxtrace_error_inc(struct perf_session *session,
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 2ae640257fdb..45b9a2b9f473 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1429,12 +1429,9 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
return 0;
}
- if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
- etm->synth_opts = *session->itrace_synth_opts;
- } else {
- itrace_synth_opts__set_default(&etm->synth_opts);
+ if (itrace_synth_opts__set_default(&etm->synth_opts,
+ session->itrace_synth_opts))
etm->synth_opts.callchain = false;
- }
err = cs_etm__synth_events(etm, session);
if (err)
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 7f0c83b6332b..2d96f4743789 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -907,12 +907,9 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
if (dump_trace)
return 0;
- if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
- bts->synth_opts = *session->itrace_synth_opts;
- } else {
- itrace_synth_opts__set_default(&bts->synth_opts);
- if (session->itrace_synth_opts)
- bts->synth_opts.thread_stack =
+ if (itrace_synth_opts__set_default(&bts->synth_opts,
+ session->itrace_synth_opts)) {
+ bts->synth_opts.thread_stack =
session->itrace_synth_opts->thread_stack;
}
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..b6ac8ac2f4c3 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2551,10 +2551,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
goto err_delete_thread;
}
- if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
- pt->synth_opts = *session->itrace_synth_opts;
- } else {
- itrace_synth_opts__set_default(&pt->synth_opts);
+ if (itrace_synth_opts__set_default(&pt->synth_opts,
+ session->itrace_synth_opts)) {
if (use_browser != -1) {
pt->synth_opts.branches = false;
pt->synth_opts.callchain = true;
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v5 1/9] perf tools: Report itrace options in help
2018-09-18 12:32 ` [PATCH v5 1/9] perf tools: Report itrace options in help Andi Kleen
@ 2018-09-19 18:13 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-19 18:13 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
Em Tue, Sep 18, 2018 at 05:32:06AM -0700, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> I often forget all the options that --itrace accepts. Instead of burying
> them in the man page only report them in the normal command line help
> too to make them easier accessible.
Looks reasonable, applied.
- Arnaldo
> v2: Align
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> tools/perf/builtin-inject.c | 3 ++-
> tools/perf/builtin-report.c | 2 +-
> tools/perf/builtin-script.c | 2 +-
> tools/perf/util/auxtrace.h | 19 +++++++++++++++++++
> 4 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> index a3b346359ba0..b649a0511ed8 100644
> --- a/tools/perf/builtin-inject.c
> +++ b/tools/perf/builtin-inject.c
> @@ -803,7 +803,8 @@ int cmd_inject(int argc, const char **argv)
> "kallsyms pathname"),
> OPT_BOOLEAN('f', "force", &data.force, "don't complain, do it"),
> OPT_CALLBACK_OPTARG(0, "itrace", &inject.itrace_synth_opts,
> - NULL, "opts", "Instruction Tracing options",
> + NULL, "opts", "Instruction Tracing options\n"
> + ITRACE_HELP,
> itrace_parse_synth_opts),
> OPT_BOOLEAN(0, "strip", &inject.strip,
> "strip non-synthesized events (use with --itrace)"),
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 76e12bcd1765..191a389b45d1 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -1105,7 +1105,7 @@ int cmd_report(int argc, const char **argv)
> OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
> "how to display percentage of filtered entries", parse_filter_percentage),
> OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
> - "Instruction Tracing options",
> + "Instruction Tracing options\n" ITRACE_HELP,
> itrace_parse_synth_opts),
> OPT_BOOLEAN(0, "full-source-path", &srcline_full_filename,
> "Show full source file name path for source lines"),
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 6176bae177c2..49d84c85e846 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -3195,7 +3195,7 @@ int cmd_script(int argc, const char **argv)
> OPT_BOOLEAN(0, "ns", &nanosecs,
> "Use 9 decimal places when displaying time"),
> OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
> - "Instruction Tracing options",
> + "Instruction Tracing options\n" ITRACE_HELP,
> itrace_parse_synth_opts),
> OPT_BOOLEAN(0, "full-source-path", &srcline_full_filename,
> "Show full source file name path for source lines"),
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index 71fc3bd74299..a516a891d7ea 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -577,6 +577,23 @@ static inline void auxtrace__free(struct perf_session *session)
> return session->auxtrace->free(session);
> }
>
> +#define ITRACE_HELP \
> +" i: synthesize instructions events\n" \
> +" b: synthesize branches events\n" \
> +" c: synthesize branches events (calls only)\n" \
> +" r: synthesize branches events (returns only)\n" \
> +" x: synthesize transactions events\n" \
> +" w: synthesize ptwrite events\n" \
> +" p: synthesize power events\n" \
> +" e: synthesize error events\n" \
> +" d: create a debug log\n" \
> +" g[len]: synthesize a call chain (use with i or x)\n" \
> +" l[len]: synthesize last branch entries (use with i or x)\n" \
> +" sNUMBER: skip initial number of events\n" \
> +" PERIOD[ns|us|ms|i|t]: specify period to sample stream\n" \
> +" concatenate multiple options. Default is ibxwpe or cewp\n"
> +
> +
> #else
>
> static inline struct auxtrace_record *
> @@ -717,6 +734,8 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
> struct perf_evlist *evlist, int idx,
> bool per_cpu);
>
> +#define ITRACE_HELP ""
> +
> #endif
>
> #endif
> --
> 2.17.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 2/9] tools, pager: Support overwriting the pager
2018-09-18 12:32 ` [PATCH v5 2/9] tools, pager: Support overwriting the pager Andi Kleen
@ 2018-09-19 18:17 ` Arnaldo Carvalho de Melo
2018-09-26 8:44 ` [tip:perf/core] tools lib subcmd: " tip-bot for Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-19 18:17 UTC (permalink / raw)
To: Andi Kleen
Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen,
Josh Poimboeuf, Jiri Olsa, Namhyung Kim
Em Tue, Sep 18, 2018 at 05:32:07AM -0700, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> Add an interface to the auto pager code that allows callers
> to overwrite the pager.
Applied,
- Arnaldo
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> tools/lib/subcmd/pager.c | 11 ++++++++++-
> tools/lib/subcmd/pager.h | 1 +
> 2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/subcmd/pager.c b/tools/lib/subcmd/pager.c
> index 9997a8805a82..e3d47b59b14d 100644
> --- a/tools/lib/subcmd/pager.c
> +++ b/tools/lib/subcmd/pager.c
> @@ -23,6 +23,13 @@ void pager_init(const char *pager_env)
> subcmd_config.pager_env = pager_env;
> }
>
> +static const char *forced_pager;
> +
> +void force_pager(const char *pager)
> +{
> + forced_pager = pager;
> +}
> +
> static void pager_preexec(void)
> {
> /*
> @@ -66,7 +73,9 @@ void setup_pager(void)
> const char *pager = getenv(subcmd_config.pager_env);
> struct winsize sz;
>
> - if (!isatty(1))
> + if (forced_pager)
> + pager = forced_pager;
> + if (!isatty(1) && !forced_pager)
> return;
> if (ioctl(1, TIOCGWINSZ, &sz) == 0)
> pager_columns = sz.ws_col;
> diff --git a/tools/lib/subcmd/pager.h b/tools/lib/subcmd/pager.h
> index f1a53cf29880..a818964693ab 100644
> --- a/tools/lib/subcmd/pager.h
> +++ b/tools/lib/subcmd/pager.h
> @@ -7,5 +7,6 @@ extern void pager_init(const char *pager_env);
> extern void setup_pager(void);
> extern int pager_in_use(void);
> extern int pager_get_columns(void);
> +extern void force_pager(const char *);
>
> #endif /* __SUBCMD_PAGER_H */
> --
> 2.17.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr
2018-09-18 12:32 ` [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr Andi Kleen
@ 2018-09-19 18:24 ` Arnaldo Carvalho de Melo
2018-09-26 8:44 ` [tip:perf/core] perf " tip-bot for Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-19 18:24 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
Em Tue, Sep 18, 2018 at 05:32:09AM -0700, Andi Kleen escreveu:
> This allows to remove a lot of noise in callindent output by
> dropping the numerical addresses, which are not all that useful.
>
> Before
>
> % perf script --itrace=cr -F +callindent
> swapper 0 [000] 156546.354971: 1 branches: pt_config 0 [unknown] ([unknown]) => ffffffff81010486 pt_config ([kernel.kallsyms])
> swapper 0 [000] 156546.354971: 1 branches: pt_config ffffffff81010499 pt_config ([kernel.kallsyms]) => ffffffff8101063e pt_event_add ([kernel.kallsyms])
>
> After
>
> % perf script --itrace=cr -F +callindent,-ip,-sym,-symoff
> swapper 0 [000] 156546.354971: 1 branches: pt_config
> swapper 0 [000] 156546.354971: 1 branches: pt_config
Tested, seems to work as advertised,
Applied,
- Arnaldo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 5/9] perf, tools, script: Print DSO for callindent
2018-09-18 12:32 ` [PATCH v5 5/9] perf, tools, script: Print DSO for callindent Andi Kleen
@ 2018-09-19 18:26 ` Arnaldo Carvalho de Melo
2018-09-26 8:45 ` [tip:perf/core] perf " tip-bot for Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-19 18:26 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
Em Tue, Sep 18, 2018 at 05:32:10AM -0700, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> Now that we don't need to print the IP/ADDR for callindent the DSO
> is also not printed. It's useful for some cases, so add an own DSO
> printout for callindent for the case when IP/ADDR is not enabled.
>
> Before:
>
> % perf script --itrace=cr -F +callindent,-ip,-sym,-symoff,-addr
> swapper 0 [000] 3377.917072: 1 branches: pt_config
> swapper 0 [000] 3377.917072: 1 branches: pt_config
> swapper 0 [000] 3377.917072: 1 branches: pt_event_add
> swapper 0 [000] 3377.917072: 1 branches: perf_pmu_enable
> swapper 0 [000] 3377.917072: 1 branches: perf_pmu_nop_void
> swapper 0 [000] 3377.917072: 1 branches: event_sched_in.isra.107
> swapper 0 [000] 3377.917072: 1 branches: __x86_indirect_thunk_rax
> swapper 0 [000] 3377.917072: 1 branches: perf_pmu_nop_int
> swapper 0 [000] 3377.917072: 1 branches: group_sched_in
> swapper 0 [000] 3377.917072: 1 branches: event_filter_match
> swapper 0 [000] 3377.917072: 1 branches: event_filter_match
> swapper 0 [000] 3377.917072: 1 branches: group_sched_in
>
> After:
>
> swapper 0 [000] 3377.917072: 1 branches: ([unknown]) pt_config
> swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) pt_config
Tested, applied.
- Arnaldo
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls
2018-09-18 13:24 ` Adrian Hunter
@ 2018-09-19 18:28 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-19 18:28 UTC (permalink / raw)
To: Adrian Hunter; +Cc: Andi Kleen, jolsa, linux-kernel, kim.phillips, Andi Kleen
Em Tue, Sep 18, 2018 at 04:24:55PM +0300, Adrian Hunter escreveu:
> On 18/09/18 15:32, Andi Kleen wrote:
> > +++ b/tools/perf/util/auxtrace.c
> > @@ -964,16 +964,23 @@ s64 perf_event__process_auxtrace(struct perf_tool *tool,
> > #define PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ 64
> > #define PERF_ITRACE_MAX_LAST_BRANCH_SZ 1024
> >
> > -void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
> > +void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
> > + bool no_sample)
>
> Rather than pass no_sample, what about passing the whole of the tool's
> options i.e.
This can be done on top, right? Or will you need the other options on
work you're doing right now and that would be an advantage to do this
right now?
- Arnaldo
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index db1511359c5e..d4db61626a5f 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -964,8 +964,14 @@ s64 perf_event__process_auxtrace(struct perf_tool *tool,
> #define PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ 64
> #define PERF_ITRACE_MAX_LAST_BRANCH_SZ 1024
>
> -void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
> +bool itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
> + const struct itrace_synth_opts *tool_synth_opts)
> {
> + if (tool_synth_opts && tool_synth_opts->set) {
> + *synth_opts = *tool_synth_opts;
> + return false;
> + }
> +
> synth_opts->instructions = true;
> synth_opts->branches = true;
> synth_opts->transactions = true;
> @@ -977,6 +983,8 @@ void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
> synth_opts->callchain_sz = PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
> synth_opts->last_branch_sz = PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ;
> synth_opts->initial_skip = 0;
> +
> + return true;
> }
>
> /*
> @@ -1001,7 +1009,7 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
> }
>
> if (!str) {
> - itrace_synth_opts__set_default(synth_opts);
> + itrace_synth_opts__set_default(synth_opts, NULL);
> return 0;
> }
>
> diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
> index a516a891d7ea..172c92a8c596 100644
> --- a/tools/perf/util/auxtrace.h
> +++ b/tools/perf/util/auxtrace.h
> @@ -528,7 +528,8 @@ int perf_event__process_auxtrace_error(struct perf_tool *tool,
> struct perf_session *session);
> int itrace_parse_synth_opts(const struct option *opt, const char *str,
> int unset);
> -void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts);
> +bool itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
> + const struct itrace_synth_opts *tool_synth_opts);
>
> size_t perf_event__fprintf_auxtrace_error(union perf_event *event, FILE *fp);
> void perf_session__auxtrace_error_inc(struct perf_session *session,
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 2ae640257fdb..45b9a2b9f473 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -1429,12 +1429,9 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
> return 0;
> }
>
> - if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
> - etm->synth_opts = *session->itrace_synth_opts;
> - } else {
> - itrace_synth_opts__set_default(&etm->synth_opts);
> + if (itrace_synth_opts__set_default(&etm->synth_opts,
> + session->itrace_synth_opts))
> etm->synth_opts.callchain = false;
> - }
>
> err = cs_etm__synth_events(etm, session);
> if (err)
> diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
> index 7f0c83b6332b..2d96f4743789 100644
> --- a/tools/perf/util/intel-bts.c
> +++ b/tools/perf/util/intel-bts.c
> @@ -907,12 +907,9 @@ int intel_bts_process_auxtrace_info(union perf_event *event,
> if (dump_trace)
> return 0;
>
> - if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
> - bts->synth_opts = *session->itrace_synth_opts;
> - } else {
> - itrace_synth_opts__set_default(&bts->synth_opts);
> - if (session->itrace_synth_opts)
> - bts->synth_opts.thread_stack =
> + if (itrace_synth_opts__set_default(&bts->synth_opts,
> + session->itrace_synth_opts)) {
> + bts->synth_opts.thread_stack =
> session->itrace_synth_opts->thread_stack;
> }
>
> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
> index aec68908d604..b6ac8ac2f4c3 100644
> --- a/tools/perf/util/intel-pt.c
> +++ b/tools/perf/util/intel-pt.c
> @@ -2551,10 +2551,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
> goto err_delete_thread;
> }
>
> - if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
> - pt->synth_opts = *session->itrace_synth_opts;
> - } else {
> - itrace_synth_opts__set_default(&pt->synth_opts);
> + if (itrace_synth_opts__set_default(&pt->synth_opts,
> + session->itrace_synth_opts)) {
> if (use_browser != -1) {
> pt->synth_opts.branches = false;
> pt->synth_opts.callchain = true;
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 9/9] perf, tools, script: Support total cycles count
2018-09-18 12:32 ` [PATCH v5 9/9] perf, tools, script: Support total cycles count Andi Kleen
@ 2018-09-19 18:31 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 30+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-19 18:31 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, adrian.hunter, linux-kernel, kim.phillips, Andi Kleen
Em Tue, Sep 18, 2018 at 05:32:14AM -0700, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> For perf script brstackinsn also print a running cycles count.
> This makes it easier to calculate cycle deltas for code sections
> measured with LBRs.
>
> % perf record -b -a sleep 1
> % perf script -F +brstackinsn
> ...
> _dl_sysdep_start+330:
> 00007eff9f20583a insn: 75 c4 # PRED 24 cycles [24]
> 00007eff9f205800 insn: 48 83 e8 03
> 00007eff9f205804 insn: 48 83 f8 1e
> 00007eff9f205808 insn: 77 26
> 00007eff9f20580a insn: 48 63 04 81
> 00007eff9f20580e insn: 48 01 c8
> 00007eff9f205811 insn: ff e0 # MISPRED 31 cycles [7] 0.71 IPC
> 00007eff9f2059c0 insn: 44 8b 62 08
> 00007eff9f2059c4 insn: e9 67 fe ff ff # PRED 55 cycles [24] 0.04 IPC
> 00007eff9f205830 insn: 48 83 c2 10
> 00007eff9f205834 insn: 48 8b 02
> 00007eff9f205837 insn: 48 85 c0
> 00007eff9f20583a insn: 75 c4 # PRED 68 cycles [13] 0.23 IPC
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> tools/perf/builtin-script.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 3b73c3226141..84895c861b84 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -913,7 +913,7 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
>
> static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
> struct perf_insn *x, u8 *inbuf, int len,
> - int insn, FILE *fp)
> + int insn, FILE *fp, int *total_cycles)
> {
> int printed = fprintf(fp, "\t%016" PRIx64 "\t%-30s\t#%s%s%s%s", ip,
> dump_insn(x, ip, inbuf, len, NULL),
> @@ -922,7 +922,8 @@ static int ip__fprintf_jump(uint64_t ip, struct branch_entry *en,
> en->flags.in_tx ? " INTX" : "",
> en->flags.abort ? " ABORT" : "");
> if (en->flags.cycles) {
> - printed += fprintf(fp, " %d cycles", en->flags.cycles);
> + *total_cycles += en->flags.cycles;
> + printed += fprintf(fp, " %d cycles [%d]", *total_cycles, en->flags.cycles);
> if (insn)
> printed += fprintf(fp, " %.2f IPC", (float)insn / en->flags.cycles);
> }
> @@ -979,6 +980,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
> u8 buffer[MAXBB];
> unsigned off;
> struct symbol *lastsym = NULL;
> + int total_cycles = 0;
>
> if (!(br && br->nr))
> return 0;
> @@ -999,7 +1001,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
> printed += ip__fprintf_sym(br->entries[nr - 1].from, thread,
> x.cpumode, x.cpu, &lastsym, attr, fp);
> printed += ip__fprintf_jump(br->entries[nr - 1].from, &br->entries[nr - 1],
> - &x, buffer, len, 0, fp);
> + &x, buffer, len, 0, fp, &total_cycles);
> }
>
> /* Print all blocks */
> @@ -1027,7 +1029,9 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
>
> printed += ip__fprintf_sym(ip, thread, x.cpumode, x.cpu, &lastsym, attr, fp);
> if (ip == end) {
> - printed += ip__fprintf_jump(ip, &br->entries[i], &x, buffer + off, len - off, insn, fp);
> + printed += ip__fprintf_jump(ip, &br->entries[i],
> + &x, buffer + off, len - off,
> + insn, fp, &total_cycles);
Please don't reflow these things like that, just align the next line
with the new thing, parses quicker when reviewing.
- Arnaldo
> break;
> } else {
> printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", ip,
> --
> 2.17.1
^ permalink raw reply [flat|nested] 30+ messages in thread
* [tip:perf/core] tools lib subcmd: Support overwriting the pager
2018-09-18 12:32 ` [PATCH v5 2/9] tools, pager: Support overwriting the pager Andi Kleen
2018-09-19 18:17 ` Arnaldo Carvalho de Melo
@ 2018-09-26 8:44 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: tip-bot for Andi Kleen @ 2018-09-26 8:44 UTC (permalink / raw)
To: linux-tip-commits
Cc: jpoimboe, ak, namhyung, linux-kernel, acme, mingo, jolsa, hpa,
adrian.hunter, kim.phillips, tglx
Commit-ID: 03a1f49f26482cf832e7f0157ae5da716d927701
Gitweb: https://git.kernel.org/tip/03a1f49f26482cf832e7f0157ae5da716d927701
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 18 Sep 2018 05:32:07 -0700
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 19 Sep 2018 15:16:19 -0300
tools lib subcmd: Support overwriting the pager
Add an interface to the auto pager code that allows callers to overwrite
the pager.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180918123214.26728-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
| 11 ++++++++++-
| 1 +
2 files changed, 11 insertions(+), 1 deletion(-)
--git a/tools/lib/subcmd/pager.c b/tools/lib/subcmd/pager.c
index 9997a8805a82..e3d47b59b14d 100644
--- a/tools/lib/subcmd/pager.c
+++ b/tools/lib/subcmd/pager.c
@@ -23,6 +23,13 @@ void pager_init(const char *pager_env)
subcmd_config.pager_env = pager_env;
}
+static const char *forced_pager;
+
+void force_pager(const char *pager)
+{
+ forced_pager = pager;
+}
+
static void pager_preexec(void)
{
/*
@@ -66,7 +73,9 @@ void setup_pager(void)
const char *pager = getenv(subcmd_config.pager_env);
struct winsize sz;
- if (!isatty(1))
+ if (forced_pager)
+ pager = forced_pager;
+ if (!isatty(1) && !forced_pager)
return;
if (ioctl(1, TIOCGWINSZ, &sz) == 0)
pager_columns = sz.ws_col;
--git a/tools/lib/subcmd/pager.h b/tools/lib/subcmd/pager.h
index f1a53cf29880..a818964693ab 100644
--- a/tools/lib/subcmd/pager.h
+++ b/tools/lib/subcmd/pager.h
@@ -7,5 +7,6 @@ extern void pager_init(const char *pager_env);
extern void setup_pager(void);
extern int pager_in_use(void);
extern int pager_get_columns(void);
+extern void force_pager(const char *);
#endif /* __SUBCMD_PAGER_H */
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [tip:perf/core] perf script: Allow sym and dso without ip, addr
2018-09-18 12:32 ` [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr Andi Kleen
2018-09-19 18:24 ` Arnaldo Carvalho de Melo
@ 2018-09-26 8:44 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: tip-bot for Andi Kleen @ 2018-09-26 8:44 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, tglx, adrian.hunter, hpa, kim.phillips, ak, linux-kernel,
mingo, acme
Commit-ID: 37fed3de555199733805a2d3e03aee7727c09ea4
Gitweb: https://git.kernel.org/tip/37fed3de555199733805a2d3e03aee7727c09ea4
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 18 Sep 2018 05:32:09 -0700
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 19 Sep 2018 15:20:03 -0300
perf script: Allow sym and dso without ip, addr
Currently sym and dso require printing ip and addr because the print
function is tied to those outputs. With callindent it makes sense to
print the symbol or dso without numerical IP or ADDR. So change the
dependency check to only check the underlying attribute.
Also the branch target output relies on the user_set flag to determine
if the branch target should be implicitely printed. When modifying the
fields with + or - also set user_set, so that ADDR can be removed. We
also need to set wildcard_set to make the initial sanity check pass.
This allows to remove a lot of noise in callindent output by dropping
the numerical addresses, which are not all that useful.
Before
% perf script --itrace=cr -F +callindent
swapper 0 [000] 156546.354971: 1 branches: pt_config 0 [unknown] ([unknown]) => ffffffff81010486 pt_config ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: pt_config ffffffff81010499 pt_config ([kernel.kallsyms]) => ffffffff8101063e pt_event_add ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: pt_event_add ffffffff81010635 pt_event_add ([kernel.kallsyms]) => ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_enable ffffffff8115e726 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff811579b0 perf_pmu_enable ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_void ffffffff81151730 perf_pmu_nop_void ([kernel.kallsyms]) => ffffffff8115e72b event_sched_in.isra.107 ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: event_sched_in.isra.107 ffffffff8115e737 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff8115e7a5 group_sched_in ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: __x86_indirect_thunk_rax ffffffff8115e7f6 group_sched_in ([kernel.kallsyms]) => ffffffff81a03000 __x86_indirect_thunk_rax ([kernel.kallsyms])
After
% perf script --itrace=cr -F +callindent,-ip,-sym,-symoff
swapper 0 [000] 156546.354971: 1 branches: pt_config
swapper 0 [000] 156546.354971: 1 branches: pt_config
swapper 0 [000] 156546.354971: 1 branches: pt_event_add
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_enable
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_void
swapper 0 [000] 156546.354971: 1 branches: event_sched_in.isra.107
swapper 0 [000] 156546.354971: 1 branches: __x86_indirect_thunk_rax
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_int
swapper 0 [000] 156546.354971: 1 branches: group_sched_in
swapper 0 [000] 156546.354971: 1 branches: event_filter_match
swapper 0 [000] 156546.354971: 1 branches: event_filter_match
swapper 0 [000] 156546.354971: 1 branches: group_sched_in
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180918123214.26728-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 66699dd351e0..7aa59696e97a 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -406,9 +406,10 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
PERF_OUTPUT_WEIGHT))
return -EINVAL;
- if (PRINT_FIELD(SYM) && !PRINT_FIELD(IP) && !PRINT_FIELD(ADDR)) {
+ if (PRINT_FIELD(SYM) &&
+ !(evsel->attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
pr_err("Display of symbols requested but neither sample IP nor "
- "sample address\nis selected. Hence, no addresses to convert "
+ "sample address\navailable. Hence, no addresses to convert "
"to symbols.\n");
return -EINVAL;
}
@@ -417,10 +418,9 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
"selected.\n");
return -EINVAL;
}
- if (PRINT_FIELD(DSO) && !PRINT_FIELD(IP) && !PRINT_FIELD(ADDR) &&
- !PRINT_FIELD(BRSTACK) && !PRINT_FIELD(BRSTACKSYM) && !PRINT_FIELD(BRSTACKOFF)) {
- pr_err("Display of DSO requested but no address to convert. Select\n"
- "sample IP, sample address, brstack, brstacksym, or brstackoff.\n");
+ if (PRINT_FIELD(DSO) &&
+ !(evsel->attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
+ pr_err("Display of DSO requested but no address to convert.\n");
return -EINVAL;
}
if (PRINT_FIELD(SRCLINE) && !PRINT_FIELD(IP)) {
@@ -2491,6 +2491,8 @@ parse:
output[j].fields &= ~all_output_options[i].field;
else
output[j].fields |= all_output_options[i].field;
+ output[j].user_set = true;
+ output[j].wildcard_set = true;
}
}
} else {
@@ -2501,7 +2503,8 @@ parse:
rc = -EINVAL;
goto out;
}
- output[type].fields |= all_output_options[i].field;
+ output[type].user_set = true;
+ output[type].wildcard_set = true;
}
}
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [tip:perf/core] perf script: Print DSO for callindent
2018-09-18 12:32 ` [PATCH v5 5/9] perf, tools, script: Print DSO for callindent Andi Kleen
2018-09-19 18:26 ` Arnaldo Carvalho de Melo
@ 2018-09-26 8:45 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: tip-bot for Andi Kleen @ 2018-09-26 8:45 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, hpa, ak, jolsa, mingo, kim.phillips, adrian.hunter, acme,
linux-kernel
Commit-ID: a78cdee6fbb136694334ade4cedb331a9d0b4d5e
Gitweb: https://git.kernel.org/tip/a78cdee6fbb136694334ade4cedb331a9d0b4d5e
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 18 Sep 2018 05:32:10 -0700
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 19 Sep 2018 15:25:51 -0300
perf script: Print DSO for callindent
Now that we don't need to print the IP/ADDR for callindent the DSO is
also not printed. It's useful for some cases, so add an own DSO printout
for callindent for the case when IP/ADDR is not enabled.
Before:
% perf script --itrace=cr -F +callindent,-ip,-sym,-symoff,-addr
swapper 0 [000] 3377.917072: 1 branches: pt_config
swapper 0 [000] 3377.917072: 1 branches: pt_config
swapper 0 [000] 3377.917072: 1 branches: pt_event_add
swapper 0 [000] 3377.917072: 1 branches: perf_pmu_enable
swapper 0 [000] 3377.917072: 1 branches: perf_pmu_nop_void
swapper 0 [000] 3377.917072: 1 branches: event_sched_in.isra.107
swapper 0 [000] 3377.917072: 1 branches: __x86_indirect_thunk_rax
swapper 0 [000] 3377.917072: 1 branches: perf_pmu_nop_int
swapper 0 [000] 3377.917072: 1 branches: group_sched_in
swapper 0 [000] 3377.917072: 1 branches: event_filter_match
swapper 0 [000] 3377.917072: 1 branches: event_filter_match
swapper 0 [000] 3377.917072: 1 branches: group_sched_in
After:
swapper 0 [000] 3377.917072: 1 branches: ([unknown]) pt_config
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) pt_config
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) pt_event_add
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_enable
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_nop_void
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_sched_in.isra.107
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_nop_int
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) group_sched_in
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_filter_match
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_filter_match
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) group_sched_in
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) __x86_indirect_thunk_rax
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) perf_pmu_nop_txn
swapper 0 [000] 3377.917072: 1 branches: ([kernel.kallsyms]) event_sched_in.isra.107
(in the kernel case of course it's not very useful, but it's important
with user programs where symbols are not unique)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180918123214.26728-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7aa59696e97a..7732346bd9dd 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1115,6 +1115,7 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
const char *name = NULL;
static int spacing;
int len = 0;
+ int dlen = 0;
u64 ip = 0;
/*
@@ -1141,6 +1142,12 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
ip = sample->ip;
}
+ if (PRINT_FIELD(DSO) && !(PRINT_FIELD(IP) || PRINT_FIELD(ADDR))) {
+ dlen += fprintf(fp, "(");
+ dlen += map__fprintf_dsoname(al->map, fp);
+ dlen += fprintf(fp, ")\t");
+ }
+
if (name)
len = fprintf(fp, "%*s%s", (int)depth * 4, "", name);
else if (ip)
@@ -1159,7 +1166,7 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
if (len < spacing)
len += fprintf(fp, "%*s", spacing - len, "");
- return len;
+ return len + dlen;
}
static int perf_sample__fprintf_insn(struct perf_sample *sample,
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Make perf script easier to use for itrace
@ 2018-09-20 18:05 Andi Kleen
0 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-20 18:05 UTC (permalink / raw)
To: acme; +Cc: jolsa, linux-kernel
Implement a range of improvements to make it easier to look
at itrace traces with perf script. Nothing here couldn't be done
before with some additional scripting, but add simple high
level options to make it easier to use.
% perf record -e intel_pt//k -a sleep 1
Show function calls:
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
Show function calls and returns:
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
Show instruction traces (using XED):
% perf script --insn-trace --xed
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
Filter by a ftrace style graph function:
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-7
v1: Initial post
v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
v3: Merge wildcard+remove addr/sym patch for one logical review unit.
Drop typed remove patch
Fix build failure on ARM ETM builds.
Fix git branch number.
v4: Remove debug printout.
Really fix ETM build failure (Kim Phillips)
v5: Split out --xed into separate option to handle other architectures better,
and support brstackinsn too. Add total cycles printing for brstackinsn
v6: Document xed build. Change code formatting for one patch.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Make perf script easier to use for itrace
@ 2018-09-14 3:10 Andi Kleen
0 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-14 3:10 UTC (permalink / raw)
To: acme; +Cc: jolsa, kim.phillips, linux-kernel
Implement a range of improvements to make it easier to look
at itrace traces with perf script. Nothing here couldn't be done
before with some additional scripting, but add simple high
level options to make it easier to use.
% perf record -e intel_pt//k -a sleep 1
Show function calls:
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
Show function calls and returns:
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
Show instruction traces (using XED):
% perf script --insn-trace
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
Filter by a ftrace style graph function:
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-5
v1: Initial post
v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
v3: Merge wildcard+remove addr/sym patch for one logical review unit.
Drop typed remove patch
Fix build failure on ARM ETM builds.
Fix git branch number.
v4: Remove debug printout.
Really fix ETM build failure (Kim Phillips)
^ permalink raw reply [flat|nested] 30+ messages in thread
* Make perf script easier to use for itrace
@ 2018-09-03 16:22 Andi Kleen
0 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-03 16:22 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, kim.phillips, linux-kernel
Implement a range of improvements to make it easier to look
at itrace traces with perf script. Nothing here couldn't be done
before with some additional scripting, but add simple high
level options to make it easier to use.
% perf record -e intel_pt//k -a sleep 1
Show function calls:
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
Show function calls and returns:
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
Show instruction traces (using XED):
% perf script --insn-trace
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
Filter by a ftrace style graph function:
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-4
v1: Initial post
v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
v3: Merge wildcard+remove addr/sym patch for one logical review unit.
Drop typed remove patch
Fix build failure on ARM ETM builds.
Fix git branch number.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Make perf script easier to use for itrace
2018-08-31 22:54 ` Kim Phillips
2018-09-03 7:31 ` Jiri Olsa
@ 2018-09-03 15:50 ` Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2018-09-03 15:50 UTC (permalink / raw)
To: Kim Phillips; +Cc: Andi Kleen, acme, jolsa, linux-kernel
> (Ubuntu 18.04):
>
> builtin-script.c: In function ‘parse_output_fields’:
> builtin-script.c:2571:22: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].fields &= ~all_output_options[i].field;
> ^~
> builtin-script.c:2571:22: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].fields &= ~all_output_options[i].field;
> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> builtin-script.c:2574:10: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].user_set = true;
> ~~~~~~^~~
> builtin-script.c:2575:10: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].wildcard_set = true;
> ~~~~~~^~~
These were fixed earlier in -3.
> ...
> util/cs-etm.c: In function ‘cs_etm__process_auxtrace_info’:
> util/cs-etm.c:1436:14: error: ‘struct perf_session’ has no member named ‘itrace_synth’; did you mean ‘itrace_synth_opts’?
> session->itrace_synth.default_no_sample);
> ^~~~~~~~~~~~
> itrace_synth_opts
For some reason my x86 targetted build doesn't seem to cover this file, so
I missed it. What do I need to do to enable it?
-Andi
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Make perf script easier to use for itrace
2018-08-31 22:01 Andi Kleen
2018-08-31 22:54 ` Kim Phillips
@ 2018-09-03 7:52 ` Jiri Olsa
1 sibling, 0 replies; 30+ messages in thread
From: Jiri Olsa @ 2018-09-03 7:52 UTC (permalink / raw)
To: Andi Kleen; +Cc: acme, jolsa, linux-kernel
On Fri, Aug 31, 2018 at 03:01:56PM -0700, Andi Kleen wrote:
SNIP
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
>
>
> Also available in
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-2
>
> v1: Initial post
> v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
with those changelogs updates:
Acked-by: Jiri Olsa <jolsa@kernel.org>
thanks,
jirka
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Make perf script easier to use for itrace
2018-08-31 22:54 ` Kim Phillips
@ 2018-09-03 7:31 ` Jiri Olsa
2018-09-03 15:50 ` Andi Kleen
1 sibling, 0 replies; 30+ messages in thread
From: Jiri Olsa @ 2018-09-03 7:31 UTC (permalink / raw)
To: Kim Phillips; +Cc: Andi Kleen, acme, jolsa, linux-kernel
On Fri, Aug 31, 2018 at 05:54:53PM -0500, Kim Phillips wrote:
> On Fri, 31 Aug 2018 15:01:56 -0700
> Andi Kleen <andi@firstfloor.org> wrote:
>
> > Also available in
> > git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-2
> >
> > v1: Initial post
> > v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
>
> OK, I grabbed the pt/easy-script-2 branch (commit 4c69fd388e17), and
> see build failures for a 'make -C perf' build using gcc version 7.3.0
> (Ubuntu 18.04):
>
> builtin-script.c: In function ‘parse_output_fields’:
> builtin-script.c:2571:22: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].fields &= ~all_output_options[i].field;
> ^~
> builtin-script.c:2571:22: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].fields &= ~all_output_options[i].field;
> ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> builtin-script.c:2574:10: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].user_set = true;
> ~~~~~~^~~
> builtin-script.c:2575:10: error: array subscript is above array bounds [-Werror=array-bounds]
> output[j].wildcard_set = true;
> ~~~~~~^~~
> ...
> util/cs-etm.c: In function ‘cs_etm__process_auxtrace_info’:
> util/cs-etm.c:1436:14: error: ‘struct perf_session’ has no member named ‘itrace_synth’; did you mean ‘itrace_synth_opts’?
> session->itrace_synth.default_no_sample);
> ^~~~~~~~~~~~
> itrace_synth_opts
I think Andi has the new changes in pt/easy-script-3 branch,
just did not update the changelog
jirka
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Make perf script easier to use for itrace
2018-08-31 22:01 Andi Kleen
@ 2018-08-31 22:54 ` Kim Phillips
2018-09-03 7:31 ` Jiri Olsa
2018-09-03 15:50 ` Andi Kleen
2018-09-03 7:52 ` Jiri Olsa
1 sibling, 2 replies; 30+ messages in thread
From: Kim Phillips @ 2018-08-31 22:54 UTC (permalink / raw)
To: Andi Kleen; +Cc: acme, jolsa, linux-kernel
On Fri, 31 Aug 2018 15:01:56 -0700
Andi Kleen <andi@firstfloor.org> wrote:
> Also available in
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-2
>
> v1: Initial post
> v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
OK, I grabbed the pt/easy-script-2 branch (commit 4c69fd388e17), and
see build failures for a 'make -C perf' build using gcc version 7.3.0
(Ubuntu 18.04):
builtin-script.c: In function ‘parse_output_fields’:
builtin-script.c:2571:22: error: array subscript is above array bounds [-Werror=array-bounds]
output[j].fields &= ~all_output_options[i].field;
^~
builtin-script.c:2571:22: error: array subscript is above array bounds [-Werror=array-bounds]
output[j].fields &= ~all_output_options[i].field;
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
builtin-script.c:2574:10: error: array subscript is above array bounds [-Werror=array-bounds]
output[j].user_set = true;
~~~~~~^~~
builtin-script.c:2575:10: error: array subscript is above array bounds [-Werror=array-bounds]
output[j].wildcard_set = true;
~~~~~~^~~
...
util/cs-etm.c: In function ‘cs_etm__process_auxtrace_info’:
util/cs-etm.c:1436:14: error: ‘struct perf_session’ has no member named ‘itrace_synth’; did you mean ‘itrace_synth_opts’?
session->itrace_synth.default_no_sample);
^~~~~~~~~~~~
itrace_synth_opts
Thanks,
Kim
p.s. FWIW, my make VF=1 output:
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... backtrace: [ on ]
... fortify-source: [ on ]
... sync-compare-and-swap: [ on ]
... gtk2-infobar: [ on ]
... libelf-getphdrnum: [ on ]
... libelf-gelf_getnote: [ on ]
... libelf-getshdrstrndx: [ on ]
... libelf-mmap: [ on ]
... libpython-version: [ on ]
... libunwind-x86: [ OFF ]
... libunwind-x86_64: [ on ]
... libunwind-arm: [ OFF ]
... libunwind-aarch64: [ OFF ]
... pthread-attr-setaffinity-np: [ on ]
... pthread-barrier: [ on ]
... stackprotector-all: [ on ]
... timerfd: [ on ]
... sched_getcpu: [ on ]
... sdt: [ on ]
... setns: [ on ]
... libopencsd: [ on ]
... bionic: [ OFF ]
... compile-32: [ OFF ]
... compile-x32: [ OFF ]
... cplus-demangle: [ on ]
... hello: [ OFF ]
... libbabeltrace: [ on ]
... liberty: [ on ]
... liberty-z: [ on ]
... libunwind-debug-frame: [ OFF ]
... libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
... cxx: [ OFF ]
... llvm: [ OFF ]
... llvm-version: [ OFF ]
... clang: [ OFF ]
... prefix: /home/kim
... bindir: /home/kim/bin
... libdir: /home/kim/lib64
... sysconfdir: /home/kim/etc
... LIBUNWIND_DIR:
... LIBDW_DIR:
... JDIR: /usr/lib/jvm/java-1.8.0-openjdk-amd64
... DWARF post unwind library: libunwind
^ permalink raw reply [flat|nested] 30+ messages in thread
* Make perf script easier to use for itrace
@ 2018-08-31 22:01 Andi Kleen
2018-08-31 22:54 ` Kim Phillips
2018-09-03 7:52 ` Jiri Olsa
0 siblings, 2 replies; 30+ messages in thread
From: Andi Kleen @ 2018-08-31 22:01 UTC (permalink / raw)
To: acme; +Cc: jolsa, linux-kernel
Implement a range of improvements to make it easier to look
at itrace traces with perf script. Nothing here couldn't be done
before with some additional scripting, but add simple high
level options to make it easier to use.
% perf record -e intel_pt//k -a sleep 1
Show function calls:
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
Show function calls and returns:
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
Show instruction traces (using XED):
% perf script --insn-trace
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
Filter by a ftrace style graph function:
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-2
v1: Initial post
v2: Address review comments. Minor fixes to descriptions. Now builds everywhere.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: Make perf script easier to use for itrace
2018-08-29 17:18 Andi Kleen
@ 2018-08-30 8:42 ` Jiri Olsa
0 siblings, 0 replies; 30+ messages in thread
From: Jiri Olsa @ 2018-08-30 8:42 UTC (permalink / raw)
To: Andi Kleen; +Cc: acme, jolsa, adrian.hunter, linux-kernel
On Wed, Aug 29, 2018 at 10:18:24AM -0700, Andi Kleen wrote:
SNIP
>
> Filter by a ftrace style graph function:
>
> % perf script --graph-function group_sched_in --call-trace
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
> perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
>
>
> Also available in
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-2
got compile error over your branch, might be one of the new fancy gcc8 checksm not sure
gcc version 8.1.1 20180712 (Red Hat 8.1.1-5) (GCC)
jirka
---
[jolsa@krava perf]$ make JOBS=1
BUILD: Doing 'make -j1' parallel build
CC builtin-script.o
builtin-script.c: In function ‘parse_output_fields’:
builtin-script.c:2571:22: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].fields &= ~all_output_options[i].field;
^~
builtin-script.c:2571:22: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].fields &= ~all_output_options[i].field;
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
builtin-script.c:2574:10: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].user_set = true;
~~~~~~^~~
builtin-script.c:2575:10: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].wildcard_set = true;
~~~~~~^~~
builtin-script.c: In function ‘parse_output_fields.constprop’:
builtin-script.c:2571:22: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].fields &= ~all_output_options[i].field;
^~
builtin-script.c:2571:22: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].fields &= ~all_output_options[i].field;
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
builtin-script.c:2574:10: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].user_set = true;
~~~~~~^~~
builtin-script.c:2575:10: error: array subscript 7 is above array bounds of ‘struct <anonymous>[7]’ [-Werror=array-bounds]
output[j].wildcard_set = true;
~~~~~~^~~
cc1: all warnings being treated as errors
mv: cannot stat './.builtin-script.o.tmp': No such file or directory
make[3]: *** [/home/jolsa/kernel/linux-perf/tools/build/Makefile.build:97: builtin-script.o] Error 1
make[2]: *** [Makefile.perf:507: perf-in.o] Error 2
make[1]: *** [Makefile.perf:206: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
^ permalink raw reply [flat|nested] 30+ messages in thread
* Make perf script easier to use for itrace
@ 2018-08-29 17:18 Andi Kleen
2018-08-30 8:42 ` Jiri Olsa
0 siblings, 1 reply; 30+ messages in thread
From: Andi Kleen @ 2018-08-29 17:18 UTC (permalink / raw)
To: acme; +Cc: jolsa, adrian.hunter, linux-kernel
Implement a range of improvements to make it easier to look
at itrace traces with perf script. Nothing here couldn't be done
before with some additional scripting, but add simple high
level options to make it easier to use.
% perf record -e intel_pt//k -a sleep 1
Show function calls:
% perf script --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
Show function calls and returns:
% perf script --call-ret-trace
perf 900 [000] 194167.205652203: tr strt ([unknown]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_config
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) pt_event_add
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_void
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) perf_pmu_nop_int
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: return ([kernel.kallsyms]) event_filter_match
perf 900 [000] 194167.205652203: call ([kernel.kallsyms]) group_sched_in
Show instruction traces (using XED):
% perf script --itrace=i0ns --ns -F +insn,-event,-period | xed -F insn: -A -64
swapper 0 [000] 117276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms]) nopl %eax, (%rax,%rax,1)
swapper 0 [000] 117276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms]) add $0x10, %rsp
swapper 0 [000] 117276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms]) popq %rbx
swapper 0 [000] 117276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms]) popq %rbp
swapper 0 [000] 117276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms]) popq %r12
swapper 0 [000] 117276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms]) popq %r13
swapper 0 [000] 117276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms]) popq %r14
swapper 0 [000] 117276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms]) popq %r15
swapper 0 [000] 117276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms]) retq
Filter by a ftrace style graph function:
% perf script --graph-function group_sched_in --call-trace
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) group_sched_in
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) event_sched_in.isra.107
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_set_state.part.71
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_time
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_disable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_log_itrace_start
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_event_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) calc_timer_values
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_cpu
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) arch_perf_update_userpage
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __fentry__
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) using_native_sched_clock
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) sched_clock_stable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) perf_pmu_enable
perf 900 [000] 194167.205652203: ([kernel.kallsyms]) __x86_indirect_thunk_rax
Also available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git pt/easy-script-2
v1: Initial post
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2018-09-26 8:47 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-18 12:32 Make perf script easier to use for itrace Andi Kleen
2018-09-18 12:32 ` [PATCH v5 1/9] perf tools: Report itrace options in help Andi Kleen
2018-09-19 18:13 ` Arnaldo Carvalho de Melo
2018-09-18 12:32 ` [PATCH v5 2/9] tools, pager: Support overwriting the pager Andi Kleen
2018-09-19 18:17 ` Arnaldo Carvalho de Melo
2018-09-26 8:44 ` [tip:perf/core] tools lib subcmd: " tip-bot for Andi Kleen
2018-09-18 12:32 ` [PATCH v5 3/9] tools, perf, script: Add --insn-trace for instruction decoding Andi Kleen
2018-09-18 12:32 ` [PATCH v5 4/9] perf, tools, script: Allow sym and dso without ip, addr Andi Kleen
2018-09-19 18:24 ` Arnaldo Carvalho de Melo
2018-09-26 8:44 ` [tip:perf/core] perf " tip-bot for Andi Kleen
2018-09-18 12:32 ` [PATCH v5 5/9] perf, tools, script: Print DSO for callindent Andi Kleen
2018-09-19 18:26 ` Arnaldo Carvalho de Melo
2018-09-26 8:45 ` [tip:perf/core] perf " tip-bot for Andi Kleen
2018-09-18 12:32 ` [PATCH v5 6/9] perf, tools, script: Make itrace script default to all calls Andi Kleen
2018-09-18 13:24 ` Adrian Hunter
2018-09-19 18:28 ` Arnaldo Carvalho de Melo
2018-09-18 12:32 ` [PATCH v5 7/9] tools, perf, script: Add --call-trace and --call-ret-trace Andi Kleen
2018-09-18 12:32 ` [PATCH v5 8/9] tools, perf, script: Implement --graph-function Andi Kleen
2018-09-18 12:32 ` [PATCH v5 9/9] perf, tools, script: Support total cycles count Andi Kleen
2018-09-19 18:31 ` Arnaldo Carvalho de Melo
-- strict thread matches above, loose matches on Subject: below --
2018-09-20 18:05 Make perf script easier to use for itrace Andi Kleen
2018-09-14 3:10 Andi Kleen
2018-09-03 16:22 Andi Kleen
2018-08-31 22:01 Andi Kleen
2018-08-31 22:54 ` Kim Phillips
2018-09-03 7:31 ` Jiri Olsa
2018-09-03 15:50 ` Andi Kleen
2018-09-03 7:52 ` Jiri Olsa
2018-08-29 17:18 Andi Kleen
2018-08-30 8:42 ` Jiri Olsa
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.