* [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
@ 2018-09-20 13:00 Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
` (7 more replies)
0 siblings, 8 replies; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Hi
Here is V2 of some Intel PT patches to improve the data displayed when using
address filters.
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. That happens when using address filters, for example:
$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]
Before:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: call 4015b9 main+0x29 => 0 [unknown]
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: call 4015c8 main+0x38 => 0 [unknown]
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: call 4015d7 main+0x47 => 0 [unknown]
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: call 4015e1 main+0x51 => 0 [unknown]
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: call 4015eb main+0x5b => 0 [unknown]
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: call 401612 main+0x82 => 0 [unknown]
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: call 401847 main+0x2b7 => 0 [unknown]
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: call 4019bf main+0x42f => 0 [unknown]
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: call 4019f5 main+0x465 => 0 [unknown]
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: call 401832 main+0x2a2 => 0 [unknown]
After:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0
Changes in V2:
Improve commit messages
Adrian Hunter (6):
perf script: Enhance sample flags for trace begin / end
perf db-export: Add trace begin / end branch type variants
perf tools: Improve thread_stack__event() for trace begin / end
perf tools: Improve thread_stack__process() for trace begin / end
perf intel-pt: Add decoder flags for trace begin / end
perf intel-pt: Implement decoder flags for trace begin / end
tools/perf/builtin-script.c | 36 +++++++++++----
tools/perf/util/db-export.c | 22 ++++++++++
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 34 ++++++++++-----
.../perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 +
tools/perf/util/intel-pt.c | 5 +++
tools/perf/util/thread-stack.c | 51 +++++++++++++++++-----
6 files changed, 118 insertions(+), 32 deletions(-)
Regards
Adrian
* [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
2018-09-26 8:53 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
` (6 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Allow for different combinations of sample flags with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, prepare 'perf script' to display
sample flags with more combinations that include trace begin / end. In
those cases display 'tr strt' and 'tr end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
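The lookup order described above can be sketched in isolation. This is a minimal illustration, not the perf code: the DEMO_FLAG_* values and the shortened name table are assumptions standing in for the PERF_IP_FLAG_* bits and sample_flags[], and demo_format_flags() is a hypothetical helper that writes to a buffer instead of a FILE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the PERF_IP_FLAG_* bits; the values are assumed for this sketch. */
#define DEMO_FLAG_BRANCH      (1u << 0)
#define DEMO_FLAG_CALL        (1u << 1)
#define DEMO_FLAG_RETURN      (1u << 2)
#define DEMO_FLAG_TRACE_BEGIN (1u << 8)
#define DEMO_FLAG_TRACE_END   (1u << 9)
#define DEMO_FLAG_IN_TX       (1u << 10)

/* Shortened name table, terminated by a NULL name. */
static const struct {
	uint32_t flags;
	const char *name;
} demo_flag_names[] = {
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_CALL, "call" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_RETURN, "return" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_BEGIN, "tr strt" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_END, "tr end" },
	{ 0, NULL }
};

/* Exact-match lookup: returns NULL when no table entry equals flags. */
static const char *demo_flags_to_name(uint32_t flags)
{
	for (int i = 0; demo_flag_names[i].name; i++) {
		if (demo_flag_names[i].flags == flags)
			return demo_flag_names[i].name;
	}
	return NULL;
}

/*
 * Try an exact match first, then retry with the trace begin / end bit
 * masked off and print that bit as a separate prefix.
 * Returns the snprintf() result, or -1 if nothing matched.
 */
static int demo_format_flags(uint32_t flags, char *buf, size_t len)
{
	const char *name;

	assert(buf != NULL && len > 0);

	flags &= ~DEMO_FLAG_IN_TX;

	name = demo_flags_to_name(flags);
	if (name)
		return snprintf(buf, len, "%s", name);

	if (flags & DEMO_FLAG_TRACE_BEGIN) {
		name = demo_flags_to_name(flags & ~DEMO_FLAG_TRACE_BEGIN);
		if (name)
			return snprintf(buf, len, "tr strt %s", name);
	}

	if (flags & DEMO_FLAG_TRACE_END) {
		name = demo_flags_to_name(flags & ~DEMO_FLAG_TRACE_END);
		if (name)
			return snprintf(buf, len, "tr end %s", name);
	}

	return -1;
}
```

With these assumed values, a call that also ends the trace formats as "tr end call", while a plain trace begin still matches the table directly and formats as "tr strt".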
tools/perf/builtin-script.c | 36 +++++++++++++++++++++++++++---------
1 file changed, 27 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6176bae177c2..4982380ba96d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1255,6 +1255,18 @@ static struct {
{0, NULL}
};
+static const char *sample_flags_to_name(u32 flags)
+{
+ int i;
+
+ for (i = 0; sample_flags[i].name ; i++) {
+ if (sample_flags[i].flags == flags)
+ return sample_flags[i].name;
+ }
+
+ return NULL;
+}
+
static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
{
const char *chars = PERF_IP_FLAG_CHARS;
@@ -1264,11 +1276,20 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
char str[33];
int i, pos = 0;
- for (i = 0; sample_flags[i].name ; i++) {
- if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
- name = sample_flags[i].name;
- break;
- }
+ name = sample_flags_to_name(flags & ~PERF_IP_FLAG_IN_TX);
+ if (name)
+ return fprintf(fp, " %-15s%4s ", name, in_tx ? "(x)" : "");
+
+ if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_BEGIN));
+ if (name)
+ return fprintf(fp, " tr strt %-7s%4s ", name, in_tx ? "(x)" : "");
+ }
+
+ if (flags & PERF_IP_FLAG_TRACE_END) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_END));
+ if (name)
+ return fprintf(fp, " tr end %-7s%4s ", name, in_tx ? "(x)" : "");
}
for (i = 0; i < n; i++, flags >>= 1) {
@@ -1281,10 +1302,7 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
}
str[pos] = 0;
- if (name)
- return fprintf(fp, " %-7s%4s ", name, in_tx ? "(x)" : "");
-
- return fprintf(fp, " %-11s ", str);
+ return fprintf(fp, " %-19s ", str);
}
struct printer_data {
--
2.17.1
* [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
2018-09-26 8:54 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
` (5 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Add branch types to cover different combinations with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, prepare the database export to
export branch types with more combinations that include trace begin / end.
In those cases extend the descriptions to include 'trace begin' and
'trace end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
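As a rough illustration of the naming scheme described above, the following sketch builds the two variant descriptions for one base branch type and applies the same skip rule as the new loop. The DEMO_FLAG_* values and both helper names are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the PERF_IP_FLAG_* bits; the values are assumed for this sketch. */
#define DEMO_FLAG_BRANCH      (1u << 0)
#define DEMO_FLAG_CALL        (1u << 1)
#define DEMO_FLAG_TRACE_BEGIN (1u << 8)
#define DEMO_FLAG_TRACE_END   (1u << 9)

/*
 * A base type gets variants unless it is the bare branch type or it
 * already is a trace begin / end type.
 */
static bool demo_needs_variants(uint32_t type)
{
	if (type == DEMO_FLAG_BRANCH)
		return false;
	return !(type & (DEMO_FLAG_TRACE_BEGIN | DEMO_FLAG_TRACE_END));
}

/* Build "trace begin / <name>" and "<name> / trace end" for one base name. */
static void demo_variant_names(const char *name,
			       char *begin, size_t begin_len,
			       char *end, size_t end_len)
{
	assert(name && begin && end);

	snprintf(begin, begin_len, "trace begin / %s", name);
	snprintf(end, end_len, "%s / trace end", name);
}
```

For the base name "call" this yields "trace begin / call" and "call / trace end", each of which would be exported with the corresponding trace begin / end bit OR-ed into the branch type.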
tools/perf/util/db-export.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 7123746edcf4..69fbb0a72d0c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -463,6 +463,28 @@ int db_export__branch_types(struct db_export *dbe)
if (err)
break;
}
+
+ /* Add trace begin / end variants */
+ for (i = 0; branch_types[i].name ; i++) {
+ const char *name = branch_types[i].name;
+ u32 type = branch_types[i].branch_type;
+ char buf[64];
+
+ if (type == PERF_IP_FLAG_BRANCH ||
+ (type & (PERF_IP_FLAG_TRACE_BEGIN | PERF_IP_FLAG_TRACE_END)))
+ continue;
+
+ snprintf(buf, sizeof(buf), "trace begin / %s", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_BEGIN, buf);
+ if (err)
+ break;
+
+ snprintf(buf, sizeof(buf), "%s / trace end", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_END, buf);
+ if (err)
+ break;
+ }
+
return err;
}
--
2.17.1
* [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
2018-09-26 8:54 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
` (4 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
thread_stack__event() is used to create call stacks, by keeping track of
calls and returns. Improve the handling of trace begin / end to allow for a
trace that ends in a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, enhance the thread stack so that
it does not expect to see the 'return' for a 'call' that ends the trace.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
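The stack behaviour described above can be tried out with a small stand-alone model. This is only a sketch: the fixed-size demo_stack and the demo_* helpers are hypothetical simplifications of the thread stack, kept just detailed enough to show how a call that ended the trace is dropped when tracing begins again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STACK_MAX 16

struct demo_entry {
	uint64_t ret_addr;
	bool trace_end;		/* pushed by a call that also ended the trace */
};

struct demo_stack {
	struct demo_entry stack[DEMO_STACK_MAX];
	size_t cnt;
};

static int demo_push(struct demo_stack *ts, uint64_t ret_addr, bool trace_end)
{
	assert(ts != NULL);

	if (ts->cnt >= DEMO_STACK_MAX)
		return -1;
	ts->stack[ts->cnt].trace_end = trace_end;
	ts->stack[ts->cnt++].ret_addr = ret_addr;
	return 0;
}

/* Pop down to the nearest entry returning to ret_addr; no match leaves the stack alone. */
static void demo_pop(struct demo_stack *ts, uint64_t ret_addr)
{
	for (size_t i = ts->cnt; i; ) {
		if (ts->stack[--i].ret_addr == ret_addr) {
			ts->cnt = i;
			return;
		}
	}
}

/* Drop the run of trace-end entries at the top of the stack. */
static void demo_pop_trace_end(struct demo_stack *ts)
{
	for (size_t i = ts->cnt; i; ) {
		if (ts->stack[--i].trace_end)
			ts->cnt = i;
		else
			return;
	}
}

/* On trace begin: pop a matching return address, then any trace-end calls. */
static void demo_trace_begin(struct demo_stack *ts, uint64_t to_ip)
{
	demo_pop(ts, to_ip);
	demo_pop_trace_end(ts);
}
```

In this model, a call pushed with trace_end set is removed at the next trace begin whether or not tracing resumes at its return address, so the stack does not keep waiting for a 'return' that was never traced.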
tools/perf/util/thread-stack.c | 35 +++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index dd17d6a38d3a..cea28b9074c1 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -36,6 +36,7 @@
* @branch_count: the branch count when the entry was created
* @cp: call path
* @no_call: a 'call' was not seen
+ * @trace_end: a 'call' but trace ended
*/
struct thread_stack_entry {
u64 ret_addr;
@@ -44,6 +45,7 @@ struct thread_stack_entry {
u64 branch_count;
struct call_path *cp;
bool no_call;
+ bool trace_end;
};
/**
@@ -112,7 +114,8 @@ static struct thread_stack *thread_stack__new(struct thread *thread,
return ts;
}
-static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
+static int thread_stack__push(struct thread_stack *ts, u64 ret_addr,
+ bool trace_end)
{
int err = 0;
@@ -124,6 +127,7 @@ static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
}
}
+ ts->stack[ts->cnt].trace_end = trace_end;
ts->stack[ts->cnt++].ret_addr = ret_addr;
return err;
@@ -150,6 +154,18 @@ static void thread_stack__pop(struct thread_stack *ts, u64 ret_addr)
}
}
+static void thread_stack__pop_trace_end(struct thread_stack *ts)
+{
+ size_t i;
+
+ for (i = ts->cnt; i; ) {
+ if (ts->stack[--i].trace_end)
+ ts->cnt = i;
+ else
+ return;
+ }
+}
+
static bool thread_stack__in_kernel(struct thread_stack *ts)
{
if (!ts->cnt)
@@ -254,10 +270,19 @@ int thread_stack__event(struct thread *thread, u32 flags, u64 from_ip,
ret_addr = from_ip + insn_len;
if (ret_addr == to_ip)
return 0; /* Zero-length calls are excluded */
- return thread_stack__push(thread->ts, ret_addr);
- } else if (flags & PERF_IP_FLAG_RETURN) {
- if (!from_ip)
- return 0;
+ return thread_stack__push(thread->ts, ret_addr,
+ flags & PERF_IP_FLAG_TRACE_END);
+ } else if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ /*
+ * If the caller did not change the trace number (which would
+ * have flushed the stack) then try to make sense of the stack.
+ * Possibly, tracing began after returning to the current
+ * address, so try to pop that. Also, do not expect a call made
+ * when the trace ended, to return, so pop that.
+ */
+ thread_stack__pop(thread->ts, to_ip);
+ thread_stack__pop_trace_end(thread->ts);
+ } else if ((flags & PERF_IP_FLAG_RETURN) && from_ip) {
thread_stack__pop(thread->ts, to_ip);
}
--
2.17.1
* [PATCH V2 4/6] perf tools: Improve thread_stack__process() for trace begin / end
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
` (2 preceding siblings ...)
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
2018-09-26 8:55 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
` (3 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
thread_stack__process() is used to create call paths for database export.
Improve the handling of trace begin / end to allow for a trace that ends in
a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, enhance the thread stack so that
it identifies the trace end by the flag instead of by ip == 0.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
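To show why the check moves from ip == 0 to a flag, here is a small sketch comparing the two tests on a stack entry whose call path has a real target. The demo_* types and helpers are hypothetical and model only the fields involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced call path: just the target address and its symbol name. */
struct demo_call_path {
	uint64_t ip;
	const char *sym;
};

struct demo_entry {
	struct demo_call_path *cp;
	bool trace_end;		/* set when the call also ended the trace */
};

/* Old test: a trace end is recognised only as a call to address zero. */
static bool demo_is_trace_end_by_ip(const struct demo_entry *tse)
{
	assert(tse && tse->cp);
	return tse->cp->sym == NULL && tse->cp->ip == 0;
}

/* New test: the entry carries an explicit flag. */
static bool demo_is_trace_end_by_flag(const struct demo_entry *tse)
{
	assert(tse != NULL);
	return tse->trace_end;
}
```

Once the decoder reports the real call target (for example set_program_name at 401ef0 in the cover letter output), the address-based test no longer fires, while the flag-based test still identifies the entry.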
tools/perf/util/thread-stack.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index cea28b9074c1..45a97d15c6c8 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -357,7 +357,7 @@ void call_return_processor__free(struct call_return_processor *crp)
static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
u64 timestamp, u64 ref, struct call_path *cp,
- bool no_call)
+ bool no_call, bool trace_end)
{
struct thread_stack_entry *tse;
int err;
@@ -375,6 +375,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
tse->branch_count = ts->branch_count;
tse->cp = cp;
tse->no_call = no_call;
+ tse->trace_end = trace_end;
return 0;
}
@@ -448,7 +449,7 @@ static int thread_stack__bottom(struct thread *thread, struct thread_stack *ts,
return -ENOMEM;
return thread_stack__push_cp(thread->ts, ip, sample->time, ref, cp,
- true);
+ true, false);
}
static int thread_stack__no_call_return(struct thread *thread,
@@ -480,7 +481,7 @@ static int thread_stack__no_call_return(struct thread *thread,
if (!cp)
return -ENOMEM;
return thread_stack__push_cp(ts, 0, sample->time, ref,
- cp, true);
+ cp, true, false);
}
} else if (thread_stack__in_kernel(ts) && sample->ip < ks) {
/* Return to userspace, so pop all kernel addresses */
@@ -505,7 +506,7 @@ static int thread_stack__no_call_return(struct thread *thread,
return -ENOMEM;
err = thread_stack__push_cp(ts, sample->addr, sample->time, ref, cp,
- true);
+ true, false);
if (err)
return err;
@@ -525,7 +526,7 @@ static int thread_stack__trace_begin(struct thread *thread,
/* Pop trace end */
tse = &ts->stack[ts->cnt - 1];
- if (tse->cp->sym == NULL && tse->cp->ip == 0) {
+ if (tse->trace_end) {
err = thread_stack__call_return(thread, ts, --ts->cnt,
timestamp, ref, false);
if (err)
@@ -554,7 +555,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
ret_addr = sample->ip + sample->insn_len;
return thread_stack__push_cp(ts, ret_addr, sample->time, ref, cp,
- false);
+ false, true);
}
int thread_stack__process(struct thread *thread, struct comm *comm,
@@ -604,6 +605,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
ts->last_time = sample->time;
if (sample->flags & PERF_IP_FLAG_CALL) {
+ bool trace_end = sample->flags & PERF_IP_FLAG_TRACE_END;
struct call_path_root *cpr = ts->crp->cpr;
struct call_path *cp;
u64 ret_addr;
@@ -621,7 +623,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
if (!cp)
return -ENOMEM;
err = thread_stack__push_cp(ts, ret_addr, sample->time, ref,
- cp, false);
+ cp, false, trace_end);
} else if (sample->flags & PERF_IP_FLAG_RETURN) {
if (!sample->ip || !sample->addr)
return 0;
--
2.17.1
* [PATCH V2 5/6] perf intel-pt: Add decoder flags for trace begin / end
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
` (3 preceding siblings ...)
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
2018-09-26 8:56 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
` (2 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. To prepare for remedying that, add Intel PT decoder flags for trace
begin / end and map them to the existing sample flags.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 ++
tools/perf/util/intel-pt.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 51c18d67f4ca..ed088d4726ba 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -37,6 +37,8 @@ enum intel_pt_sample_type {
INTEL_PT_EX_STOP = 1 << 6,
INTEL_PT_PWR_EXIT = 1 << 7,
INTEL_PT_CBR_CHG = 1 << 8,
+ INTEL_PT_TRACE_BEGIN = 1 << 9,
+ INTEL_PT_TRACE_END = 1 << 10,
};
enum intel_pt_period_type {
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..48c1d415c6b0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -908,6 +908,11 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
ptq->insn_len = ptq->state->insn_len;
memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
}
+
+ if (ptq->state->type & INTEL_PT_TRACE_BEGIN)
+ ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN;
+ if (ptq->state->type & INTEL_PT_TRACE_END)
+ ptq->flags |= PERF_IP_FLAG_TRACE_END;
}
static int intel_pt_setup_queue(struct intel_pt *pt,
--
2.17.1
* [PATCH V2 6/6] perf intel-pt: Implement decoder flags for trace begin / end
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
` (4 preceding siblings ...)
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
2018-09-26 8:56 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
2018-09-20 14:13 ` Arnaldo Carvalho de Melo
7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC
To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
Have the Intel PT decoder implement the new Intel PT decoder flags for
trace begin / end.
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. That happens when using address filters, for example:
$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]
Before:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: call 4015b9 main+0x29 => 0 [unknown]
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: call 4015c8 main+0x38 => 0 [unknown]
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: call 4015d7 main+0x47 => 0 [unknown]
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: call 4015e1 main+0x51 => 0 [unknown]
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: call 4015eb main+0x5b => 0 [unknown]
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: call 401612 main+0x82 => 0 [unknown]
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: call 401847 main+0x2b7 => 0 [unknown]
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: call 4019bf main+0x42f => 0 [unknown]
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: call 4019f5 main+0x465 => 0 [unknown]
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: call 401832 main+0x2a2 => 0 [unknown]
After:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
.../util/intel-pt-decoder/intel-pt-decoder.c | 34 +++++++++++++------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d404bed7003a..58f6a9ceb590 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1165,7 +1165,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pge = false;
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
- decoder->state.to_ip = 0;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
if (err == INTEL_PT_RETURN)
@@ -1179,9 +1179,13 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0)
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
+ decoder->state.to_ip = decoder->last_ip;
decoder->ip = decoder->last_ip;
+ }
+ decoder->state.type |= INTEL_PT_TRACE_END;
} else {
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
@@ -1208,7 +1212,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->ip = to_ip;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
+ decoder->state.to_ip = to_ip;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
@@ -1640,14 +1645,15 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
case INTEL_PT_TIP_PGD:
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0) {
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
intel_pt_set_ip(decoder);
- intel_pt_log("Omitting PGD ip " x64_fmt "\n",
- decoder->ip);
+ decoder->state.to_ip = decoder->ip;
}
decoder->pge = false;
decoder->continuous_period = false;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
case INTEL_PT_TIP_PGE:
@@ -1661,6 +1667,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
intel_pt_set_ip(decoder);
decoder->state.to_ip = decoder->ip;
}
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
case INTEL_PT_TIP:
@@ -1739,6 +1746,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
intel_pt_set_ip(decoder);
decoder->state.from_ip = 0;
decoder->state.to_ip = decoder->ip;
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
}
@@ -2077,9 +2085,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
if (intel_pt_have_ip(decoder))
intel_pt_set_ip(decoder);
- if (decoder->ip)
- return 0;
- break;
+ if (!decoder->ip)
+ break;
+ if (decoder->packet.type == INTEL_PT_TIP_PGE)
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+ if (decoder->packet.type == INTEL_PT_TIP_PGD)
+ decoder->state.type |= INTEL_PT_TRACE_END;
+ return 0;
case INTEL_PT_FUP:
if (intel_pt_have_ip(decoder))
--
2.17.1
* Re: [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
` (5 preceding siblings ...)
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
@ 2018-09-20 13:41 ` Arnaldo Carvalho de Melo
2018-09-20 14:13 ` Arnaldo Carvalho de Melo
7 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-20 13:41 UTC
To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
On Thu, Sep 20, 2018 at 04:00:42PM +0300, Adrian Hunter wrote:
> Here is V2 of some Intel PT patches to improve the data displayed when using
> address filters.
<SNIP>
> Changes in V2:
>
> Improve commit messages
Thanks a lot, this helps a lot,
- Arnaldo
* Re: [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
` (6 preceding siblings ...)
2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
@ 2018-09-20 14:13 ` Arnaldo Carvalho de Melo
7 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-20 14:13 UTC
To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel
On Thu, Sep 20, 2018 at 04:00:42PM +0300, Adrian Hunter wrote:
> Hi
>
> Here is V2 of some Intel PT patches to improve the data displayed when using
> address filters.
>
> Previously, the decoder would indicate begin / end by a branch from / to
> zero. That hides useful information, in particular when a trace ends with a
> call. That happens when using address filters, for example:
>
> $ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
> Linux
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.031 MB perf.data ]
Thanks, applied.
- Arnaldo
* [tip:perf/core] perf script: Enhance sample flags for trace begin / end
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
@ 2018-09-26 8:53 ` tip-bot for Adrian Hunter
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26 8:53 UTC
To: linux-tip-commits
Cc: acme, jolsa, linux-kernel, tglx, ak, adrian.hunter, hpa, mingo
Commit-ID: 62cb1b8868a70c932b15959a98594df537df2ffc
Gitweb: https://git.kernel.org/tip/62cb1b8868a70c932b15959a98594df537df2ffc
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:43 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 11:09:55 -0300
perf script: Enhance sample flags for trace begin / end
Allow for different combinations of sample flags with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, prepare 'perf script' to
display sample flags with more combinations that include trace begin /
end. In those cases display 'tr strt' and 'tr end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-script.c | 36 +++++++++++++++++++++++++++---------
1 file changed, 27 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7732346bd9dd..4da5e32b9e03 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1262,6 +1262,18 @@ static struct {
{0, NULL}
};
+static const char *sample_flags_to_name(u32 flags)
+{
+ int i;
+
+ for (i = 0; sample_flags[i].name ; i++) {
+ if (sample_flags[i].flags == flags)
+ return sample_flags[i].name;
+ }
+
+ return NULL;
+}
+
static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
{
const char *chars = PERF_IP_FLAG_CHARS;
@@ -1271,11 +1283,20 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
char str[33];
int i, pos = 0;
- for (i = 0; sample_flags[i].name ; i++) {
- if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
- name = sample_flags[i].name;
- break;
- }
+ name = sample_flags_to_name(flags & ~PERF_IP_FLAG_IN_TX);
+ if (name)
+ return fprintf(fp, " %-15s%4s ", name, in_tx ? "(x)" : "");
+
+ if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_BEGIN));
+ if (name)
+ return fprintf(fp, " tr strt %-7s%4s ", name, in_tx ? "(x)" : "");
+ }
+
+ if (flags & PERF_IP_FLAG_TRACE_END) {
+ name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_END));
+ if (name)
+ return fprintf(fp, " tr end %-7s%4s ", name, in_tx ? "(x)" : "");
}
for (i = 0; i < n; i++, flags >>= 1) {
@@ -1288,10 +1309,7 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
}
str[pos] = 0;
- if (name)
- return fprintf(fp, " %-7s%4s ", name, in_tx ? "(x)" : "");
-
- return fprintf(fp, " %-11s ", str);
+ return fprintf(fp, " %-19s ", str);
}
struct printer_data {
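For readers following the lookup order in the hunk above, here is a minimal standalone sketch of it. The DEMO_* bit values, the contents of the table and the helper names are assumptions made for illustration (the real sample_flags[] table is not shown in this patch), and the sketch writes the name into a buffer instead of calling fprintf() with the patch's column widths.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit values, stand-ins for the PERF_IP_FLAG_* constants. */
#define DEMO_FLAG_BRANCH      (1u << 0)
#define DEMO_FLAG_CALL        (1u << 1)
#define DEMO_FLAG_RETURN      (1u << 2)
#define DEMO_FLAG_TRACE_BEGIN (1u << 8)
#define DEMO_FLAG_TRACE_END   (1u << 9)
#define DEMO_FLAG_IN_TX       (1u << 10)

/* Assumed subset of a flags-to-name table, terminated by a NULL name. */
static const struct {
	uint32_t flags;
	const char *name;
} demo_sample_flags[] = {
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_CALL, "call" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_RETURN, "return" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_BEGIN, "tr strt" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_END, "tr end" },
	{ 0, NULL }
};

/* Exact-match lookup, mirroring sample_flags_to_name(). */
static const char *demo_flags_to_name(uint32_t flags)
{
	for (int i = 0; demo_sample_flags[i].name; i++) {
		if (demo_sample_flags[i].flags == flags)
			return demo_sample_flags[i].name;
	}
	return NULL;
}

/* Write a display name for flags into buf and return buf. */
static char *demo_describe_flags(uint32_t flags, char *buf, size_t len)
{
	uint32_t base = flags & ~DEMO_FLAG_IN_TX;
	const char *name = demo_flags_to_name(base);

	/* 1. Try the full combination first. */
	if (name) {
		snprintf(buf, len, "%s", name);
		return buf;
	}
	/* 2. Otherwise strip trace begin and prefix the remaining name. */
	if (base & DEMO_FLAG_TRACE_BEGIN) {
		name = demo_flags_to_name(base & ~DEMO_FLAG_TRACE_BEGIN);
		if (name) {
			snprintf(buf, len, "tr strt %s", name);
			return buf;
		}
	}
	/* 3. Likewise for trace end. */
	if (base & DEMO_FLAG_TRACE_END) {
		name = demo_flags_to_name(base & ~DEMO_FLAG_TRACE_END);
		if (name) {
			snprintf(buf, len, "tr end %s", name);
			return buf;
		}
	}
	/* The patch falls back to per-bit characters here; omitted. */
	snprintf(buf, len, "?");
	return buf;
}
```

With these assumed values, a call that also carries the trace end bit is described as "tr end call", which is the form that appears in the 'perf script' output once the decoder sets the flag.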
* [tip:perf/core] perf db-export: Add trace begin / end branch type variants
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
@ 2018-09-26 8:54 ` tip-bot for Adrian Hunter
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26 8:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, acme, tglx, hpa, linux-kernel, adrian.hunter, mingo, ak
Commit-ID: ff645daf30cafb6fa74bee9a73733700bac2aff7
Gitweb: https://git.kernel.org/tip/ff645daf30cafb6fa74bee9a73733700bac2aff7
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:44 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 11:10:25 -0300
perf db-export: Add trace begin / end branch type variants
Add branch types to cover different combinations with "trace begin" or
"trace end".
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, prepare the database
export to export branch types with more combinations that include trace
begin / end. In those cases extend the descriptions to include 'trace
begin' and 'trace end' separately.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/db-export.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 7123746edcf4..69fbb0a72d0c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -463,6 +463,28 @@ int db_export__branch_types(struct db_export *dbe)
if (err)
break;
}
+
+ /* Add trace begin / end variants */
+ for (i = 0; branch_types[i].name ; i++) {
+ const char *name = branch_types[i].name;
+ u32 type = branch_types[i].branch_type;
+ char buf[64];
+
+ if (type == PERF_IP_FLAG_BRANCH ||
+ (type & (PERF_IP_FLAG_TRACE_BEGIN | PERF_IP_FLAG_TRACE_END)))
+ continue;
+
+ snprintf(buf, sizeof(buf), "trace begin / %s", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_BEGIN, buf);
+ if (err)
+ break;
+
+ snprintf(buf, sizeof(buf), "%s / trace end", name);
+ err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_END, buf);
+ if (err)
+ break;
+ }
+
return err;
}
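The naming scheme in the loop above can be tried in isolation. The following is only a sketch: the DEMO_* bit values, struct demo_branch_type and demo_trace_variants() are hypothetical, and the real code hands each variant to db_export__branch_type() instead of filling an array. Plain "branch" and anything that already has a trace begin / end bit are skipped, presumably because those combinations are already covered by the base table.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit values, stand-ins for the PERF_IP_FLAG_* constants. */
#define DEMO_FLAG_BRANCH      (1u << 0)
#define DEMO_FLAG_CALL        (1u << 1)
#define DEMO_FLAG_TRACE_BEGIN (1u << 8)
#define DEMO_FLAG_TRACE_END   (1u << 9)

struct demo_branch_type {
	uint32_t type;
	char name[64];
};

/*
 * Fill out[0] and out[1] with the "trace begin" and "trace end" variants
 * of one base branch type. Returns 0 when the base type is skipped and 2
 * when both variants were written.
 */
static int demo_trace_variants(uint32_t type, const char *name,
			       struct demo_branch_type out[2])
{
	if (type == DEMO_FLAG_BRANCH ||
	    (type & (DEMO_FLAG_TRACE_BEGIN | DEMO_FLAG_TRACE_END)))
		return 0;

	out[0].type = type | DEMO_FLAG_TRACE_BEGIN;
	snprintf(out[0].name, sizeof(out[0].name), "trace begin / %s", name);

	out[1].type = type | DEMO_FLAG_TRACE_END;
	snprintf(out[1].name, sizeof(out[1].name), "%s / trace end", name);

	return 2;
}
```

For a base type named "call" this yields "trace begin / call" and "call / trace end", so the word order hints at whether the trace boundary precedes or follows the branch.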
* [tip:perf/core] perf tools: Improve thread_stack__event() for trace begin / end
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
@ 2018-09-26 8:54 ` tip-bot for Adrian Hunter
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26 8:54 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, adrian.hunter, acme, tglx, ak, linux-kernel, hpa, mingo
Commit-ID: 4d60e5e36aa6f11b4d9eadc5d2b94128f24870c7
Gitweb: https://git.kernel.org/tip/4d60e5e36aa6f11b4d9eadc5d2b94128f24870c7
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:45 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:16:17 -0300
perf tools: Improve thread_stack__event() for trace begin / end
thread_stack__event() is used to create call stacks, by keeping track of
calls and returns. Improve the handling of trace begin / end to allow
for a trace that ends in a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, enhance the thread stack
so that it does not expect to see the 'return' for a 'call' that ends
the trace.
Committer notes:
Added this:
return thread_stack__push(thread->ts, ret_addr,
- flags && PERF_IP_FLAG_TRACE_END);
+ flags & PERF_IP_FLAG_TRACE_END);
To fix problem spotted by:
debian:9: clang version 3.8.1-24 (tags/RELEASE_381/final)
debian:experimental: clang version 6.0.1-6 (tags/RELEASE_601/final)
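As an aside on why the '&&' matters, the small sketch below contrasts the two operators. The DEMO_* values and function names are made up for illustration, and the logical form is spelled out as explicit comparisons so that it compiles without the kind of diagnostic that presumably flagged the original line.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values, not taken from the perf headers. */
#define DEMO_FLAG_CALL      (1u << 1)
#define DEMO_FLAG_TRACE_END (1u << 9)

/*
 * What "flags && FLAG" evaluates to: both operands are only tested for
 * being non-zero, so any set bit in flags makes the result true.
 */
static bool demo_trace_end_logical(uint32_t flags)
{
	return (flags != 0) && (DEMO_FLAG_TRACE_END != 0);
}

/* What "flags & FLAG" evaluates to: true only if that bit is set. */
static bool demo_trace_end_bitwise(uint32_t flags)
{
	return flags & DEMO_FLAG_TRACE_END;
}
```

With the logical form, an ordinary call (no trace end bit) would still be pushed with trace_end set to true, and so would be discarded at the next trace begin.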
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/thread-stack.c | 35 ++++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index dd17d6a38d3a..e3f7dfecafa9 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -36,6 +36,7 @@
* @branch_count: the branch count when the entry was created
* @cp: call path
* @no_call: a 'call' was not seen
+ * @trace_end: a 'call' but trace ended
*/
struct thread_stack_entry {
u64 ret_addr;
@@ -44,6 +45,7 @@ struct thread_stack_entry {
u64 branch_count;
struct call_path *cp;
bool no_call;
+ bool trace_end;
};
/**
@@ -112,7 +114,8 @@ static struct thread_stack *thread_stack__new(struct thread *thread,
return ts;
}
-static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
+static int thread_stack__push(struct thread_stack *ts, u64 ret_addr,
+ bool trace_end)
{
int err = 0;
@@ -124,6 +127,7 @@ static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
}
}
+ ts->stack[ts->cnt].trace_end = trace_end;
ts->stack[ts->cnt++].ret_addr = ret_addr;
return err;
@@ -150,6 +154,18 @@ static void thread_stack__pop(struct thread_stack *ts, u64 ret_addr)
}
}
+static void thread_stack__pop_trace_end(struct thread_stack *ts)
+{
+ size_t i;
+
+ for (i = ts->cnt; i; ) {
+ if (ts->stack[--i].trace_end)
+ ts->cnt = i;
+ else
+ return;
+ }
+}
+
static bool thread_stack__in_kernel(struct thread_stack *ts)
{
if (!ts->cnt)
@@ -254,10 +270,19 @@ int thread_stack__event(struct thread *thread, u32 flags, u64 from_ip,
ret_addr = from_ip + insn_len;
if (ret_addr == to_ip)
return 0; /* Zero-length calls are excluded */
- return thread_stack__push(thread->ts, ret_addr);
- } else if (flags & PERF_IP_FLAG_RETURN) {
- if (!from_ip)
- return 0;
+ return thread_stack__push(thread->ts, ret_addr,
+ flags & PERF_IP_FLAG_TRACE_END);
+ } else if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+ /*
+ * If the caller did not change the trace number (which would
+ * have flushed the stack) then try to make sense of the stack.
+ * Possibly, tracing began after returning to the current
+ * address, so try to pop that. Also, do not expect a call made
+ * when the trace ended, to return, so pop that.
+ */
+ thread_stack__pop(thread->ts, to_ip);
+ thread_stack__pop_trace_end(thread->ts);
+ } else if ((flags & PERF_IP_FLAG_RETURN) && from_ip) {
thread_stack__pop(thread->ts, to_ip);
}
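The effect of thread_stack__pop_trace_end() may be easier to see on a toy stack. This is a sketch under assumptions: the fixed-size array, struct demo_stack and the demo_* helpers are hypothetical simplifications (the real thread stack grows dynamically and carries more per-entry state).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STACK_MAX 16

struct demo_entry {
	uint64_t ret_addr;
	bool trace_end;	/* pushed by a call on which the trace ended */
};

struct demo_stack {
	struct demo_entry stack[DEMO_STACK_MAX];
	size_t cnt;
};

/* Push a return address, remembering whether the trace ended on the call. */
static int demo_push(struct demo_stack *ts, uint64_t ret_addr, bool trace_end)
{
	if (ts->cnt == DEMO_STACK_MAX)
		return -1;
	ts->stack[ts->cnt].trace_end = trace_end;
	ts->stack[ts->cnt++].ret_addr = ret_addr;
	return 0;
}

/*
 * Walk down from the top of the stack, dropping entries for as long as
 * they are marked trace_end; stop at the first entry that is not.
 */
static void demo_pop_trace_end(struct demo_stack *ts)
{
	for (size_t i = ts->cnt; i; ) {
		if (ts->stack[--i].trace_end)
			ts->cnt = i;
		else
			return;
	}
}
```

Only the trailing run of trace_end entries is removed, so calls that were traced normally and are still awaiting their return stay on the stack.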
* [tip:perf/core] perf tools: Improve thread_stack__process() for trace begin / end
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
@ 2018-09-26 8:55 ` tip-bot for Adrian Hunter
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26 8:55 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, hpa, jolsa, tglx, ak, linux-kernel, adrian.hunter, acme
Commit-ID: 2dcde4e152a3e319cc7e76c7c6b8548a3c72310d
Gitweb: https://git.kernel.org/tip/2dcde4e152a3e319cc7e76c7c6b8548a3c72310d
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:46 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:50 -0300
perf tools: Improve thread_stack__process() for trace begin / end
thread_stack__process() is used to create call paths for database
export. Improve the handling of trace begin / end to allow for a trace
that ends in a call.
Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, enhance the thread stack
so that it identifies the trace end by the flag instead of by ip == 0.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/thread-stack.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index e3f7dfecafa9..c091635bf7dc 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -357,7 +357,7 @@ void call_return_processor__free(struct call_return_processor *crp)
static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
u64 timestamp, u64 ref, struct call_path *cp,
- bool no_call)
+ bool no_call, bool trace_end)
{
struct thread_stack_entry *tse;
int err;
@@ -375,6 +375,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
tse->branch_count = ts->branch_count;
tse->cp = cp;
tse->no_call = no_call;
+ tse->trace_end = trace_end;
return 0;
}
@@ -448,7 +449,7 @@ static int thread_stack__bottom(struct thread *thread, struct thread_stack *ts,
return -ENOMEM;
return thread_stack__push_cp(thread->ts, ip, sample->time, ref, cp,
- true);
+ true, false);
}
static int thread_stack__no_call_return(struct thread *thread,
@@ -480,7 +481,7 @@ static int thread_stack__no_call_return(struct thread *thread,
if (!cp)
return -ENOMEM;
return thread_stack__push_cp(ts, 0, sample->time, ref,
- cp, true);
+ cp, true, false);
}
} else if (thread_stack__in_kernel(ts) && sample->ip < ks) {
/* Return to userspace, so pop all kernel addresses */
@@ -505,7 +506,7 @@ static int thread_stack__no_call_return(struct thread *thread,
return -ENOMEM;
err = thread_stack__push_cp(ts, sample->addr, sample->time, ref, cp,
- true);
+ true, false);
if (err)
return err;
@@ -525,7 +526,7 @@ static int thread_stack__trace_begin(struct thread *thread,
/* Pop trace end */
tse = &ts->stack[ts->cnt - 1];
- if (tse->cp->sym == NULL && tse->cp->ip == 0) {
+ if (tse->trace_end) {
err = thread_stack__call_return(thread, ts, --ts->cnt,
timestamp, ref, false);
if (err)
@@ -554,7 +555,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
ret_addr = sample->ip + sample->insn_len;
return thread_stack__push_cp(ts, ret_addr, sample->time, ref, cp,
- false);
+ false, true);
}
int thread_stack__process(struct thread *thread, struct comm *comm,
@@ -604,6 +605,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
ts->last_time = sample->time;
if (sample->flags & PERF_IP_FLAG_CALL) {
+ bool trace_end = sample->flags & PERF_IP_FLAG_TRACE_END;
struct call_path_root *cpr = ts->crp->cpr;
struct call_path *cp;
u64 ret_addr;
@@ -621,7 +623,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
if (!cp)
return -ENOMEM;
err = thread_stack__push_cp(ts, ret_addr, sample->time, ref,
- cp, false);
+ cp, false, trace_end);
} else if (sample->flags & PERF_IP_FLAG_RETURN) {
if (!sample->ip || !sample->addr)
return 0;
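To illustrate why the explicit flag is needed, the sketch below compares the old "call path with ip == 0" test against the new trace_end test. The structs are cut-down, hypothetical versions of the real ones; the address and symbol are taken from the example output in this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_call_path {
	const char *sym;	/* NULL when the symbol is unknown */
	uint64_t ip;
};

struct demo_stack_entry {
	struct demo_call_path *cp;
	bool trace_end;
};

/* Old heuristic: a trace end looked like a call to address zero. */
static bool demo_is_trace_end_old(const struct demo_stack_entry *tse)
{
	return tse->cp->sym == NULL && tse->cp->ip == 0;
}

/* New check: rely on the flag recorded when the entry was pushed. */
static bool demo_is_trace_end_new(const struct demo_stack_entry *tse)
{
	return tse->trace_end;
}
```

Once the decoder reports the real call target (for instance 0x401ef0, set_program_name) instead of zero, the old heuristic no longer recognises the entry, whereas the flag still does.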
* [tip:perf/core] perf intel-pt: Add decoder flags for trace begin / end
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
@ 2018-09-26 8:56 ` tip-bot for Adrian Hunter
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26 8:56 UTC (permalink / raw)
To: linux-tip-commits
Cc: acme, hpa, jolsa, tglx, ak, mingo, adrian.hunter, linux-kernel
Commit-ID: c6b5da093a8ba740b71dd0052f3846016986fd21
Gitweb: https://git.kernel.org/tip/c6b5da093a8ba740b71dd0052f3846016986fd21
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:47 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:51 -0300
perf intel-pt: Add decoder flags for trace begin / end
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. To prepare for remedying that, add Intel PT decoder flags
for trace begin / end and map them to the existing sample flags.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 ++
tools/perf/util/intel-pt.c | 5 +++++
2 files changed, 7 insertions(+)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 51c18d67f4ca..ed088d4726ba 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -37,6 +37,8 @@ enum intel_pt_sample_type {
INTEL_PT_EX_STOP = 1 << 6,
INTEL_PT_PWR_EXIT = 1 << 7,
INTEL_PT_CBR_CHG = 1 << 8,
+ INTEL_PT_TRACE_BEGIN = 1 << 9,
+ INTEL_PT_TRACE_END = 1 << 10,
};
enum intel_pt_period_type {
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..48c1d415c6b0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -908,6 +908,11 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
ptq->insn_len = ptq->state->insn_len;
memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
}
+
+ if (ptq->state->type & INTEL_PT_TRACE_BEGIN)
+ ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN;
+ if (ptq->state->type & INTEL_PT_TRACE_END)
+ ptq->flags |= PERF_IP_FLAG_TRACE_END;
}
static int intel_pt_setup_queue(struct intel_pt *pt,
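The mapping added to intel_pt_sample_flags() can be sketched on its own as follows. The decoder-side values match the enum hunk above; the DEMO_FLAG_* sample flag values and the function name are assumptions for illustration only.

```c
#include <stdint.h>

/* Decoder-side bits, as added to enum intel_pt_sample_type above. */
enum demo_pt_sample_type {
	DEMO_PT_TRACE_BEGIN = 1 << 9,
	DEMO_PT_TRACE_END   = 1 << 10,
};

/* Illustrative stand-ins for the perf sample flag bits. */
#define DEMO_FLAG_TRACE_BEGIN (1u << 8)
#define DEMO_FLAG_TRACE_END   (1u << 9)

/* Translate decoder state bits into sample flag bits, keeping the rest. */
static uint32_t demo_map_trace_flags(unsigned int state_type, uint32_t flags)
{
	if (state_type & DEMO_PT_TRACE_BEGIN)
		flags |= DEMO_FLAG_TRACE_BEGIN;
	if (state_type & DEMO_PT_TRACE_END)
		flags |= DEMO_FLAG_TRACE_END;
	return flags;
}
```

The decoder keeps its own bit namespace, so the bits are translated one by one; a plain copy would only be correct if the two sets of values happened to coincide.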
* [tip:perf/core] perf intel-pt: Implement decoder flags for trace begin / end
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
@ 2018-09-26 8:56 ` tip-bot for Adrian Hunter
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26 8:56 UTC (permalink / raw)
To: linux-tip-commits
Cc: jolsa, ak, acme, hpa, mingo, adrian.hunter, linux-kernel, tglx
Commit-ID: bea6385789b8b5e1e3228a281978ca6c4a8c70a0
Gitweb: https://git.kernel.org/tip/bea6385789b8b5e1e3228a281978ca6c4a8c70a0
Author: Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:48 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:52 -0300
perf intel-pt: Implement decoder flags for trace begin / end
Have the Intel PT decoder implement the new Intel PT decoder flags for
trace begin / end.
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. That happens when using address filters, for example:
$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]
Before:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: call 4015b9 main+0x29 => 0 [unknown]
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: call 4015c8 main+0x38 => 0 [unknown]
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: call 4015d7 main+0x47 => 0 [unknown]
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: call 4015e1 main+0x51 => 0 [unknown]
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: call 4015eb main+0x5b => 0 [unknown]
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: call 401612 main+0x82 => 0 [unknown]
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: call 401847 main+0x2b7 => 0 [unknown]
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: call 4019bf main+0x42f => 0 [unknown]
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: call 4019f5 main+0x465 => 0 [unknown]
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: call 401832 main+0x2a2 => 0 [unknown]
After:
$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
7249.622183310: tr strt 0 [unknown] => 401590 main+0x0
7249.622183311: tr end call 4015b9 main+0x29 => 401ef0 set_program_name+0x0
7249.622183711: tr strt 0 [unknown] => 4015be main+0x2e
7249.622183714: tr end call 4015c8 main+0x38 => 4014b0 setlocale@plt+0x0
7249.622247731: tr strt 0 [unknown] => 4015cd main+0x3d
7249.622247760: tr end call 4015d7 main+0x47 => 4012d0 bindtextdomain@plt+0x0
7249.622248340: tr strt 0 [unknown] => 4015dc main+0x4c
7249.622248341: tr end call 4015e1 main+0x51 => 4012b0 textdomain@plt+0x0
7249.622248681: tr strt 0 [unknown] => 4015e6 main+0x56
7249.622248682: tr end call 4015eb main+0x5b => 404340 atexit+0x0
7249.622248970: tr strt 0 [unknown] => 4015f0 main+0x60
7249.622248971: tr end call 401612 main+0x82 => 401320 getopt_long@plt+0x0
7249.622249757: tr strt 0 [unknown] => 401617 main+0x87
7249.622249770: tr end call 401847 main+0x2b7 => 401360 uname@plt+0x0
7249.622250606: tr strt 0 [unknown] => 40184c main+0x2bc
7249.622250612: tr end call 4019bf main+0x42f => 401b10 print_element+0x0
7249.622256823: tr strt 0 [unknown] => 4019c4 main+0x434
7249.622256863: tr end call 4019f5 main+0x465 => 401340 __overflow@plt+0x0
7249.622264217: tr strt 0 [unknown] => 4019fa main+0x46a
7249.622264235: tr end call 401832 main+0x2a2 => 401520 exit@plt+0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 34 +++++++++++++++-------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d404bed7003a..58f6a9ceb590 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1165,7 +1165,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pge = false;
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
- decoder->state.to_ip = 0;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
if (err == INTEL_PT_RETURN)
@@ -1179,9 +1179,13 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0)
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
+ decoder->state.to_ip = decoder->last_ip;
decoder->ip = decoder->last_ip;
+ }
+ decoder->state.type |= INTEL_PT_TRACE_END;
} else {
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.from_ip = decoder->ip;
@@ -1208,7 +1212,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->ip = to_ip;
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
+ decoder->state.to_ip = to_ip;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
}
intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
@@ -1640,14 +1645,15 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
case INTEL_PT_TIP_PGD:
decoder->state.from_ip = decoder->ip;
- decoder->state.to_ip = 0;
- if (decoder->packet.count != 0) {
+ if (decoder->packet.count == 0) {
+ decoder->state.to_ip = 0;
+ } else {
intel_pt_set_ip(decoder);
- intel_pt_log("Omitting PGD ip " x64_fmt "\n",
- decoder->ip);
+ decoder->state.to_ip = decoder->ip;
}
decoder->pge = false;
decoder->continuous_period = false;
+ decoder->state.type |= INTEL_PT_TRACE_END;
return 0;
case INTEL_PT_TIP_PGE:
@@ -1661,6 +1667,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
intel_pt_set_ip(decoder);
decoder->state.to_ip = decoder->ip;
}
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
case INTEL_PT_TIP:
@@ -1739,6 +1746,7 @@ next:
intel_pt_set_ip(decoder);
decoder->state.from_ip = 0;
decoder->state.to_ip = decoder->ip;
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
return 0;
}
@@ -2077,9 +2085,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
if (intel_pt_have_ip(decoder))
intel_pt_set_ip(decoder);
- if (decoder->ip)
- return 0;
- break;
+ if (!decoder->ip)
+ break;
+ if (decoder->packet.type == INTEL_PT_TIP_PGE)
+ decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+ if (decoder->packet.type == INTEL_PT_TIP_PGD)
+ decoder->state.type |= INTEL_PT_TRACE_END;
+ return 0;
case INTEL_PT_FUP:
if (intel_pt_have_ip(decoder))
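The TIP.PGD handling in the hunks above follows one pattern, which can be sketched in isolation. struct demo_state and demo_handle_tip_pgd() are hypothetical: has_ip stands in for "decoder->packet.count != 0" and target_ip for the address that the packet carries.

```c
#include <stdint.h>

/* Same value as INTEL_PT_TRACE_END in the decoder header. */
#define DEMO_PT_TRACE_END (1u << 10)

struct demo_state {
	uint64_t from_ip;
	uint64_t to_ip;
	unsigned int type;
};

/*
 * Report the branch on which tracing stops. The destination is kept
 * when the packet supplies one and is zero otherwise; either way the
 * sample is marked as a trace end.
 */
static void demo_handle_tip_pgd(struct demo_state *st, uint64_t cur_ip,
				int has_ip, uint64_t target_ip)
{
	st->from_ip = cur_ip;
	st->to_ip = has_ip ? target_ip : 0;
	st->type |= DEMO_PT_TRACE_END;
}
```

Applied to the first call in the example output, from_ip 0x4015b9 with a packet carrying 0x401ef0 gives the "tr end call ... => 401ef0" line; the trace end is conveyed by the flag, so the destination no longer needs to be overwritten with zero.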
Thread overview: 15+ messages
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
2018-09-26 8:53 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
2018-09-26 8:54 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
2018-09-26 8:54 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
2018-09-26 8:55 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
2018-09-26 8:56 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
2018-09-26 8:56 ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
2018-09-20 14:13 ` Arnaldo Carvalho de Melo