linux-kernel.vger.kernel.org archive mirror
* [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
@ 2018-09-20 13:00 Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Hi

Here is V2 of some Intel PT patches to improve the data displayed when using
address filters.

Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. That happens when using address filters, for example:

$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]

Before:

$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
 7249.622183310:   tr strt         0 [unknown] =>   401590 main+0x0
 7249.622183311:   call       4015b9 main+0x29 =>        0 [unknown]
 7249.622183711:   tr strt         0 [unknown] =>   4015be main+0x2e
 7249.622183714:   call       4015c8 main+0x38 =>        0 [unknown]
 7249.622247731:   tr strt         0 [unknown] =>   4015cd main+0x3d
 7249.622247760:   call       4015d7 main+0x47 =>        0 [unknown]
 7249.622248340:   tr strt         0 [unknown] =>   4015dc main+0x4c
 7249.622248341:   call       4015e1 main+0x51 =>        0 [unknown]
 7249.622248681:   tr strt         0 [unknown] =>   4015e6 main+0x56
 7249.622248682:   call       4015eb main+0x5b =>        0 [unknown]
 7249.622248970:   tr strt         0 [unknown] =>   4015f0 main+0x60
 7249.622248971:   call       401612 main+0x82 =>        0 [unknown]
 7249.622249757:   tr strt         0 [unknown] =>   401617 main+0x87
 7249.622249770:   call       401847 main+0x2b7 =>        0 [unknown]
 7249.622250606:   tr strt         0 [unknown] =>   40184c main+0x2bc
 7249.622250612:   call       4019bf main+0x42f =>        0 [unknown]
 7249.622256823:   tr strt         0 [unknown] =>   4019c4 main+0x434
 7249.622256863:   call       4019f5 main+0x465 =>        0 [unknown]
 7249.622264217:   tr strt         0 [unknown] =>   4019fa main+0x46a
 7249.622264235:   call       401832 main+0x2a2 =>        0 [unknown]

After:

$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
 7249.622183310:   tr strt              0 [unknown] =>   401590 main+0x0
 7249.622183311:   tr end  call    4015b9 main+0x29 =>   401ef0 set_program_name+0x0
 7249.622183711:   tr strt              0 [unknown] =>   4015be main+0x2e
 7249.622183714:   tr end  call    4015c8 main+0x38 =>   4014b0 setlocale@plt+0x0
 7249.622247731:   tr strt              0 [unknown] =>   4015cd main+0x3d
 7249.622247760:   tr end  call    4015d7 main+0x47 =>   4012d0 bindtextdomain@plt+0x0
 7249.622248340:   tr strt              0 [unknown] =>   4015dc main+0x4c
 7249.622248341:   tr end  call    4015e1 main+0x51 =>   4012b0 textdomain@plt+0x0
 7249.622248681:   tr strt              0 [unknown] =>   4015e6 main+0x56
 7249.622248682:   tr end  call    4015eb main+0x5b =>   404340 atexit+0x0
 7249.622248970:   tr strt              0 [unknown] =>   4015f0 main+0x60
 7249.622248971:   tr end  call    401612 main+0x82 =>   401320 getopt_long@plt+0x0
 7249.622249757:   tr strt              0 [unknown] =>   401617 main+0x87
 7249.622249770:   tr end  call    401847 main+0x2b7 =>   401360 uname@plt+0x0
 7249.622250606:   tr strt              0 [unknown] =>   40184c main+0x2bc
 7249.622250612:   tr end  call    4019bf main+0x42f =>   401b10 print_element+0x0
 7249.622256823:   tr strt              0 [unknown] =>   4019c4 main+0x434
 7249.622256863:   tr end  call    4019f5 main+0x465 =>   401340 __overflow@plt+0x0
 7249.622264217:   tr strt              0 [unknown] =>   4019fa main+0x46a
 7249.622264235:   tr end  call    401832 main+0x2a2 =>   401520 exit@plt+0x0


Changes in V2:

	Improve commit messages


Adrian Hunter (6):
      perf script: Enhance sample flags for trace begin / end
      perf db-export: Add trace begin / end branch type variants
      perf tools: Improve thread_stack__event() for trace begin / end
      perf tools: Improve thread_stack__process() for trace begin / end
      perf intel-pt: Add decoder flags for trace begin / end
      perf intel-pt: Implement decoder flags for trace begin / end

 tools/perf/builtin-script.c                        | 36 +++++++++++----
 tools/perf/util/db-export.c                        | 22 ++++++++++
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 34 ++++++++++-----
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  2 +
 tools/perf/util/intel-pt.c                         |  5 +++
 tools/perf/util/thread-stack.c                     | 51 +++++++++++++++++-----
 6 files changed, 118 insertions(+), 32 deletions(-)


Regards
Adrian


* [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
  2018-09-26  8:53   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Allow for different combinations of sample flags with "trace begin" or
"trace end".

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, prepare 'perf script' to display
sample flags with more combinations that include trace begin / end. In
those cases display 'tr strt' and 'tr end' separately.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/builtin-script.c | 36 +++++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 6176bae177c2..4982380ba96d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1255,6 +1255,18 @@ static struct {
 	{0, NULL}
 };
 
+static const char *sample_flags_to_name(u32 flags)
+{
+	int i;
+
+	for (i = 0; sample_flags[i].name ; i++) {
+		if (sample_flags[i].flags == flags)
+			return sample_flags[i].name;
+	}
+
+	return NULL;
+}
+
 static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
 {
 	const char *chars = PERF_IP_FLAG_CHARS;
@@ -1264,11 +1276,20 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
 	char str[33];
 	int i, pos = 0;
 
-	for (i = 0; sample_flags[i].name ; i++) {
-		if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
-			name = sample_flags[i].name;
-			break;
-		}
+	name = sample_flags_to_name(flags & ~PERF_IP_FLAG_IN_TX);
+	if (name)
+		return fprintf(fp, "  %-15s%4s ", name, in_tx ? "(x)" : "");
+
+	if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+		name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_BEGIN));
+		if (name)
+			return fprintf(fp, "  tr strt %-7s%4s ", name, in_tx ? "(x)" : "");
+	}
+
+	if (flags & PERF_IP_FLAG_TRACE_END) {
+		name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_END));
+		if (name)
+			return fprintf(fp, "  tr end  %-7s%4s ", name, in_tx ? "(x)" : "");
 	}
 
 	for (i = 0; i < n; i++, flags >>= 1) {
@@ -1281,10 +1302,7 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
 	}
 	str[pos] = 0;
 
-	if (name)
-		return fprintf(fp, "  %-7s%4s ", name, in_tx ? "(x)" : "");
-
-	return fprintf(fp, "  %-11s ", str);
+	return fprintf(fp, "  %-19s ", str);
 }
 
 struct printer_data {
-- 
2.17.1
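
The lookup order described in this patch can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the perf code: the DEMO_* bit values, the shortened name table and demo_format_flags() are hypothetical stand-ins, and the output is written to a buffer without the column padding the patch uses.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit values only; the real PERF_IP_FLAG_* values are defined by perf */
#define DEMO_FLAG_BRANCH	(1u << 0)
#define DEMO_FLAG_CALL		(1u << 1)
#define DEMO_FLAG_RETURN	(1u << 2)
#define DEMO_FLAG_TRACE_BEGIN	(1u << 3)
#define DEMO_FLAG_TRACE_END	(1u << 4)
#define DEMO_FLAG_IN_TX		(1u << 5)

/* Hypothetical, shortened table of exact flag combinations and their names */
static const struct {
	uint32_t flags;
	const char *name;
} demo_flags[] = {
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_CALL, "call" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_RETURN, "return" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_BEGIN, "tr strt" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_END, "tr end" },
	{ 0, NULL }
};

/* Exact-match lookup, in the spirit of sample_flags_to_name() */
static const char *demo_flags_to_name(uint32_t flags)
{
	int i;

	for (i = 0; demo_flags[i].name; i++) {
		if (demo_flags[i].flags == flags)
			return demo_flags[i].name;
	}
	return NULL;
}

/* Try an exact match first, then retry with trace begin / end masked out */
static const char *demo_format_flags(uint32_t flags, char *buf, size_t len)
{
	const char *tx = (flags & DEMO_FLAG_IN_TX) ? " (x)" : "";
	const char *name;

	flags &= ~DEMO_FLAG_IN_TX;

	name = demo_flags_to_name(flags);
	if (name) {
		snprintf(buf, len, "%s%s", name, tx);
		return buf;
	}

	if (flags & DEMO_FLAG_TRACE_BEGIN) {
		name = demo_flags_to_name(flags & ~DEMO_FLAG_TRACE_BEGIN);
		if (name) {
			snprintf(buf, len, "tr strt %s%s", name, tx);
			return buf;
		}
	}

	if (flags & DEMO_FLAG_TRACE_END) {
		name = demo_flags_to_name(flags & ~DEMO_FLAG_TRACE_END);
		if (name) {
			snprintf(buf, len, "tr end %s%s", name, tx);
			return buf;
		}
	}

	/* No known name: fall back to the raw value */
	snprintf(buf, len, "0x%x", (unsigned int)flags);
	return buf;
}
```

With this sketch, a call that also carries the trace end bit has no exact entry in the table, so it falls through to the masked lookup and is reported as the prefix plus the name of the remaining combination.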



* [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
  2018-09-26  8:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Add branch types to cover different combinations with "trace begin" or
"trace end".

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, prepare the database export to
export branch types with more combinations that include trace begin / end.
In those cases extend the descriptions to include 'trace begin' and
'trace end' separately.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/db-export.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 7123746edcf4..69fbb0a72d0c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -463,6 +463,28 @@ int db_export__branch_types(struct db_export *dbe)
 		if (err)
 			break;
 	}
+
+	/* Add trace begin / end variants */
+	for (i = 0; branch_types[i].name ; i++) {
+		const char *name = branch_types[i].name;
+		u32 type = branch_types[i].branch_type;
+		char buf[64];
+
+		if (type == PERF_IP_FLAG_BRANCH ||
+		    (type & (PERF_IP_FLAG_TRACE_BEGIN | PERF_IP_FLAG_TRACE_END)))
+			continue;
+
+		snprintf(buf, sizeof(buf), "trace begin / %s", name);
+		err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_BEGIN, buf);
+		if (err)
+			break;
+
+		snprintf(buf, sizeof(buf), "%s / trace end", name);
+		err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_END, buf);
+		if (err)
+			break;
+	}
+
 	return err;
 }
 
-- 
2.17.1
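
As a rough illustration of the variant generation in this patch, the loop can be reproduced outside perf. This is only a sketch: the DEMO_* bit values, the shortened base table and demo_add_variants() are hypothetical, and the results are stored in an array instead of being passed to db_export__branch_type().

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit values only; the real PERF_IP_FLAG_* values are defined by perf */
#define DEMO_FLAG_BRANCH	(1u << 0)
#define DEMO_FLAG_CALL		(1u << 1)
#define DEMO_FLAG_RETURN	(1u << 2)
#define DEMO_FLAG_TRACE_BEGIN	(1u << 3)
#define DEMO_FLAG_TRACE_END	(1u << 4)

struct demo_branch_type {
	uint32_t type;
	char name[64];
};

/* Hypothetical, shortened base table */
static const struct {
	uint32_t type;
	const char *name;
} demo_base_types[] = {
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_CALL, "call" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_RETURN, "return" },
	{ DEMO_FLAG_BRANCH, "unconditional jump" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_BEGIN, "trace begin" },
	{ DEMO_FLAG_BRANCH | DEMO_FLAG_TRACE_END, "trace end" },
	{ 0, NULL }
};

/* Emit a "trace begin" and a "trace end" variant for each eligible base type */
static size_t demo_add_variants(struct demo_branch_type *out, size_t max)
{
	size_t n = 0;
	int i;

	for (i = 0; demo_base_types[i].name; i++) {
		const char *name = demo_base_types[i].name;
		uint32_t type = demo_base_types[i].type;

		/* Skip the plain branch and entries that already are trace begin / end */
		if (type == DEMO_FLAG_BRANCH ||
		    (type & (DEMO_FLAG_TRACE_BEGIN | DEMO_FLAG_TRACE_END)))
			continue;

		if (n + 2 > max)
			break;

		out[n].type = type | DEMO_FLAG_TRACE_BEGIN;
		snprintf(out[n].name, sizeof(out[n].name), "trace begin / %s", name);
		n++;

		out[n].type = type | DEMO_FLAG_TRACE_END;
		snprintf(out[n].name, sizeof(out[n].name), "%s / trace end", name);
		n++;
	}

	return n;
}
```

In this sketch only "call" and "return" qualify, so four variants are produced.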



* [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
  2018-09-26  8:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

thread_stack__event() is used to create call stacks, by keeping track of
calls and returns. Improve the handling of trace begin / end to allow for a
trace that ends in a call.

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, enhance the thread stack so that
it does not expect to see the 'return' for a 'call' that ends the trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/thread-stack.c | 35 +++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index dd17d6a38d3a..cea28b9074c1 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -36,6 +36,7 @@
  * @branch_count: the branch count when the entry was created
  * @cp: call path
  * @no_call: a 'call' was not seen
+ * @trace_end: a 'call' but trace ended
  */
 struct thread_stack_entry {
 	u64 ret_addr;
@@ -44,6 +45,7 @@ struct thread_stack_entry {
 	u64 branch_count;
 	struct call_path *cp;
 	bool no_call;
+	bool trace_end;
 };
 
 /**
@@ -112,7 +114,8 @@ static struct thread_stack *thread_stack__new(struct thread *thread,
 	return ts;
 }
 
-static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
+static int thread_stack__push(struct thread_stack *ts, u64 ret_addr,
+			      bool trace_end)
 {
 	int err = 0;
 
@@ -124,6 +127,7 @@ static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
 		}
 	}
 
+	ts->stack[ts->cnt].trace_end = trace_end;
 	ts->stack[ts->cnt++].ret_addr = ret_addr;
 
 	return err;
@@ -150,6 +154,18 @@ static void thread_stack__pop(struct thread_stack *ts, u64 ret_addr)
 	}
 }
 
+static void thread_stack__pop_trace_end(struct thread_stack *ts)
+{
+	size_t i;
+
+	for (i = ts->cnt; i; ) {
+		if (ts->stack[--i].trace_end)
+			ts->cnt = i;
+		else
+			return;
+	}
+}
+
 static bool thread_stack__in_kernel(struct thread_stack *ts)
 {
 	if (!ts->cnt)
@@ -254,10 +270,19 @@ int thread_stack__event(struct thread *thread, u32 flags, u64 from_ip,
 		ret_addr = from_ip + insn_len;
 		if (ret_addr == to_ip)
 			return 0; /* Zero-length calls are excluded */
-		return thread_stack__push(thread->ts, ret_addr);
-	} else if (flags & PERF_IP_FLAG_RETURN) {
-		if (!from_ip)
-			return 0;
+		return thread_stack__push(thread->ts, ret_addr,
+					  flags & PERF_IP_FLAG_TRACE_END);
+	} else if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+		/*
+		 * If the caller did not change the trace number (which would
+		 * have flushed the stack) then try to make sense of the stack.
+		 * Possibly, tracing began after returning to the current
+		 * address, so try to pop that. Also, do not expect a call made
+		 * when the trace ended, to return, so pop that.
+		 */
+		thread_stack__pop(thread->ts, to_ip);
+		thread_stack__pop_trace_end(thread->ts);
+	} else if ((flags & PERF_IP_FLAG_RETURN) && from_ip) {
 		thread_stack__pop(thread->ts, to_ip);
 	}
 
-- 
2.17.1
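
The effect of the new trace_end marker on the stack can be shown with a small standalone model. This is a sketch, assuming a fixed-size array in place of perf's growable stack; struct demo_stack, demo_push() and demo_pop_trace_end() are hypothetical names that only mirror the shape of the functions in the diff.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STACK_MAX 16

struct demo_entry {
	uint64_t ret_addr;
	bool trace_end;		/* a 'call' was seen, but the trace ended there */
};

struct demo_stack {
	struct demo_entry stack[DEMO_STACK_MAX];
	size_t cnt;
};

/* Push a return address, remembering whether the call ended the trace */
static int demo_push(struct demo_stack *ts, uint64_t ret_addr, bool trace_end)
{
	if (ts->cnt >= DEMO_STACK_MAX)
		return -1;

	ts->stack[ts->cnt].trace_end = trace_end;
	ts->stack[ts->cnt++].ret_addr = ret_addr;
	return 0;
}

/* Drop entries from the top of the stack for as long as they are marked trace_end */
static void demo_pop_trace_end(struct demo_stack *ts)
{
	size_t i;

	for (i = ts->cnt; i; ) {
		if (ts->stack[--i].trace_end)
			ts->cnt = i;
		else
			return;
	}
}
```

When tracing resumes, entries pushed for calls that ended the trace are discarded, because no matching 'return' is expected for them, while ordinary entries below them are left alone.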



* [PATCH V2 4/6] perf tools: Improve thread_stack__process() for trace begin / end
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
                   ` (2 preceding siblings ...)
  2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
  2018-09-26  8:55   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

thread_stack__process() is used to create call paths for database export.
Improve the handling of trace begin / end to allow for a trace that ends in
a call.

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a trace
ends with a call. Before remedying that, enhance the thread stack so that
it identifies the trace end by the flag instead of by ip == 0.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/thread-stack.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index cea28b9074c1..45a97d15c6c8 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -357,7 +357,7 @@ void call_return_processor__free(struct call_return_processor *crp)
 
 static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
 				 u64 timestamp, u64 ref, struct call_path *cp,
-				 bool no_call)
+				 bool no_call, bool trace_end)
 {
 	struct thread_stack_entry *tse;
 	int err;
@@ -375,6 +375,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
 	tse->branch_count = ts->branch_count;
 	tse->cp = cp;
 	tse->no_call = no_call;
+	tse->trace_end = trace_end;
 
 	return 0;
 }
@@ -448,7 +449,7 @@ static int thread_stack__bottom(struct thread *thread, struct thread_stack *ts,
 		return -ENOMEM;
 
 	return thread_stack__push_cp(thread->ts, ip, sample->time, ref, cp,
-				     true);
+				     true, false);
 }
 
 static int thread_stack__no_call_return(struct thread *thread,
@@ -480,7 +481,7 @@ static int thread_stack__no_call_return(struct thread *thread,
 			if (!cp)
 				return -ENOMEM;
 			return thread_stack__push_cp(ts, 0, sample->time, ref,
-						     cp, true);
+						     cp, true, false);
 		}
 	} else if (thread_stack__in_kernel(ts) && sample->ip < ks) {
 		/* Return to userspace, so pop all kernel addresses */
@@ -505,7 +506,7 @@ static int thread_stack__no_call_return(struct thread *thread,
 		return -ENOMEM;
 
 	err = thread_stack__push_cp(ts, sample->addr, sample->time, ref, cp,
-				    true);
+				    true, false);
 	if (err)
 		return err;
 
@@ -525,7 +526,7 @@ static int thread_stack__trace_begin(struct thread *thread,
 
 	/* Pop trace end */
 	tse = &ts->stack[ts->cnt - 1];
-	if (tse->cp->sym == NULL && tse->cp->ip == 0) {
+	if (tse->trace_end) {
 		err = thread_stack__call_return(thread, ts, --ts->cnt,
 						timestamp, ref, false);
 		if (err)
@@ -554,7 +555,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
 	ret_addr = sample->ip + sample->insn_len;
 
 	return thread_stack__push_cp(ts, ret_addr, sample->time, ref, cp,
-				     false);
+				     false, true);
 }
 
 int thread_stack__process(struct thread *thread, struct comm *comm,
@@ -604,6 +605,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 	ts->last_time = sample->time;
 
 	if (sample->flags & PERF_IP_FLAG_CALL) {
+		bool trace_end = sample->flags & PERF_IP_FLAG_TRACE_END;
 		struct call_path_root *cpr = ts->crp->cpr;
 		struct call_path *cp;
 		u64 ret_addr;
@@ -621,7 +623,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 		if (!cp)
 			return -ENOMEM;
 		err = thread_stack__push_cp(ts, ret_addr, sample->time, ref,
-					    cp, false);
+					    cp, false, trace_end);
 	} else if (sample->flags & PERF_IP_FLAG_RETURN) {
 		if (!sample->ip || !sample->addr)
 			return 0;
-- 
2.17.1



* [PATCH V2 5/6] perf intel-pt: Add decoder flags for trace begin / end
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
                   ` (3 preceding siblings ...)
  2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
  2018-09-26  8:56   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. To prepare for remedying that, add Intel PT decoder flags for trace
begin / end and map them to the existing sample flags.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 ++
 tools/perf/util/intel-pt.c                          | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 51c18d67f4ca..ed088d4726ba 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -37,6 +37,8 @@ enum intel_pt_sample_type {
 	INTEL_PT_EX_STOP	= 1 << 6,
 	INTEL_PT_PWR_EXIT	= 1 << 7,
 	INTEL_PT_CBR_CHG	= 1 << 8,
+	INTEL_PT_TRACE_BEGIN	= 1 << 9,
+	INTEL_PT_TRACE_END	= 1 << 10,
 };
 
 enum intel_pt_period_type {
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..48c1d415c6b0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -908,6 +908,11 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
 		ptq->insn_len = ptq->state->insn_len;
 		memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
 	}
+
+	if (ptq->state->type & INTEL_PT_TRACE_BEGIN)
+		ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN;
+	if (ptq->state->type & INTEL_PT_TRACE_END)
+		ptq->flags |= PERF_IP_FLAG_TRACE_END;
 }
 
 static int intel_pt_setup_queue(struct intel_pt *pt,
-- 
2.17.1
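
The mapping added here is a plain bit translation from decoder state to sample flags, which can be sketched on its own. The decoder-side values below follow the enum additions in the diff (1 << 9 and 1 << 10); the DEMO_SAMPLE_* values and demo_add_trace_flags() are hypothetical stand-ins for the PERF_IP_FLAG_* bits and the code in intel_pt_sample_flags().

```c
#include <stdint.h>

/* Decoder-side bits, matching the enum values added by the patch */
#define DEMO_PT_TRACE_BEGIN	(1u << 9)
#define DEMO_PT_TRACE_END	(1u << 10)

/* Sample-side bits: illustrative values only */
#define DEMO_SAMPLE_BRANCH	(1u << 0)
#define DEMO_SAMPLE_CALL	(1u << 1)
#define DEMO_SAMPLE_TRACE_BEGIN	(1u << 3)
#define DEMO_SAMPLE_TRACE_END	(1u << 4)

/* OR the trace begin / end sample flags into whatever flags were already set */
static uint32_t demo_add_trace_flags(uint32_t decoder_type, uint32_t sample_flags)
{
	if (decoder_type & DEMO_PT_TRACE_BEGIN)
		sample_flags |= DEMO_SAMPLE_TRACE_BEGIN;
	if (decoder_type & DEMO_PT_TRACE_END)
		sample_flags |= DEMO_SAMPLE_TRACE_END;

	return sample_flags;
}
```

Because the bits are ORed in, a sample keeps its branch classification (for example a call) and gains the trace end indication on top of it.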



* [PATCH V2 6/6] perf intel-pt: Implement decoder flags for trace begin / end
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
                   ` (4 preceding siblings ...)
  2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
@ 2018-09-20 13:00 ` Adrian Hunter
  2018-09-26  8:56   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
  2018-09-20 14:13 ` Arnaldo Carvalho de Melo
  7 siblings, 1 reply; 15+ messages in thread
From: Adrian Hunter @ 2018-09-20 13:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Have the Intel PT decoder implement the new Intel PT decoder flags for
trace begin / end.

Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends with a
call. That happens when using address filters, for example:

$ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data ]

Before:

$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
 7249.622183310:   tr strt         0 [unknown] =>   401590 main+0x0
 7249.622183311:   call       4015b9 main+0x29 =>        0 [unknown]
 7249.622183711:   tr strt         0 [unknown] =>   4015be main+0x2e
 7249.622183714:   call       4015c8 main+0x38 =>        0 [unknown]
 7249.622247731:   tr strt         0 [unknown] =>   4015cd main+0x3d
 7249.622247760:   call       4015d7 main+0x47 =>        0 [unknown]
 7249.622248340:   tr strt         0 [unknown] =>   4015dc main+0x4c
 7249.622248341:   call       4015e1 main+0x51 =>        0 [unknown]
 7249.622248681:   tr strt         0 [unknown] =>   4015e6 main+0x56
 7249.622248682:   call       4015eb main+0x5b =>        0 [unknown]
 7249.622248970:   tr strt         0 [unknown] =>   4015f0 main+0x60
 7249.622248971:   call       401612 main+0x82 =>        0 [unknown]
 7249.622249757:   tr strt         0 [unknown] =>   401617 main+0x87
 7249.622249770:   call       401847 main+0x2b7 =>        0 [unknown]
 7249.622250606:   tr strt         0 [unknown] =>   40184c main+0x2bc
 7249.622250612:   call       4019bf main+0x42f =>        0 [unknown]
 7249.622256823:   tr strt         0 [unknown] =>   4019c4 main+0x434
 7249.622256863:   call       4019f5 main+0x465 =>        0 [unknown]
 7249.622264217:   tr strt         0 [unknown] =>   4019fa main+0x46a
 7249.622264235:   call       401832 main+0x2a2 =>        0 [unknown]

After:

$ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
 7249.622183310:   tr strt              0 [unknown] =>   401590 main+0x0
 7249.622183311:   tr end  call    4015b9 main+0x29 =>   401ef0 set_program_name+0x0
 7249.622183711:   tr strt              0 [unknown] =>   4015be main+0x2e
 7249.622183714:   tr end  call    4015c8 main+0x38 =>   4014b0 setlocale@plt+0x0
 7249.622247731:   tr strt              0 [unknown] =>   4015cd main+0x3d
 7249.622247760:   tr end  call    4015d7 main+0x47 =>   4012d0 bindtextdomain@plt+0x0
 7249.622248340:   tr strt              0 [unknown] =>   4015dc main+0x4c
 7249.622248341:   tr end  call    4015e1 main+0x51 =>   4012b0 textdomain@plt+0x0
 7249.622248681:   tr strt              0 [unknown] =>   4015e6 main+0x56
 7249.622248682:   tr end  call    4015eb main+0x5b =>   404340 atexit+0x0
 7249.622248970:   tr strt              0 [unknown] =>   4015f0 main+0x60
 7249.622248971:   tr end  call    401612 main+0x82 =>   401320 getopt_long@plt+0x0
 7249.622249757:   tr strt              0 [unknown] =>   401617 main+0x87
 7249.622249770:   tr end  call    401847 main+0x2b7 =>   401360 uname@plt+0x0
 7249.622250606:   tr strt              0 [unknown] =>   40184c main+0x2bc
 7249.622250612:   tr end  call    4019bf main+0x42f =>   401b10 print_element+0x0
 7249.622256823:   tr strt              0 [unknown] =>   4019c4 main+0x434
 7249.622256863:   tr end  call    4019f5 main+0x465 =>   401340 __overflow@plt+0x0
 7249.622264217:   tr strt              0 [unknown] =>   4019fa main+0x46a
 7249.622264235:   tr end  call    401832 main+0x2a2 =>   401520 exit@plt+0x0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 34 +++++++++++++------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d404bed7003a..58f6a9ceb590 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1165,7 +1165,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 		decoder->pge = false;
 		decoder->continuous_period = false;
 		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
-		decoder->state.to_ip = 0;
+		decoder->state.type |= INTEL_PT_TRACE_END;
 		return 0;
 	}
 	if (err == INTEL_PT_RETURN)
@@ -1179,9 +1179,13 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 			decoder->continuous_period = false;
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			decoder->state.from_ip = decoder->ip;
-			decoder->state.to_ip = 0;
-			if (decoder->packet.count != 0)
+			if (decoder->packet.count == 0) {
+				decoder->state.to_ip = 0;
+			} else {
+				decoder->state.to_ip = decoder->last_ip;
 				decoder->ip = decoder->last_ip;
+			}
+			decoder->state.type |= INTEL_PT_TRACE_END;
 		} else {
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			decoder->state.from_ip = decoder->ip;
@@ -1208,7 +1212,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			decoder->ip = to_ip;
 			decoder->state.from_ip = decoder->ip;
-			decoder->state.to_ip = 0;
+			decoder->state.to_ip = to_ip;
+			decoder->state.type |= INTEL_PT_TRACE_END;
 			return 0;
 		}
 		intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
@@ -1640,14 +1645,15 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGD:
 			decoder->state.from_ip = decoder->ip;
-			decoder->state.to_ip = 0;
-			if (decoder->packet.count != 0) {
+			if (decoder->packet.count == 0) {
+				decoder->state.to_ip = 0;
+			} else {
 				intel_pt_set_ip(decoder);
-				intel_pt_log("Omitting PGD ip " x64_fmt "\n",
-					     decoder->ip);
+				decoder->state.to_ip = decoder->ip;
 			}
 			decoder->pge = false;
 			decoder->continuous_period = false;
+			decoder->state.type |= INTEL_PT_TRACE_END;
 			return 0;
 
 		case INTEL_PT_TIP_PGE:
@@ -1661,6 +1667,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 				intel_pt_set_ip(decoder);
 				decoder->state.to_ip = decoder->ip;
 			}
+			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
 			return 0;
 
 		case INTEL_PT_TIP:
@@ -1739,6 +1746,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			intel_pt_set_ip(decoder);
 			decoder->state.from_ip = 0;
 			decoder->state.to_ip = decoder->ip;
+			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
 			return 0;
 		}
 
@@ -2077,9 +2085,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
-			if (decoder->ip)
-				return 0;
-			break;
+			if (!decoder->ip)
+				break;
+			if (decoder->packet.type == INTEL_PT_TIP_PGE)
+				decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			if (decoder->packet.type == INTEL_PT_TIP_PGD)
+				decoder->state.type |= INTEL_PT_TRACE_END;
+			return 0;
 
 		case INTEL_PT_FUP:
 			if (intel_pt_have_ip(decoder))
-- 
2.17.1
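
The change in how the decoder reports the end of a trace can be summarised with a small model. This is a simplified sketch, not the decoder: struct demo_state and demo_trace_end() are hypothetical, and the ip_bytes parameter stands in for the packet's count field, where 0 is taken to mean the packet carried no ip.

```c
#include <stdint.h>

/* Matches the INTEL_PT_TRACE_END value added in patch 5/6 */
#define DEMO_PT_TRACE_END	(1u << 10)

struct demo_state {
	uint64_t from_ip;
	uint64_t to_ip;
	uint32_t type;
};

/*
 * Record the end of a trace. Previously to_ip was always forced to 0;
 * here the target ip is kept whenever the packet supplied one, and the
 * end of trace is signalled by a flag instead.
 */
static void demo_trace_end(struct demo_state *st, uint64_t cur_ip,
			   unsigned int ip_bytes, uint64_t packet_ip)
{
	st->from_ip = cur_ip;
	st->to_ip = ip_bytes ? packet_ip : 0;
	st->type |= DEMO_PT_TRACE_END;
}
```

With the addresses from the example output above, a call at 4015b9 that ends the trace keeps its target 401ef0 rather than reporting a branch to zero.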



* Re: [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
                   ` (5 preceding siblings ...)
  2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
@ 2018-09-20 13:41 ` Arnaldo Carvalho de Melo
  2018-09-20 14:13 ` Arnaldo Carvalho de Melo
  7 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-20 13:41 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Thu, Sep 20, 2018 at 04:00:42PM +0300, Adrian Hunter escreveu:
> Here is V2 of some Intel PT patches to improve the data displayed when using
> address filters.

<SNIP>

> Changes in V2:
> 
> 	Improve commit messages

Thanks a lot, helps a lot,

- Arnaldo


* Re: [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters
  2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
                   ` (6 preceding siblings ...)
  2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
@ 2018-09-20 14:13 ` Arnaldo Carvalho de Melo
  7 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-09-20 14:13 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Em Thu, Sep 20, 2018 at 04:00:42PM +0300, Adrian Hunter escreveu:
> Hi
> 
> Here is V2 of some Intel PT patches to improve the data displayed when using
> address filters.
> 
> Previously, the decoder would indicate begin / end by a branch from / to
> zero. That hides useful information, in particular when a trace ends with a
> call. That happens when using address filters, for example:
> 
> $ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
> Linux
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.031 MB perf.data ]

Thanks, applied.

- Arnaldo


* [tip:perf/core] perf script: Enhance sample flags for trace begin / end
  2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
@ 2018-09-26  8:53   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26  8:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, jolsa, linux-kernel, tglx, ak, adrian.hunter, hpa, mingo

Commit-ID:  62cb1b8868a70c932b15959a98594df537df2ffc
Gitweb:     https://git.kernel.org/tip/62cb1b8868a70c932b15959a98594df537df2ffc
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:43 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 11:09:55 -0300

perf script: Enhance sample flags for trace begin / end

Allow for different combinations of sample flags with "trace begin" or
"trace end".

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, prepare 'perf script' to
display sample flags with more combinations that include trace begin /
end. In those cases display 'tr strt' and 'tr end' separately.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-script.c | 36 +++++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7732346bd9dd..4da5e32b9e03 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1262,6 +1262,18 @@ static struct {
 	{0, NULL}
 };
 
+static const char *sample_flags_to_name(u32 flags)
+{
+	int i;
+
+	for (i = 0; sample_flags[i].name ; i++) {
+		if (sample_flags[i].flags == flags)
+			return sample_flags[i].name;
+	}
+
+	return NULL;
+}
+
 static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
 {
 	const char *chars = PERF_IP_FLAG_CHARS;
@@ -1271,11 +1283,20 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
 	char str[33];
 	int i, pos = 0;
 
-	for (i = 0; sample_flags[i].name ; i++) {
-		if (sample_flags[i].flags == (flags & ~PERF_IP_FLAG_IN_TX)) {
-			name = sample_flags[i].name;
-			break;
-		}
+	name = sample_flags_to_name(flags & ~PERF_IP_FLAG_IN_TX);
+	if (name)
+		return fprintf(fp, "  %-15s%4s ", name, in_tx ? "(x)" : "");
+
+	if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+		name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_BEGIN));
+		if (name)
+			return fprintf(fp, "  tr strt %-7s%4s ", name, in_tx ? "(x)" : "");
+	}
+
+	if (flags & PERF_IP_FLAG_TRACE_END) {
+		name = sample_flags_to_name(flags & ~(PERF_IP_FLAG_IN_TX | PERF_IP_FLAG_TRACE_END));
+		if (name)
+			return fprintf(fp, "  tr end  %-7s%4s ", name, in_tx ? "(x)" : "");
 	}
 
 	for (i = 0; i < n; i++, flags >>= 1) {
@@ -1288,10 +1309,7 @@ static int perf_sample__fprintf_flags(u32 flags, FILE *fp)
 	}
 	str[pos] = 0;
 
-	if (name)
-		return fprintf(fp, "  %-7s%4s ", name, in_tx ? "(x)" : "");
-
-	return fprintf(fp, "  %-11s ", str);
+	return fprintf(fp, "  %-19s ", str);
 }
 
 struct printer_data {

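As a standalone illustration of the lookup order in the patch above (exact match first, then the remaining flags with "tr strt" or "tr end" split out), here is a minimal sketch. The flag values, the three-entry table and the helper format_flags() are assumptions made for the example; they are not the perf definitions. Where the real function falls back to printing the raw flag characters, the sketch simply returns NULL.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Illustrative flag values; the real ones live in perf's headers. */
#define FLAG_BRANCH      (1u << 0)
#define FLAG_CALL        (1u << 1)
#define FLAG_TRACE_BEGIN (1u << 8)
#define FLAG_TRACE_END   (1u << 9)

/* Cut-down name table, terminated by a NULL name. */
static const struct {
	uint32_t flags;
	const char *name;
} names[] = {
	{ FLAG_BRANCH | FLAG_CALL,        "call"    },
	{ FLAG_BRANCH | FLAG_TRACE_BEGIN, "tr strt" },
	{ FLAG_BRANCH | FLAG_TRACE_END,   "tr end"  },
	{ 0, NULL }
};

/* Exact-match lookup, as sample_flags_to_name() does in the patch. */
static const char *flags_to_name(uint32_t flags)
{
	for (size_t i = 0; names[i].name; i++) {
		if (names[i].flags == flags)
			return names[i].name;
	}
	return NULL;
}

/* Hypothetical helper: writes the display name into buf, or returns NULL. */
static const char *format_flags(uint32_t flags, char *buf, size_t len)
{
	const char *name = flags_to_name(flags);

	if (name) {
		snprintf(buf, len, "%s", name);
		return buf;
	}
	if (flags & FLAG_TRACE_BEGIN) {
		name = flags_to_name(flags & ~FLAG_TRACE_BEGIN);
		if (name) {
			snprintf(buf, len, "tr strt %s", name);
			return buf;
		}
	}
	if (flags & FLAG_TRACE_END) {
		name = flags_to_name(flags & ~FLAG_TRACE_END);
		if (name) {
			snprintf(buf, len, "tr end  %s", name);
			return buf;
		}
	}
	return NULL;
}
```

With this table, a call that also carries the trace-end flag has no exact entry, so it is looked up again without that flag and comes out as "tr end  call".
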

* [tip:perf/core] perf db-export: Add trace begin / end branch type variants
  2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
@ 2018-09-26  8:54   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26  8:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, acme, tglx, hpa, linux-kernel, adrian.hunter, mingo, ak

Commit-ID:  ff645daf30cafb6fa74bee9a73733700bac2aff7
Gitweb:     https://git.kernel.org/tip/ff645daf30cafb6fa74bee9a73733700bac2aff7
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:44 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 11:10:25 -0300

perf db-export: Add trace begin / end branch type variants

Add branch types to cover different combinations with "trace begin" or
"trace end".

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, prepare the database
export to export branch types with more combinations that include trace
begin / end.  In those cases extend the descriptions to include 'trace
begin' and 'trace end' separately.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/db-export.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 7123746edcf4..69fbb0a72d0c 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -463,6 +463,28 @@ int db_export__branch_types(struct db_export *dbe)
 		if (err)
 			break;
 	}
+
+	/* Add trace begin / end variants */
+	for (i = 0; branch_types[i].name ; i++) {
+		const char *name = branch_types[i].name;
+		u32 type = branch_types[i].branch_type;
+		char buf[64];
+
+		if (type == PERF_IP_FLAG_BRANCH ||
+		    (type & (PERF_IP_FLAG_TRACE_BEGIN | PERF_IP_FLAG_TRACE_END)))
+			continue;
+
+		snprintf(buf, sizeof(buf), "trace begin / %s", name);
+		err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_BEGIN, buf);
+		if (err)
+			break;
+
+		snprintf(buf, sizeof(buf), "%s / trace end", name);
+		err = db_export__branch_type(dbe, type | PERF_IP_FLAG_TRACE_END, buf);
+		if (err)
+			break;
+	}
+
 	return err;
 }
 

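The variant generation in the loop above can be tried in isolation. The sketch below is only an approximation: the flag values and the helpers needs_variants() and variant_names() are made up for the example, and nothing is written to a database.

```c
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

/* Illustrative flag values; perf defines the real ones. */
#define FLAG_BRANCH      (1u << 0)
#define FLAG_CALL        (1u << 1)
#define FLAG_TRACE_BEGIN (1u << 8)
#define FLAG_TRACE_END   (1u << 9)

/*
 * Same skip condition as the patch: a bare branch, or a type that
 * already includes trace begin / end, gets no extra variants.
 */
static int needs_variants(uint32_t type)
{
	return type != FLAG_BRANCH &&
	       !(type & (FLAG_TRACE_BEGIN | FLAG_TRACE_END));
}

/* Build the two descriptions the way the patch's snprintf() calls do. */
static void variant_names(const char *name, char *begin, char *end, size_t len)
{
	snprintf(begin, len, "trace begin / %s", name);
	snprintf(end, len, "%s / trace end", name);
}
```

For a branch type named "call", this yields "trace begin / call" and "call / trace end".
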

* [tip:perf/core] perf tools: Improve thread_stack__event() for trace begin / end
  2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
@ 2018-09-26  8:54   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26  8:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, adrian.hunter, acme, tglx, ak, linux-kernel, hpa, mingo

Commit-ID:  4d60e5e36aa6f11b4d9eadc5d2b94128f24870c7
Gitweb:     https://git.kernel.org/tip/4d60e5e36aa6f11b4d9eadc5d2b94128f24870c7
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:45 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:16:17 -0300

perf tools: Improve thread_stack__event() for trace begin / end

thread_stack__event() is used to create call stacks, by keeping track of
calls and returns. Improve the handling of trace begin / end to allow
for a trace that ends in a call.

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, enhance the thread stack
so that it does not expect to see the 'return' for a 'call' that ends
the trace.

Committer notes:

Added this:

                return thread_stack__push(thread->ts, ret_addr,
-                                         flags && PERF_IP_FLAG_TRACE_END);
+                                         flags & PERF_IP_FLAG_TRACE_END);

To fix problem spotted by:

debian:9:            clang version 3.8.1-24 (tags/RELEASE_381/final)
debian:experimental: clang version 6.0.1-6 (tags/RELEASE_601/final)
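
To see why the one-character change in the committer note matters, here is a small sketch. The flag values and both helper names are illustrative assumptions; only the difference between the two operators is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; perf defines the real ones. */
#define FLAG_CALL      (1u << 1)
#define FLAG_TRACE_END (1u << 9)

/* Logical AND: true whenever both operands are non-zero, whatever the bits. */
static bool test_logical(uint32_t flags, uint32_t mask)
{
	return flags && mask;
}

/* Bitwise AND: true only when a bit from mask is actually set in flags. */
static bool test_bitwise(uint32_t flags, uint32_t mask)
{
	return flags & mask;
}
```

With flags set to FLAG_CALL alone, test_logical(flags, FLAG_TRACE_END) is true while test_bitwise(flags, FLAG_TRACE_END) is false, so the '&&' form would have marked every pushed call as ending the trace.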

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/thread-stack.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index dd17d6a38d3a..e3f7dfecafa9 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -36,6 +36,7 @@
  * @branch_count: the branch count when the entry was created
  * @cp: call path
  * @no_call: a 'call' was not seen
+ * @trace_end: a 'call' but trace ended
  */
 struct thread_stack_entry {
 	u64 ret_addr;
@@ -44,6 +45,7 @@ struct thread_stack_entry {
 	u64 branch_count;
 	struct call_path *cp;
 	bool no_call;
+	bool trace_end;
 };
 
 /**
@@ -112,7 +114,8 @@ static struct thread_stack *thread_stack__new(struct thread *thread,
 	return ts;
 }
 
-static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
+static int thread_stack__push(struct thread_stack *ts, u64 ret_addr,
+			      bool trace_end)
 {
 	int err = 0;
 
@@ -124,6 +127,7 @@ static int thread_stack__push(struct thread_stack *ts, u64 ret_addr)
 		}
 	}
 
+	ts->stack[ts->cnt].trace_end = trace_end;
 	ts->stack[ts->cnt++].ret_addr = ret_addr;
 
 	return err;
@@ -150,6 +154,18 @@ static void thread_stack__pop(struct thread_stack *ts, u64 ret_addr)
 	}
 }
 
+static void thread_stack__pop_trace_end(struct thread_stack *ts)
+{
+	size_t i;
+
+	for (i = ts->cnt; i; ) {
+		if (ts->stack[--i].trace_end)
+			ts->cnt = i;
+		else
+			return;
+	}
+}
+
 static bool thread_stack__in_kernel(struct thread_stack *ts)
 {
 	if (!ts->cnt)
@@ -254,10 +270,19 @@ int thread_stack__event(struct thread *thread, u32 flags, u64 from_ip,
 		ret_addr = from_ip + insn_len;
 		if (ret_addr == to_ip)
 			return 0; /* Zero-length calls are excluded */
-		return thread_stack__push(thread->ts, ret_addr);
-	} else if (flags & PERF_IP_FLAG_RETURN) {
-		if (!from_ip)
-			return 0;
+		return thread_stack__push(thread->ts, ret_addr,
+					  flags & PERF_IP_FLAG_TRACE_END);
+	} else if (flags & PERF_IP_FLAG_TRACE_BEGIN) {
+		/*
+		 * If the caller did not change the trace number (which would
+		 * have flushed the stack) then try to make sense of the stack.
+		 * Possibly, tracing began after returning to the current
+		 * address, so try to pop that. Also, do not expect a call made
+		 * when the trace ended, to return, so pop that.
+		 */
+		thread_stack__pop(thread->ts, to_ip);
+		thread_stack__pop_trace_end(thread->ts);
+	} else if ((flags & PERF_IP_FLAG_RETURN) && from_ip) {
 		thread_stack__pop(thread->ts, to_ip);
 	}
 

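The effect of thread_stack__pop_trace_end() can be checked with a cut-down stack. In the sketch below the struct layouts and the fixed array size are simplifications chosen for the example; only the loop follows the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stack entry: just the fields the loop needs. */
struct entry {
	uint64_t ret_addr;
	bool trace_end;		/* pushed by a call that ended the trace */
};

/* Simplified stack with a fixed capacity (an assumption of the sketch). */
struct stack {
	struct entry e[8];
	size_t cnt;
};

/* Drop every entry at the top of the stack that is marked trace_end. */
static void pop_trace_end(struct stack *s)
{
	size_t i;

	for (i = s->cnt; i; ) {
		if (s->e[--i].trace_end)
			s->cnt = i;
		else
			return;
	}
}
```

Entries are removed from the top until the first one that was not pushed by a trace-ending call, so an ordinary call underneath stays on the stack.
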

* [tip:perf/core] perf tools: Improve thread_stack__process() for trace begin / end
  2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
@ 2018-09-26  8:55   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26  8:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, hpa, jolsa, tglx, ak, linux-kernel, adrian.hunter, acme

Commit-ID:  2dcde4e152a3e319cc7e76c7c6b8548a3c72310d
Gitweb:     https://git.kernel.org/tip/2dcde4e152a3e319cc7e76c7c6b8548a3c72310d
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:46 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:50 -0300

perf tools: Improve thread_stack__process() for trace begin / end

thread_stack__process() is used to create call paths for database
export.  Improve the handling of trace begin / end to allow for a trace
that ends in a call.

Previously, the Intel PT decoder would indicate begin / end by a branch
from / to zero. That hides useful information, in particular when a
trace ends with a call. Before remedying that, enhance the thread stack
so that it identifies the trace end by the flag instead of by ip == 0.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/thread-stack.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index e3f7dfecafa9..c091635bf7dc 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -357,7 +357,7 @@ void call_return_processor__free(struct call_return_processor *crp)
 
 static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
 				 u64 timestamp, u64 ref, struct call_path *cp,
-				 bool no_call)
+				 bool no_call, bool trace_end)
 {
 	struct thread_stack_entry *tse;
 	int err;
@@ -375,6 +375,7 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
 	tse->branch_count = ts->branch_count;
 	tse->cp = cp;
 	tse->no_call = no_call;
+	tse->trace_end = trace_end;
 
 	return 0;
 }
@@ -448,7 +449,7 @@ static int thread_stack__bottom(struct thread *thread, struct thread_stack *ts,
 		return -ENOMEM;
 
 	return thread_stack__push_cp(thread->ts, ip, sample->time, ref, cp,
-				     true);
+				     true, false);
 }
 
 static int thread_stack__no_call_return(struct thread *thread,
@@ -480,7 +481,7 @@ static int thread_stack__no_call_return(struct thread *thread,
 			if (!cp)
 				return -ENOMEM;
 			return thread_stack__push_cp(ts, 0, sample->time, ref,
-						     cp, true);
+						     cp, true, false);
 		}
 	} else if (thread_stack__in_kernel(ts) && sample->ip < ks) {
 		/* Return to userspace, so pop all kernel addresses */
@@ -505,7 +506,7 @@ static int thread_stack__no_call_return(struct thread *thread,
 		return -ENOMEM;
 
 	err = thread_stack__push_cp(ts, sample->addr, sample->time, ref, cp,
-				    true);
+				    true, false);
 	if (err)
 		return err;
 
@@ -525,7 +526,7 @@ static int thread_stack__trace_begin(struct thread *thread,
 
 	/* Pop trace end */
 	tse = &ts->stack[ts->cnt - 1];
-	if (tse->cp->sym == NULL && tse->cp->ip == 0) {
+	if (tse->trace_end) {
 		err = thread_stack__call_return(thread, ts, --ts->cnt,
 						timestamp, ref, false);
 		if (err)
@@ -554,7 +555,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
 	ret_addr = sample->ip + sample->insn_len;
 
 	return thread_stack__push_cp(ts, ret_addr, sample->time, ref, cp,
-				     false);
+				     false, true);
 }
 
 int thread_stack__process(struct thread *thread, struct comm *comm,
@@ -604,6 +605,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 	ts->last_time = sample->time;
 
 	if (sample->flags & PERF_IP_FLAG_CALL) {
+		bool trace_end = sample->flags & PERF_IP_FLAG_TRACE_END;
 		struct call_path_root *cpr = ts->crp->cpr;
 		struct call_path *cp;
 		u64 ret_addr;
@@ -621,7 +623,7 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 		if (!cp)
 			return -ENOMEM;
 		err = thread_stack__push_cp(ts, ret_addr, sample->time, ref,
-					    cp, false);
+					    cp, false, trace_end);
 	} else if (sample->flags & PERF_IP_FLAG_RETURN) {
 		if (!sample->ip || !sample->addr)
 			return 0;

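A rough sketch of why the explicit flag is needed once a trace-ending call keeps its real target. The flattened struct and both predicate names are assumptions for the example; the real code reaches sym and ip through the entry's call path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flattened stand-in for a stack entry and its call path. */
struct entry {
	uint64_t ip;		/* target of the call that created the entry */
	bool has_sym;		/* whether the target resolved to a symbol */
	bool trace_end;		/* set when the call also ended the trace */
};

/* Old heuristic: a trace end looked like a call to address zero. */
static bool is_trace_end_old(const struct entry *e)
{
	return !e->has_sym && e->ip == 0;
}

/* New check: rely on the flag recorded when the entry was pushed. */
static bool is_trace_end_new(const struct entry *e)
{
	return e->trace_end;
}
```

An entry for a call to a resolved, non-zero address that also ended the trace is missed by the old predicate and caught by the new one.
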

* [tip:perf/core] perf intel-pt: Add decoder flags for trace begin / end
  2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
@ 2018-09-26  8:56   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26  8:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, hpa, jolsa, tglx, ak, mingo, adrian.hunter, linux-kernel

Commit-ID:  c6b5da093a8ba740b71dd0052f3846016986fd21
Gitweb:     https://git.kernel.org/tip/c6b5da093a8ba740b71dd0052f3846016986fd21
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:47 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:51 -0300

perf intel-pt: Add decoder flags for trace begin / end

Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. To prepare for remedying that, add Intel PT decoder flags
for trace begin / end and map them to the existing sample flags.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 ++
 tools/perf/util/intel-pt.c                          | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index 51c18d67f4ca..ed088d4726ba 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -37,6 +37,8 @@ enum intel_pt_sample_type {
 	INTEL_PT_EX_STOP	= 1 << 6,
 	INTEL_PT_PWR_EXIT	= 1 << 7,
 	INTEL_PT_CBR_CHG	= 1 << 8,
+	INTEL_PT_TRACE_BEGIN	= 1 << 9,
+	INTEL_PT_TRACE_END	= 1 << 10,
 };
 
 enum intel_pt_period_type {
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index aec68908d604..48c1d415c6b0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -908,6 +908,11 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
 		ptq->insn_len = ptq->state->insn_len;
 		memcpy(ptq->insn, ptq->state->insn, INTEL_PT_INSN_BUF_SZ);
 	}
+
+	if (ptq->state->type & INTEL_PT_TRACE_BEGIN)
+		ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN;
+	if (ptq->state->type & INTEL_PT_TRACE_END)
+		ptq->flags |= PERF_IP_FLAG_TRACE_END;
 }
 
 static int intel_pt_setup_queue(struct intel_pt *pt,

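The mapping added to intel_pt_sample_flags() amounts to translating two bits from one flag space to another. In the sketch below the decoder bit positions are taken from the patch, while the SAMPLE_* values and the helper name are placeholders invented for the example.

```c
#include <stdint.h>

/* Decoder-side bits, as added to enum intel_pt_sample_type in the patch. */
#define INTEL_PT_TRACE_BEGIN (1u << 9)
#define INTEL_PT_TRACE_END   (1u << 10)

/* Placeholder sample-flag values; perf defines the real ones. */
#define SAMPLE_TRACE_BEGIN   (1u << 8)
#define SAMPLE_TRACE_END     (1u << 9)

/* Hypothetical helper: translate decoder state bits into sample flags. */
static uint32_t map_trace_flags(uint32_t type)
{
	uint32_t flags = 0;

	if (type & INTEL_PT_TRACE_BEGIN)
		flags |= SAMPLE_TRACE_BEGIN;
	if (type & INTEL_PT_TRACE_END)
		flags |= SAMPLE_TRACE_END;
	return flags;
}
```

Testing each bit separately, rather than shifting, keeps the two flag spaces independent of each other's bit positions.
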

* [tip:perf/core] perf intel-pt: Implement decoder flags for trace begin / end
  2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
@ 2018-09-26  8:56   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Adrian Hunter @ 2018-09-26  8:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, ak, acme, hpa, mingo, adrian.hunter, linux-kernel, tglx

Commit-ID:  bea6385789b8b5e1e3228a281978ca6c4a8c70a0
Gitweb:     https://git.kernel.org/tip/bea6385789b8b5e1e3228a281978ca6c4a8c70a0
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Thu, 20 Sep 2018 16:00:48 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 20 Sep 2018 15:19:52 -0300

perf intel-pt: Implement decoder flags for trace begin / end

Have the Intel PT decoder implement the new Intel PT decoder flags for
trace begin / end.

Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. That happens when using address filters, for example:

  $ perf record -e intel_pt/cyc,mtc_period=0,noretcomp/u --filter='filter main @ /bin/uname ' uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.031 MB perf.data ]

Before:

  $ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
   7249.622183310:   tr strt         0 [unknown] =>   401590 main+0x0
   7249.622183311:   call       4015b9 main+0x29 =>        0 [unknown]
   7249.622183711:   tr strt         0 [unknown] =>   4015be main+0x2e
   7249.622183714:   call       4015c8 main+0x38 =>        0 [unknown]
   7249.622247731:   tr strt         0 [unknown] =>   4015cd main+0x3d
   7249.622247760:   call       4015d7 main+0x47 =>        0 [unknown]
   7249.622248340:   tr strt         0 [unknown] =>   4015dc main+0x4c
   7249.622248341:   call       4015e1 main+0x51 =>        0 [unknown]
   7249.622248681:   tr strt         0 [unknown] =>   4015e6 main+0x56
   7249.622248682:   call       4015eb main+0x5b =>        0 [unknown]
   7249.622248970:   tr strt         0 [unknown] =>   4015f0 main+0x60
   7249.622248971:   call       401612 main+0x82 =>        0 [unknown]
   7249.622249757:   tr strt         0 [unknown] =>   401617 main+0x87
   7249.622249770:   call       401847 main+0x2b7 =>        0 [unknown]
   7249.622250606:   tr strt         0 [unknown] =>   40184c main+0x2bc
   7249.622250612:   call       4019bf main+0x42f =>        0 [unknown]
   7249.622256823:   tr strt         0 [unknown] =>   4019c4 main+0x434
   7249.622256863:   call       4019f5 main+0x465 =>        0 [unknown]
   7249.622264217:   tr strt         0 [unknown] =>   4019fa main+0x46a
   7249.622264235:   call       401832 main+0x2a2 =>        0 [unknown]

After:

  $ perf script --itrace=cre -Ftime,flags,ip,sym,symoff,addr --ns
   7249.622183310:   tr strt              0 [unknown] =>   401590 main+0x0
   7249.622183311:   tr end  call    4015b9 main+0x29 =>   401ef0 set_program_name+0x0
   7249.622183711:   tr strt              0 [unknown] =>   4015be main+0x2e
   7249.622183714:   tr end  call    4015c8 main+0x38 =>   4014b0 setlocale@plt+0x0
   7249.622247731:   tr strt              0 [unknown] =>   4015cd main+0x3d
   7249.622247760:   tr end  call    4015d7 main+0x47 =>   4012d0 bindtextdomain@plt+0x0
   7249.622248340:   tr strt              0 [unknown] =>   4015dc main+0x4c
   7249.622248341:   tr end  call    4015e1 main+0x51 =>   4012b0 textdomain@plt+0x0
   7249.622248681:   tr strt              0 [unknown] =>   4015e6 main+0x56
   7249.622248682:   tr end  call    4015eb main+0x5b =>   404340 atexit+0x0
   7249.622248970:   tr strt              0 [unknown] =>   4015f0 main+0x60
   7249.622248971:   tr end  call    401612 main+0x82 =>   401320 getopt_long@plt+0x0
   7249.622249757:   tr strt              0 [unknown] =>   401617 main+0x87
   7249.622249770:   tr end  call    401847 main+0x2b7 =>   401360 uname@plt+0x0
   7249.622250606:   tr strt              0 [unknown] =>   40184c main+0x2bc
   7249.622250612:   tr end  call    4019bf main+0x42f =>   401b10 print_element+0x0
   7249.622256823:   tr strt              0 [unknown] =>   4019c4 main+0x434
   7249.622256863:   tr end  call    4019f5 main+0x465 =>   401340 __overflow@plt+0x0
   7249.622264217:   tr strt              0 [unknown] =>   4019fa main+0x46a
   7249.622264235:   tr end  call    401832 main+0x2a2 =>   401520 exit@plt+0x0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 34 +++++++++++++++-------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index d404bed7003a..58f6a9ceb590 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1165,7 +1165,7 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 		decoder->pge = false;
 		decoder->continuous_period = false;
 		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
-		decoder->state.to_ip = 0;
+		decoder->state.type |= INTEL_PT_TRACE_END;
 		return 0;
 	}
 	if (err == INTEL_PT_RETURN)
@@ -1179,9 +1179,13 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 			decoder->continuous_period = false;
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			decoder->state.from_ip = decoder->ip;
-			decoder->state.to_ip = 0;
-			if (decoder->packet.count != 0)
+			if (decoder->packet.count == 0) {
+				decoder->state.to_ip = 0;
+			} else {
+				decoder->state.to_ip = decoder->last_ip;
 				decoder->ip = decoder->last_ip;
+			}
+			decoder->state.type |= INTEL_PT_TRACE_END;
 		} else {
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			decoder->state.from_ip = decoder->ip;
@@ -1208,7 +1212,8 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			decoder->ip = to_ip;
 			decoder->state.from_ip = decoder->ip;
-			decoder->state.to_ip = 0;
+			decoder->state.to_ip = to_ip;
+			decoder->state.type |= INTEL_PT_TRACE_END;
 			return 0;
 		}
 		intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
@@ -1640,14 +1645,15 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGD:
 			decoder->state.from_ip = decoder->ip;
-			decoder->state.to_ip = 0;
-			if (decoder->packet.count != 0) {
+			if (decoder->packet.count == 0) {
+				decoder->state.to_ip = 0;
+			} else {
 				intel_pt_set_ip(decoder);
-				intel_pt_log("Omitting PGD ip " x64_fmt "\n",
-					     decoder->ip);
+				decoder->state.to_ip = decoder->ip;
 			}
 			decoder->pge = false;
 			decoder->continuous_period = false;
+			decoder->state.type |= INTEL_PT_TRACE_END;
 			return 0;
 
 		case INTEL_PT_TIP_PGE:
@@ -1661,6 +1667,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 				intel_pt_set_ip(decoder);
 				decoder->state.to_ip = decoder->ip;
 			}
+			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
 			return 0;
 
 		case INTEL_PT_TIP:
@@ -1739,6 +1746,7 @@ next:
 			intel_pt_set_ip(decoder);
 			decoder->state.from_ip = 0;
 			decoder->state.to_ip = decoder->ip;
+			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
 			return 0;
 		}
 
@@ -2077,9 +2085,13 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 			decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
-			if (decoder->ip)
-				return 0;
-			break;
+			if (!decoder->ip)
+				break;
+			if (decoder->packet.type == INTEL_PT_TIP_PGE)
+				decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			if (decoder->packet.type == INTEL_PT_TIP_PGD)
+				decoder->state.type |= INTEL_PT_TRACE_END;
+			return 0;
 
 		case INTEL_PT_FUP:
 			if (intel_pt_have_ip(decoder))

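The TIP.PGD handling in the hunks above follows one pattern: keep the target address when the packet supplies one, and signal the end of the trace with a flag rather than with to_ip == 0. A minimal sketch of that pattern follows; the struct, the helper on_tip_pgd() and its parameters are invented for the example, and it assumes that a count of zero means the packet carried no IP.

```c
#include <stdint.h>

#define TRACE_END (1u << 10)	/* same bit as INTEL_PT_TRACE_END in patch 5 */

/* Cut-down decoder state: just the fields touched here. */
struct state {
	uint64_t from_ip;
	uint64_t to_ip;
	uint32_t type;
};

/*
 * Hypothetical helper. cur_ip is where the decoder currently is,
 * ip_count stands in for decoder->packet.count, and pkt_ip is the
 * address recovered from the packet when ip_count is non-zero.
 */
static void on_tip_pgd(struct state *st, uint64_t cur_ip,
		       int ip_count, uint64_t pkt_ip)
{
	st->from_ip = cur_ip;
	st->to_ip = ip_count ? pkt_ip : 0;
	st->type |= TRACE_END;
}
```

This is what lets the "After" listing show a line such as main+0x29 => set_program_name+0x0 marked 'tr end  call' instead of a call to address zero.
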

end of thread, other threads:[~2018-09-26  8:56 UTC | newest]

Thread overview: 15+ messages
2018-09-20 13:00 [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 1/6] perf script: Enhance sample flags for trace begin / end Adrian Hunter
2018-09-26  8:53   ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 2/6] perf db-export: Add trace begin / end branch type variants Adrian Hunter
2018-09-26  8:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 3/6] perf tools: Improve thread_stack__event() for trace begin / end Adrian Hunter
2018-09-26  8:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 4/6] perf tools: Improve thread_stack__process() " Adrian Hunter
2018-09-26  8:55   ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 5/6] perf intel-pt: Add decoder flags " Adrian Hunter
2018-09-26  8:56   ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:00 ` [PATCH V2 6/6] perf intel-pt: Implement " Adrian Hunter
2018-09-26  8:56   ` [tip:perf/core] " tip-bot for Adrian Hunter
2018-09-20 13:41 ` [PATCH V2 0/6] perf intel-pt: Improve the data displayed when using address filters Arnaldo Carvalho de Melo
2018-09-20 14:13 ` Arnaldo Carvalho de Melo
