* [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only
@ 2020-07-09 17:36 Adrian Hunter
  2020-07-09 17:36 ` [PATCH 01/11] perf intel-pt: Fix FUP packet state Adrian Hunter
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Hi

Here are some fixes and small improvements for Intel PT.


Adrian Hunter (11):
      perf intel-pt: Fix FUP packet state
      perf intel-pt: Fix duplicate branch after CBR
      perf tools: Improve aux_output not supported error
      perf auxtrace: Add optional error flags to the itrace 'e' option
      perf intel-pt: Use itrace error flags to suppress some errors
      perf auxtrace: Add optional log flags to the itrace 'd' option
      perf intel-pt: Use itrace debug log flags to suppress some messages
      perf intel-pt: Time filter logged perf events
      perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding
      perf intel-pt: Add support for decoding FUP/TIP only
      perf intel-pt: Add support for decoding PSB+ only

 tools/perf/Documentation/itrace.txt                |   9 +
 tools/perf/Documentation/perf-intel-pt.txt         |  25 ++-
 tools/perf/util/auxtrace.c                         |   7 +
 tools/perf/util/auxtrace.h                         |   6 +
 tools/perf/util/evsel.c                            |   4 +
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 214 +++++++++++++++++++--
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
 tools/perf/util/intel-pt.c                         |  51 ++++-
 8 files changed, 287 insertions(+), 30 deletions(-)


Regards
Adrian


* [PATCH 01/11] perf intel-pt: Fix FUP packet state
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:36 ` [PATCH 02/11] perf intel-pt: Fix duplicate branch after CBR Adrian Hunter
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

While walking code towards a FUP ip, the packet state is
INTEL_PT_STATE_FUP or INTEL_PT_STATE_FUP_NO_TIP. That was mishandled
resulting in the state becoming INTEL_PT_STATE_IN_SYNC prematurely.
The result was an occasional lost EXSTOP event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 21 +++++++------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index f8ccfd6be0ee..75c4bd74d521 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1164,6 +1164,7 @@ static int intel_pt_walk_fup(struct intel_pt_decoder *decoder)
 			return 0;
 		if (err == -EAGAIN ||
 		    intel_pt_fup_with_nlip(decoder, &intel_pt_insn, ip, err)) {
+			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			if (intel_pt_fup_event(decoder))
 				return 0;
 			return -EAGAIN;
@@ -1942,17 +1943,13 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			}
 			if (decoder->set_fup_mwait)
 				no_tip = true;
+			if (no_tip)
+				decoder->pkt_state = INTEL_PT_STATE_FUP_NO_TIP;
+			else
+				decoder->pkt_state = INTEL_PT_STATE_FUP;
 			err = intel_pt_walk_fup(decoder);
-			if (err != -EAGAIN) {
-				if (err)
-					return err;
-				if (no_tip)
-					decoder->pkt_state =
-						INTEL_PT_STATE_FUP_NO_TIP;
-				else
-					decoder->pkt_state = INTEL_PT_STATE_FUP;
-				return 0;
-			}
+			if (err != -EAGAIN)
+				return err;
 			if (no_tip) {
 				no_tip = false;
 				break;
@@ -2599,15 +2596,11 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			err = intel_pt_walk_tip(decoder);
 			break;
 		case INTEL_PT_STATE_FUP:
-			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			err = intel_pt_walk_fup(decoder);
 			if (err == -EAGAIN)
 				err = intel_pt_walk_fup_tip(decoder);
-			else if (!err)
-				decoder->pkt_state = INTEL_PT_STATE_FUP;
 			break;
 		case INTEL_PT_STATE_FUP_NO_TIP:
-			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 			err = intel_pt_walk_fup(decoder);
 			if (err == -EAGAIN)
 				err = intel_pt_walk_trace(decoder);
-- 
2.25.1



* [PATCH 02/11] perf intel-pt: Fix duplicate branch after CBR
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
  2020-07-09 17:36 ` [PATCH 01/11] perf intel-pt: Fix FUP packet state Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:36 ` [PATCH 03/11] perf tools: Improve aux_output not supported error Adrian Hunter
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

CBR events can result in a duplicate branch event, because the state type
defaults to a branch. Fix by clearing the state type.

Example: trace 'sleep' and hope for a frequency change

 Before:

   $ perf record -e intel_pt//u sleep 0.1
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.034 MB perf.data ]
   $ perf script --itrace=bpe > before.txt

 After:

   $ perf script --itrace=bpe > after.txt
   $ diff -u before.txt after.txt
   --- before.txt  2020-07-07 14:42:18.191508098 +0300
   +++ after.txt   2020-07-07 14:42:36.587891753 +0300
   @@ -29673,7 +29673,6 @@
               sleep 93431 [007] 15411.619905:          1  branches:u:                 0 [unknown] ([unknown]) =>     7f0818abb2e0 clock_nanosleep@@GLIBC_2.17+0x0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.619905:          1  branches:u:      7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>                0 [unknown] ([unknown])
               sleep 93431 [007] 15411.720069:         cbr:  cbr: 15 freq: 1507 MHz ( 56%)         7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
   -           sleep 93431 [007] 15411.720069:          1  branches:u:      7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>                0 [unknown] ([unknown])
               sleep 93431 [007] 15411.720076:          1  branches:u:                 0 [unknown] ([unknown]) =>     7f0818abb30e clock_nanosleep@@GLIBC_2.17+0x2e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.720077:          1  branches:u:      7f0818abb323 clock_nanosleep@@GLIBC_2.17+0x43 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>     7f0818ac0eb7 __nanosleep+0x17 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
               sleep 93431 [007] 15411.720077:          1  branches:u:      7f0818ac0ebf __nanosleep+0x1f (/usr/lib/x86_64-linux-gnu/libc-2.31.so) =>     55cb7e4c2827 rpl_nanosleep+0x97 (/usr/bin/sleep)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 91de8684f1cff ("perf intel-pt: Cater for CBR change in PSB+")
Fixes: abe5a1d3e4bee ("perf intel-pt: Decoder to output CBR changes immediately")
Cc: stable@vger.kernel.org
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 75c4bd74d521..7ffcbd6fcd1a 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1977,8 +1977,10 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			 * possibility of another CBR change that gets caught up
 			 * in the PSB+.
 			 */
-			if (decoder->cbr != decoder->cbr_seen)
+			if (decoder->cbr != decoder->cbr_seen) {
+				decoder->state.type = 0;
 				return 0;
+			}
 			break;
 
 		case INTEL_PT_PIP:
@@ -2019,8 +2021,10 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_CBR:
 			intel_pt_calc_cbr(decoder);
-			if (decoder->cbr != decoder->cbr_seen)
+			if (decoder->cbr != decoder->cbr_seen) {
+				decoder->state.type = 0;
 				return 0;
+			}
 			break;
 
 		case INTEL_PT_MODE_EXEC:
-- 
2.25.1



* [PATCH 03/11] perf tools: Improve aux_output not supported error
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
  2020-07-09 17:36 ` [PATCH 01/11] perf intel-pt: Fix FUP packet state Adrian Hunter
  2020-07-09 17:36 ` [PATCH 02/11] perf intel-pt: Fix duplicate branch after CBR Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:36 ` [PATCH 04/11] perf auxtrace: Add optional error flags to the itrace 'e' option Adrian Hunter
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

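Report a more specific error when opening an event fails with EOPNOTSUPP
and the event has 'aux_output' set.

For example: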
 Before:
   $ perf record -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -- ls -l
   Error:
   branch-loads: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
 After:
   $ perf record -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -- ls -l
   Error:
   branch-loads: PMU Hardware doesn't support 'aux_output' feature

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/evsel.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 9aa51a65593d..9c5c72094112 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2533,6 +2533,10 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
 	 "No such device - did you specify an out-of-range profile CPU?");
 		break;
 	case EOPNOTSUPP:
+		if (evsel->core.attr.aux_output)
+			return scnprintf(msg, size,
+	"%s: PMU Hardware doesn't support 'aux_output' feature",
+					 evsel__name(evsel));
 		if (evsel->core.attr.sample_period != 0)
 			return scnprintf(msg, size,
 	"%s: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'",
-- 
2.25.1



* [PATCH 04/11] perf auxtrace: Add optional error flags to the itrace 'e' option
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (2 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 03/11] perf tools: Improve aux_output not supported error Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:36 ` [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors Adrian Hunter
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Allow the 'e' option to be followed by an architecture-specific number
which flags what kind of errors will or will not be reported.
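
As a rough sketch of the parsing (a stand-alone toy, not the perf code;
struct demo_opts and demo_parse() are made-up names), an option letter can
consume an optional trailing number with strtoul():

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for struct itrace_synth_opts */
struct demo_opts {
	bool errors;
	unsigned int error_flags;
	bool branches;
};

/*
 * Parse a string such as "be3": letters select options, and 'e' may be
 * followed by a number. Returns 0 on success, -1 on an unknown letter.
 */
static int demo_parse(const char *str, struct demo_opts *opts)
{
	const char *p = str;
	char *endptr;

	while (*p) {
		switch (*p++) {
		case 'b':
			opts->branches = true;
			break;
		case 'e':
			opts->errors = true;
			/* With no digits, strtoul() returns 0 and sets endptr to p */
			opts->error_flags = strtoul(p, &endptr, 0);
			p = endptr;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because strtoul() reports where it stopped, plain "e" still parses and
simply leaves the flags at 0.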

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/itrace.txt | 3 +++
 tools/perf/util/auxtrace.c          | 2 ++
 tools/perf/util/auxtrace.h          | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index e817179c5027..34864b4047ed 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -47,3 +47,6 @@
 	--itrace=i0nss1000000
 
 	skips the first million instructions.
+
+	The 'e' option may be followed by an architecture-specific number which
+	flags what kind of errors will or will not be reported.
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 25c639ac4ad4..5cfc0b12b2b3 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1436,6 +1436,8 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 			break;
 		case 'e':
 			synth_opts->errors = true;
+			synth_opts->error_flags = strtoul(p, &endptr, 0);
+			p = endptr;
 			break;
 		case 'd':
 			synth_opts->log = true;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 142ccf7d34df..a04475f41f28 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -91,6 +91,7 @@ enum itrace_period_type {
  * @cpu_bitmap: CPUs for which to synthesize events, or NULL for all
  * @ptime_range: time intervals to trace or NULL
  * @range_num: number of time intervals to trace
+ * @error_flags: arch-specific flags to affect what errors are reported
  */
 struct itrace_synth_opts {
 	bool			set;
@@ -124,6 +125,7 @@ struct itrace_synth_opts {
 	unsigned long		*cpu_bitmap;
 	struct perf_time_interval *ptime_range;
 	int			range_num;
+	unsigned int		error_flags;
 };
 
 /**
-- 
2.25.1



* [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (3 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 04/11] perf auxtrace: Add optional error flags to the itrace 'e' option Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:50   ` Andi Kleen
  2020-07-09 17:36 ` [PATCH 06/11] perf auxtrace: Add optional log flags to the itrace 'd' option Adrian Hunter
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

The itrace "e" option may be followed by a number which has the
following effect for Intel PT:
	1	Suppress overflow events
	2	Suppress trace data lost events
The values may be combined by bitwise OR'ing them.

Suppressing those errors can be useful for testing and debugging
because they do not originate in the decoder.
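
A minimal sketch of how the two flag values combine (made-up demo_* names
and error codes, assuming only the values listed above):

```c
#include <stdbool.h>

/* Flag values as listed above */
#define DEMO_ERR_SUPPRESS_OVF	1
#define DEMO_ERR_SUPPRESS_LOST	2

/* Hypothetical error codes standing in for the decoder's own */
enum demo_err { DEMO_ERR_OVR, DEMO_ERR_LOST, DEMO_ERR_BADPKT };

/* Return true if an error with this code should still be reported */
static bool demo_report_error(enum demo_err code, unsigned int error_flags)
{
	if (code == DEMO_ERR_OVR && (error_flags & DEMO_ERR_SUPPRESS_OVF))
		return false;
	if (code == DEMO_ERR_LOST && (error_flags & DEMO_ERR_SUPPRESS_LOST))
		return false;
	/* Any other error is unaffected by the flags */
	return true;
}
```

With the flags OR'ed together (1 | 2 = 3, i.e. "e3"), both overflow and
trace data lost errors are dropped while every other error is still
reported.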

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-intel-pt.txt |  7 ++++++-
 tools/perf/util/intel-pt.c                 | 12 ++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index f4cd49a7fcdb..0fcd8ad897b0 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -871,7 +871,11 @@ Developer Manuals.
 
 Error events show where the decoder lost the trace.  Error events
 are quite important.  Users must know if what they are seeing is a complete
-picture or not.
+picture or not. The "e" option may be followed by a number which has the
+following effect:
+	1	Suppress overflow events
+	2	Suppress trace data lost events
+The values may be combined by bitwise OR'ing them.
 
 The "d" option will cause the creation of a file "intel_pt.log" containing all
 decoded packets and instructions.  Note that this option slows down the decoder
@@ -956,6 +960,7 @@ at the beginning. This is useful to ignore initialization code.
 
 skips the first million instructions.
 
+
 dump option
 ~~~~~~~~~~~
 
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 8c441b815d73..a8e8e8acbcc8 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -46,6 +46,9 @@
 
 #define MAX_TIMESTAMP (~0ULL)
 
+#define INTEL_PT_ERR_SUPPRESS_OVF	1
+#define INTEL_PT_ERR_SUPPRESS_LOST	2
+
 struct range {
 	u64 start;
 	u64 end;
@@ -1863,6 +1866,15 @@ static int intel_pt_synth_error(struct intel_pt *pt, int code, int cpu,
 	char msg[MAX_AUXTRACE_ERROR_MSG];
 	int err;
 
+	if (pt->synth_opts.error_flags) {
+		if (code == INTEL_PT_ERR_OVR &&
+		    pt->synth_opts.error_flags & INTEL_PT_ERR_SUPPRESS_OVF)
+			return 0;
+		if (code == INTEL_PT_ERR_LOST &&
+		    pt->synth_opts.error_flags & INTEL_PT_ERR_SUPPRESS_LOST)
+			return 0;
+	}
+
 	intel_pt__strerror(code, msg, MAX_AUXTRACE_ERROR_MSG);
 
 	auxtrace_synth_error(&event.auxtrace_error, PERF_AUXTRACE_ERROR_ITRACE,
-- 
2.25.1



* [PATCH 06/11] perf auxtrace: Add optional log flags to the itrace 'd' option
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (4 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:51   ` Andi Kleen
  2020-07-09 17:36 ` [PATCH 07/11] perf intel-pt: Use itrace debug log flags to suppress some messages Adrian Hunter
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Allow the 'd' option to be followed by an architecture-specific number
which flags what kind of debug messages will or will not be logged.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/itrace.txt | 3 +++
 tools/perf/util/auxtrace.c          | 2 ++
 tools/perf/util/auxtrace.h          | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 34864b4047ed..3dd8fddb8b1b 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -50,3 +50,6 @@
 
 	The 'e' option may be followed by an architecture-specific number which
 	flags what kind of errors will or will not be reported.
+
+	If supported, The 'd' option may be followed by an architecture-specific
+	number which flags what kind of debug messages will or will not be logged.
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 5cfc0b12b2b3..3f806c2881c9 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1441,6 +1441,8 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 			break;
 		case 'd':
 			synth_opts->log = true;
+			synth_opts->log_flags = strtoul(p, &endptr, 0);
+			p = endptr;
 			break;
 		case 'c':
 			synth_opts->branches = true;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index a04475f41f28..f41dbdc98175 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -92,6 +92,7 @@ enum itrace_period_type {
  * @ptime_range: time intervals to trace or NULL
  * @range_num: number of time intervals to trace
  * @error_flags: arch-specific flags to affect what errors are reported
+ * @log_flags: arch-specific flags to affect what is logged
  */
 struct itrace_synth_opts {
 	bool			set;
@@ -126,6 +127,7 @@ struct itrace_synth_opts {
 	struct perf_time_interval *ptime_range;
 	int			range_num;
 	unsigned int		error_flags;
+	unsigned int		log_flags;
 };
 
 /**
-- 
2.25.1



* [PATCH 07/11] perf intel-pt: Use itrace debug log flags to suppress some messages
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (5 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 06/11] perf auxtrace: Add optional log flags to the itrace 'd' option Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:36 ` [PATCH 08/11] perf intel-pt: Time filter logged perf events Adrian Hunter
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

The "d" option may be followed by a number which has the following effect:
	1	Suppress logging of perf events

Suppressing perf events is useful for decreasing the size of the log.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-intel-pt.txt |  4 +++-
 tools/perf/util/intel-pt.c                 | 19 ++++++++++++-------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 0fcd8ad897b0..85a2ff804900 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -879,7 +879,9 @@ The values may be combined by bitwise OR'ing them.
 
 The "d" option will cause the creation of a file "intel_pt.log" containing all
 decoded packets and instructions.  Note that this option slows down the decoder
-and that the resulting file may be very large.
+and that the resulting file may be very large.  The "d" option may be followed
+by a number which has the following effect:
+	1	Suppress logging of perf events
 
 In addition, the period of the "instructions" event can be specified. e.g.
 
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index a8e8e8acbcc8..d90375659244 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -49,6 +49,8 @@
 #define INTEL_PT_ERR_SUPPRESS_OVF	1
 #define INTEL_PT_ERR_SUPPRESS_LOST	2
 
+#define INTEL_PT_LOG_SUPPRESS_EV	1
+
 struct range {
 	u64 start;
 	u64 end;
@@ -252,6 +254,11 @@ static void intel_pt_dump_sample(struct perf_session *session,
 	intel_pt_dump(pt, sample->aux_sample.data, sample->aux_sample.size);
 }
 
+static bool intel_pt_log_events(struct intel_pt *pt)
+{
+	return !(pt->synth_opts.log_flags & INTEL_PT_LOG_SUPPRESS_EV);
+}
+
 static int intel_pt_do_fix_overlap(struct intel_pt *pt, struct auxtrace_buffer *a,
 				   struct auxtrace_buffer *b)
 {
@@ -2589,10 +2596,6 @@ static int intel_pt_context_switch(struct intel_pt *pt, union perf_event *event,
 		return -EINVAL;
 	}
 
-	intel_pt_log("context_switch: cpu %d pid %d tid %d time %"PRIu64" tsc %#"PRIx64"\n",
-		     cpu, pid, tid, sample->time, perf_time_to_tsc(sample->time,
-		     &pt->tc));
-
 	ret = intel_pt_sync_switch(pt, cpu, tid, sample->time);
 	if (ret <= 0)
 		return ret;
@@ -2749,9 +2752,11 @@ static int intel_pt_process_event(struct perf_session *session,
 	if (!err && event->header.type == PERF_RECORD_TEXT_POKE)
 		err = intel_pt_text_poke(pt, event);
 
-	intel_pt_log("event %u: cpu %d time %"PRIu64" tsc %#"PRIx64" ",
-		     event->header.type, sample->cpu, sample->time, timestamp);
-	intel_pt_log_event(event);
+	if (intel_pt_enable_logging && intel_pt_log_events(pt)) {
+		intel_pt_log("event %u: cpu %d time %"PRIu64" tsc %#"PRIx64" ",
+			     event->header.type, sample->cpu, sample->time, timestamp);
+		intel_pt_log_event(event);
+	}
 
 	return err;
 }
-- 
2.25.1



* [PATCH 08/11] perf intel-pt: Time filter logged perf events
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (6 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 07/11] perf intel-pt: Use itrace debug log flags to suppress some messages Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:36 ` [PATCH 09/11] perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding Adrian Hunter
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

When debug logging is used with the --time option, filter logged perf
events by the specified time ranges. Using "d2" instead of plain "d"
overrides that and logs all perf events.

The new default filtering can greatly reduce the size of the log file.
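
The decision can be sketched roughly as below. This is a toy with made-up
demo_* names; struct demo_interval and demo_in_ranges() are stand-ins, not
the perf time-range helpers, and the closed-interval test is an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LOG_SUPPRESS_EV	1
#define DEMO_LOG_ALL_EV		2

/* Hypothetical closed time interval */
struct demo_interval {
	uint64_t start;
	uint64_t end;
};

/* Return true if tm falls inside any of the n intervals */
static bool demo_in_ranges(const struct demo_interval *r, int n, uint64_t tm)
{
	for (int i = 0; i < n; i++) {
		if (tm >= r[i].start && tm <= r[i].end)
			return true;
	}
	return false;
}

/* Decide whether a perf event with timestamp tm should be logged */
static bool demo_log_event(unsigned int log_flags,
			   const struct demo_interval *r, int n, uint64_t tm)
{
	if (log_flags & DEMO_LOG_ALL_EV)
		return true;	/* "d2": ignore the time ranges */
	if (log_flags & DEMO_LOG_SUPPRESS_EV)
		return false;	/* "d1": log no perf events */
	/* No ranges given means no filtering */
	return !n || demo_in_ranges(r, n, tm);
}
```

Checking the "log all" flag first means it wins if both flags are set.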

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-intel-pt.txt |  3 +++
 tools/perf/util/intel-pt.c                 | 20 +++++++++++++++++---
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 85a2ff804900..9a90e2db4e4a 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -882,6 +882,9 @@ decoded packets and instructions.  Note that this option slows down the decoder
 and that the resulting file may be very large.  The "d" option may be followed
 by a number which has the following effect:
 	1	Suppress logging of perf events
+	2	Log all perf events
+By default, logged perf events are filtered by any specified time ranges, but
+value 2 overrides that.
 
 In addition, the period of the "instructions" event can be specified. e.g.
 
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index d90375659244..597120dd6b77 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -50,6 +50,7 @@
 #define INTEL_PT_ERR_SUPPRESS_LOST	2
 
 #define INTEL_PT_LOG_SUPPRESS_EV	1
+#define INTEL_PT_LOG_ALL_EV		2
 
 struct range {
 	u64 start;
@@ -254,9 +255,22 @@ static void intel_pt_dump_sample(struct perf_session *session,
 	intel_pt_dump(pt, sample->aux_sample.data, sample->aux_sample.size);
 }
 
-static bool intel_pt_log_events(struct intel_pt *pt)
+static bool intel_pt_log_events(struct intel_pt *pt, u64 tm)
 {
-	return !(pt->synth_opts.log_flags & INTEL_PT_LOG_SUPPRESS_EV);
+	struct perf_time_interval *range = pt->synth_opts.ptime_range;
+	int n = pt->synth_opts.range_num;
+
+	if (pt->synth_opts.log_flags & INTEL_PT_LOG_ALL_EV)
+		return true;
+
+	if (pt->synth_opts.log_flags & INTEL_PT_LOG_SUPPRESS_EV)
+		return false;
+
+	/* perf_time__ranges_skip_sample does not work if time is zero */
+	if (!tm)
+		tm = 1;
+
+	return !n || !perf_time__ranges_skip_sample(range, n, tm);
 }
 
 static int intel_pt_do_fix_overlap(struct intel_pt *pt, struct auxtrace_buffer *a,
@@ -2752,7 +2766,7 @@ static int intel_pt_process_event(struct perf_session *session,
 	if (!err && event->header.type == PERF_RECORD_TEXT_POKE)
 		err = intel_pt_text_poke(pt, event);
 
-	if (intel_pt_enable_logging && intel_pt_log_events(pt)) {
+	if (intel_pt_enable_logging && intel_pt_log_events(pt, sample->time)) {
 		intel_pt_log("event %u: cpu %d time %"PRIu64" tsc %#"PRIx64" ",
 			     event->header.type, sample->cpu, sample->time, timestamp);
 		intel_pt_log_event(event);
-- 
2.25.1



* [PATCH 09/11] perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (7 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 08/11] perf intel-pt: Time filter logged perf events Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:52   ` Andi Kleen
  2020-07-09 17:36 ` [PATCH 10/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
  2020-07-09 17:36 ` [PATCH 11/11] perf intel-pt: Add support for decoding PSB+ only Adrian Hunter
  10 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

The 'q' option is for modes of decoding that are quicker because they
skip or omit decoding some aspects of trace data.

If supported, the 'q' option may be repeated to increase the effect.
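
For illustration only, repeating the option amounts to counting it. The
helper below is a made-up, simplified stand-in that scans the whole string
and ignores every other option letter:

```c
/* Count how many times 'q' appears in an itrace option string */
static unsigned int demo_quick_level(const char *str)
{
	unsigned int quick = 0;

	for (const char *p = str; *p; p++) {
		if (*p == 'q')
			quick += 1;
	}
	return quick;
}
```

A decoder can then compare the count against a threshold to pick how much
detail to skip.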

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/itrace.txt | 3 +++
 tools/perf/util/auxtrace.c          | 3 +++
 tools/perf/util/auxtrace.h          | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 3dd8fddb8b1b..d4ffa11b9d50 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -18,6 +18,7 @@
 		l	synthesize last branch entries (use with i or x)
 		L	synthesize last branch entries on existing event records
 		s       skip initial number of events
+		q	quicker (less detailed) decoding
 
 	The default is all events i.e. the same as --itrace=ibxwpe,
 	except for perf script where it is --itrace=ce
@@ -53,3 +54,5 @@
 
 	If supported, The 'd' option may be followed by an architecture-specific
 	number which flags what kind of debug messages will or will not be logged.
+
+	If supported, the 'q' option may be repeated to increase the effect.
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 3f806c2881c9..81726c014237 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1511,6 +1511,9 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 		case 'a':
 			synth_opts->remote_access = true;
 			break;
+		case 'q':
+			synth_opts->quick += 1;
+			break;
 		case ' ':
 		case ',':
 			break;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index f41dbdc98175..d3b5520fa992 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -93,6 +93,7 @@ enum itrace_period_type {
  * @range_num: number of time intervals to trace
  * @error_flags: arch-specific flags to affect what errors are reported
  * @log_flags: arch-specific flags to affect what is logged
+ * @quick: quicker (less detailed) decoding
  */
 struct itrace_synth_opts {
 	bool			set;
@@ -128,6 +129,7 @@ struct itrace_synth_opts {
 	int			range_num;
 	unsigned int		error_flags;
 	unsigned int		log_flags;
+	unsigned int		quick;
 };
 
 /**
-- 
2.25.1



* [PATCH 10/11] perf intel-pt: Add support for decoding FUP/TIP only
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (8 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 09/11] perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:55   ` Andi Kleen
  2020-07-09 17:36 ` [PATCH 11/11] perf intel-pt: Add support for decoding PSB+ only Adrian Hunter
  10 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

Use the new itrace 'q' option to add support for a mode of decoding that
ignores TNT, does not walk object code, but gets the ip from FUP and TIP
packets.
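
To illustrate the idea only (a toy packet model with made-up names, nothing
like the real decoder state machine or packet layout), decoding in this mode
amounts to picking the ip out of FUP and TIP packets and skipping the rest:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified packet model */
enum demo_pkt_type { DEMO_TNT, DEMO_TIP, DEMO_FUP, DEMO_OTHER };

struct demo_pkt {
	enum demo_pkt_type type;
	uint64_t ip;	/* only meaningful for TIP and FUP */
};

/*
 * Copy the ip of each FUP/TIP packet to out[], skipping everything else.
 * Returns the number of ips written (at most max).
 */
static size_t demo_hop(const struct demo_pkt *pkts, size_t n,
		       uint64_t *out, size_t max)
{
	size_t cnt = 0;

	for (size_t i = 0; i < n && cnt < max; i++) {
		switch (pkts[i].type) {
		case DEMO_TIP:
		case DEMO_FUP:
			out[cnt++] = pkts[i].ip;
			break;
		default:
			/* TNT is ignored: using it would need an object code walk */
			break;
		}
	}
	return cnt;
}
```

No object code is read at all in this sketch, which is where the speed-up
in the example below comes from, at the cost of losing every branch that
is only described by TNT.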

Example:

 $ perf record -e intel_pt//u grep -rI pudding drivers
 [ perf record: Woken up 52 times to write data ]
 [ perf record: Captured and wrote 57.870 MB perf.data ]
 $ time perf script --itrace=bi | wc -l
 58948289

 real    1m23.863s
 user    1m23.251s
 sys     0m7.452s
 $ time perf script --itrace=biq | wc -l
 3385694

 real    0m4.453s
 user    0m4.455s
 sys     0m0.328s

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-intel-pt.txt    |   8 +
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 167 +++++++++++++++++-
 .../util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
 tools/perf/util/intel-pt.c                    |   6 +-
 4 files changed, 177 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 9a90e2db4e4a..758295a7e3d6 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -825,6 +825,7 @@ The letters are:
 	l	synthesize last branch entries (use with i or x)
 	L	synthesize last branch entries on existing event records
 	s	skip initial number of events
+	q	quicker (less detailed) decoding
 
 "Instructions" events look like they were recorded by "perf record -e
 instructions".
@@ -965,6 +966,13 @@ at the beginning. This is useful to ignore initialization code.
 
 skips the first million instructions.
 
+The q option does not decode TNT packets, and does not walk object code, but
+gets the ip from FUP and TIP packets.  The q option can be used with the b and i
+options but the period is not used.  The q option decodes more quickly, but is
+useful only if the control flow of interest is represented or indicated by FUP,
+TIP, TIP.PGE, or TIP.PGD packets.  However the q option could be used to find
+time ranges that could then be decoded fully using the --time option.
+
 
 dump option
 ~~~~~~~~~~~
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 7ffcbd6fcd1a..ccb204b1a050 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -55,6 +55,7 @@ enum intel_pt_pkt_state {
 	INTEL_PT_STATE_TIP_PGD,
 	INTEL_PT_STATE_FUP,
 	INTEL_PT_STATE_FUP_NO_TIP,
+	INTEL_PT_STATE_RESAMPLE,
 };
 
 static inline bool intel_pt_sample_time(enum intel_pt_pkt_state pkt_state)
@@ -65,6 +66,7 @@ static inline bool intel_pt_sample_time(enum intel_pt_pkt_state pkt_state)
 	case INTEL_PT_STATE_ERR_RESYNC:
 	case INTEL_PT_STATE_IN_SYNC:
 	case INTEL_PT_STATE_TNT_CONT:
+	case INTEL_PT_STATE_RESAMPLE:
 		return true;
 	case INTEL_PT_STATE_TNT:
 	case INTEL_PT_STATE_TIP:
@@ -109,6 +111,8 @@ struct intel_pt_decoder {
 	bool fixup_last_mtc;
 	bool have_last_ip;
 	bool in_psb;
+	bool hop;
+	bool hop_psb_fup;
 	enum intel_pt_param_flags flags;
 	uint64_t pos;
 	uint64_t last_ip;
@@ -235,6 +239,7 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
 	decoder->data               = params->data;
 	decoder->return_compression = params->return_compression;
 	decoder->branch_enable      = params->branch_enable;
+	decoder->hop                = params->quick >= 1;
 
 	decoder->flags              = params->flags;
 
@@ -275,6 +280,9 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
 	intel_pt_log("timestamp: tsc_ctc_mult %u\n", decoder->tsc_ctc_mult);
 	intel_pt_log("timestamp: tsc_slip %#x\n", decoder->tsc_slip);
 
+	if (decoder->hop)
+		intel_pt_log("Hop mode: decoding FUP and TIPs, but not TNT\n");
+
 	return decoder;
 }
 
@@ -1730,8 +1738,14 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_FUP:
 			decoder->pge = true;
-			if (decoder->packet.count)
+			if (decoder->packet.count) {
 				intel_pt_set_last_ip(decoder);
+				if (decoder->hop) {
+					/* Act on FUP at PSBEND */
+					decoder->ip = decoder->last_ip;
+					decoder->hop_psb_fup = true;
+				}
+			}
 			break;
 
 		case INTEL_PT_MODE_TSX:
@@ -1875,6 +1889,118 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 	}
 }
 
+static int intel_pt_resample(struct intel_pt_decoder *decoder)
+{
+	decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
+	decoder->state.type = INTEL_PT_INSTRUCTION;
+	decoder->state.from_ip = decoder->ip;
+	decoder->state.to_ip = 0;
+	return 0;
+}
+
+#define HOP_PROCESS	0
+#define HOP_IGNORE	1
+#define HOP_RETURN	2
+#define HOP_AGAIN	3
+
+/* Hop mode: Ignore TNT, do not walk code, but get ip from FUPs and TIPs */
+static int intel_pt_hop_trace(struct intel_pt_decoder *decoder, bool *no_tip, int *err)
+{
+	switch (decoder->packet.type) {
+	case INTEL_PT_TNT:
+		return HOP_IGNORE;
+
+	case INTEL_PT_TIP_PGD:
+		if (!decoder->packet.count)
+			return HOP_IGNORE;
+		intel_pt_set_ip(decoder);
+		decoder->state.type |= INTEL_PT_TRACE_END;
+		decoder->state.from_ip = 0;
+		decoder->state.to_ip = decoder->ip;
+		return HOP_RETURN;
+
+	case INTEL_PT_TIP:
+		if (!decoder->packet.count)
+			return HOP_IGNORE;
+		intel_pt_set_ip(decoder);
+		decoder->state.type = INTEL_PT_INSTRUCTION;
+		decoder->state.from_ip = decoder->ip;
+		decoder->state.to_ip = 0;
+		return HOP_RETURN;
+
+	case INTEL_PT_FUP:
+		if (!decoder->packet.count)
+			return HOP_IGNORE;
+		intel_pt_set_ip(decoder);
+		if (intel_pt_fup_event(decoder))
+			return HOP_RETURN;
+		if (!decoder->branch_enable)
+			*no_tip = true;
+		if (*no_tip) {
+			decoder->state.type = INTEL_PT_INSTRUCTION;
+			decoder->state.from_ip = decoder->ip;
+			decoder->state.to_ip = 0;
+			return HOP_RETURN;
+		}
+		*err = intel_pt_walk_fup_tip(decoder);
+		if (!*err)
+			decoder->pkt_state = INTEL_PT_STATE_RESAMPLE;
+		return HOP_RETURN;
+
+	case INTEL_PT_PSB:
+		decoder->last_ip = 0;
+		decoder->have_last_ip = true;
+		decoder->hop_psb_fup = false;
+		*err = intel_pt_walk_psbend(decoder);
+		if (*err == -EAGAIN)
+			return HOP_AGAIN;
+		if (*err)
+			return HOP_RETURN;
+		if (decoder->hop_psb_fup) {
+			decoder->hop_psb_fup = false;
+			decoder->state.type = INTEL_PT_INSTRUCTION;
+			decoder->state.from_ip = decoder->ip;
+			decoder->state.to_ip = 0;
+			return HOP_RETURN;
+		}
+		if (decoder->cbr != decoder->cbr_seen) {
+			decoder->state.type = 0;
+			return HOP_RETURN;
+		}
+		return HOP_IGNORE;
+
+	case INTEL_PT_BAD:
+	case INTEL_PT_PAD:
+	case INTEL_PT_TIP_PGE:
+	case INTEL_PT_TSC:
+	case INTEL_PT_TMA:
+	case INTEL_PT_MODE_EXEC:
+	case INTEL_PT_MODE_TSX:
+	case INTEL_PT_MTC:
+	case INTEL_PT_CYC:
+	case INTEL_PT_VMCS:
+	case INTEL_PT_PSBEND:
+	case INTEL_PT_CBR:
+	case INTEL_PT_TRACESTOP:
+	case INTEL_PT_PIP:
+	case INTEL_PT_OVF:
+	case INTEL_PT_MNT:
+	case INTEL_PT_PTWRITE:
+	case INTEL_PT_PTWRITE_IP:
+	case INTEL_PT_EXSTOP:
+	case INTEL_PT_EXSTOP_IP:
+	case INTEL_PT_MWAIT:
+	case INTEL_PT_PWRE:
+	case INTEL_PT_PWRX:
+	case INTEL_PT_BBP:
+	case INTEL_PT_BIP:
+	case INTEL_PT_BEP:
+	case INTEL_PT_BEP_IP:
+	default:
+		return HOP_PROCESS;
+	}
+}
+
 static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 {
 	bool no_tip = false;
@@ -1885,6 +2011,19 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 		if (err)
 			return err;
 next:
+		if (decoder->hop) {
+			switch (intel_pt_hop_trace(decoder, &no_tip, &err)) {
+			case HOP_IGNORE:
+				continue;
+			case HOP_RETURN:
+				return err;
+			case HOP_AGAIN:
+				goto next;
+			default:
+				break;
+			}
+		}
+
 		switch (decoder->packet.type) {
 		case INTEL_PT_TNT:
 			if (!decoder->packet.count)
@@ -1914,6 +2053,12 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 			decoder->state.from_ip = 0;
 			decoder->state.to_ip = decoder->ip;
 			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			/*
+			 * In hop mode, resample to get the to_ip as an
+			 * "instruction" sample.
+			 */
+			if (decoder->hop)
+				decoder->pkt_state = INTEL_PT_STATE_RESAMPLE;
 			return 0;
 		}
 
@@ -2033,7 +2178,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_MODE_TSX:
 			/* MODE_TSX need not be followed by FUP */
-			if (!decoder->pge) {
+			if (!decoder->pge || decoder->in_psb) {
 				intel_pt_update_in_tx(decoder);
 				break;
 			}
@@ -2424,7 +2569,11 @@ static int intel_pt_sync_ip(struct intel_pt_decoder *decoder)
 	if (err)
 		return err;
 
-	decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
+	/* In hop mode, resample to get the to_ip as an "instruction" sample */
+	if (decoder->hop)
+		decoder->pkt_state = INTEL_PT_STATE_RESAMPLE;
+	else
+		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 	decoder->overflow = false;
 
 	decoder->state.from_ip = 0;
@@ -2545,7 +2694,14 @@ static int intel_pt_sync(struct intel_pt_decoder *decoder)
 
 	if (decoder->ip) {
 		decoder->state.type = 0; /* Do not have a sample */
-		decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
+		/*
+		 * In hop mode, resample to get the PSB FUP ip as an
+		 * "instruction" sample.
+		 */
+		if (decoder->hop)
+			decoder->pkt_state = INTEL_PT_STATE_RESAMPLE;
+		else
+			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
 	} else {
 		return intel_pt_sync_ip(decoder);
 	}
@@ -2609,6 +2765,9 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			if (err == -EAGAIN)
 				err = intel_pt_walk_trace(decoder);
 			break;
+		case INTEL_PT_STATE_RESAMPLE:
+			err = intel_pt_resample(decoder);
+			break;
 		default:
 			err = intel_pt_bug(decoder);
 			break;
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index e289e463d635..8645fc265481 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -250,6 +250,7 @@ struct intel_pt_params {
 	uint32_t tsc_ctc_ratio_n;
 	uint32_t tsc_ctc_ratio_d;
 	enum intel_pt_param_flags flags;
+	unsigned int quick;
 };
 
 struct intel_pt_decoder;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 597120dd6b77..93659e738d40 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1036,6 +1036,7 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
 	params.mtc_period = intel_pt_mtc_period(pt);
 	params.tsc_ctc_ratio_n = pt->tsc_ctc_ratio_n;
 	params.tsc_ctc_ratio_d = pt->tsc_ctc_ratio_d;
+	params.quick = pt->synth_opts.quick;
 
 	if (pt->filts.cnt > 0)
 		params.pgd_ip = intel_pt_pgd_ip;
@@ -1429,7 +1430,10 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
 
 	sample.id = ptq->pt->instructions_id;
 	sample.stream_id = ptq->pt->instructions_id;
-	sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt;
+	if (pt->synth_opts.quick)
+		sample.period = 1;
+	else
+		sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt;
 
 	sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt;
 	if (sample.cyc_cnt) {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 11/11] perf intel-pt: Add support for decoding PSB+ only
  2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
                   ` (9 preceding siblings ...)
  2020-07-09 17:36 ` [PATCH 10/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
@ 2020-07-09 17:36 ` Adrian Hunter
  2020-07-09 17:59   ` Andi Kleen
  10 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 17:36 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, Andi Kleen, linux-kernel

A single q option decodes ip from only FUP/TIP packets. Make it so that
repeating the q option (i.e. qq) decodes only PSB+, getting ip if there is
a FUP packet within PSB+ (i.e. between PSB and PSBEND).
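
A minimal sketch of how the number of q letters maps to the two modes,
assuming the count is obtained by a naive scan of the option string. The
helper count_q and the struct quick_modes are hypothetical; only the
thresholds (>= 1 for hop, >= 2 for leap) come from the patch:

```c
#include <stdbool.h>

struct quick_modes {
	bool hop;	/* q:  get ip from FUP/TIP only */
	bool leap;	/* qq: decode PSB+ only */
};

/* Hypothetical helper: count 'q' letters in an itrace option string. */
static unsigned int count_q(const char *opts)
{
	unsigned int n = 0;

	for (; *opts; opts++)
		if (*opts == 'q')
			n++;
	return n;
}

/* Same thresholds as decoder->hop and decoder->leap in the patch. */
static struct quick_modes quick_to_modes(unsigned int quick)
{
	struct quick_modes m = { quick >= 1, quick >= 2 };

	return m;
}
```

With this mapping, leap always implies hop, which is why the leap check
can live at the top of intel_pt_hop_trace().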

Example:

 $ perf record -e intel_pt//u grep -rI pudding drivers
 [ perf record: Woken up 52 times to write data ]
 [ perf record: Captured and wrote 57.870 MB perf.data ]
 $ time perf script --itrace=bi | wc -l
 58948289

 real    1m23.863s
 user    1m23.251s
 sys     0m7.452s
 $ time perf script --itrace=biq | wc -l
 3385694

 real    0m4.453s
 user    0m4.455s
 sys     0m0.328s
 $ time perf script --itrace=biqq | wc -l
 1883

 real    0m0.047s
 user    0m0.043s
 sys     0m0.009s

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-intel-pt.txt     |  3 +++
 .../util/intel-pt-decoder/intel-pt-decoder.c   | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 758295a7e3d6..849474629fe7 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -973,6 +973,9 @@ useful only if the control flow of interest is represented or indicated by FUP,
 TIP, TIP.PGE, or TIP.PGD packets.  However the q option could be used to find
 time ranges that could then be decoded fully using the --time option.
 
+Repeating the q option (i.e. qq) decodes only PSB+, getting ip if there is a
+FUP packet within PSB+ (i.e. between PSB and PSBEND).
+
 
 dump option
 ~~~~~~~~~~~
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index ccb204b1a050..697513f35154 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -113,6 +113,7 @@ struct intel_pt_decoder {
 	bool in_psb;
 	bool hop;
 	bool hop_psb_fup;
+	bool leap;
 	enum intel_pt_param_flags flags;
 	uint64_t pos;
 	uint64_t last_ip;
@@ -240,6 +241,7 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
 	decoder->return_compression = params->return_compression;
 	decoder->branch_enable      = params->branch_enable;
 	decoder->hop                = params->quick >= 1;
+	decoder->leap               = params->quick >= 2;
 
 	decoder->flags              = params->flags;
 
@@ -1903,9 +1905,18 @@ static int intel_pt_resample(struct intel_pt_decoder *decoder)
 #define HOP_RETURN	2
 #define HOP_AGAIN	3
 
+static int intel_pt_scan_for_psb(struct intel_pt_decoder *decoder);
+
 /* Hop mode: Ignore TNT, do not walk code, but get ip from FUPs and TIPs */
 static int intel_pt_hop_trace(struct intel_pt_decoder *decoder, bool *no_tip, int *err)
 {
+	/* Leap from PSB to PSB, getting ip from FUP within PSB+ */
+	if (decoder->leap && !decoder->in_psb && decoder->packet.type != INTEL_PT_PSB) {
+		*err = intel_pt_scan_for_psb(decoder);
+		if (*err)
+			return HOP_RETURN;
+	}
+
 	switch (decoder->packet.type) {
 	case INTEL_PT_TNT:
 		return HOP_IGNORE;
@@ -2681,6 +2692,7 @@ static int intel_pt_sync(struct intel_pt_decoder *decoder)
 	decoder->ip = 0;
 	intel_pt_clear_stack(&decoder->stack);
 
+leap:
 	err = intel_pt_scan_for_psb(decoder);
 	if (err)
 		return err;
@@ -2702,6 +2714,12 @@ static int intel_pt_sync(struct intel_pt_decoder *decoder)
 			decoder->pkt_state = INTEL_PT_STATE_RESAMPLE;
 		else
 			decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
+	} else if (decoder->leap) {
+		/*
+		 * In leap mode, only PSB+ is decoded, so keep leaping to the
+		 * next PSB until there is an ip.
+		 */
+		goto leap;
 	} else {
 		return intel_pt_sync_ip(decoder);
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors
  2020-07-09 17:36 ` [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors Adrian Hunter
@ 2020-07-09 17:50   ` Andi Kleen
  2020-07-09 18:13     ` Adrian Hunter
  0 siblings, 1 reply; 20+ messages in thread
From: Andi Kleen @ 2020-07-09 17:50 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On Thu, Jul 09, 2020 at 08:36:22PM +0300, Adrian Hunter wrote:
> The itrace "e" option may be followed by a number which has the
> following effect for Intel PT:
> 	1	Suppress overflow events
> 	2	Suppress trace data lost events
> The values may be combined by bitwise OR'ing them.
> 
> Suppressing those errors can be useful for testing and debugging
> because they are not due to decoding.

I suspect it will be useful to more than just decoding and debugging.

But the number is not a nice user interface.

How about e[....] 

like e[ol] 

Also it's a bit unusual that this disables instead of enables, but ok.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 06/11] perf auxtrace: Add optional log flags to the itrace 'd' option
  2020-07-09 17:36 ` [PATCH 06/11] perf auxtrace: Add optional log flags to the itrace 'd' option Adrian Hunter
@ 2020-07-09 17:51   ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2020-07-09 17:51 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

>  	flags what kind of errors will or will not be reported.
> +
> +	If supported, the 'd' option may be followed by an architecture-specific
> +	number which flags what kind of debug messages will or will not be logged.

Would need documentation here.

Also in the include used by the command line help.

-Andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/11] perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding
  2020-07-09 17:36 ` [PATCH 09/11] perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding Adrian Hunter
@ 2020-07-09 17:52   ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2020-07-09 17:52 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

> +
> +	If supported, the 'q' option may be repeated to increase the effect.

Need better documentation here. What does it mean for PT?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 10/11] perf intel-pt: Add support for decoding FUP/TIP only
  2020-07-09 17:36 ` [PATCH 10/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
@ 2020-07-09 17:55   ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2020-07-09 17:55 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On Thu, Jul 09, 2020 at 08:36:27PM +0300, Adrian Hunter wrote:
> +The q option does not decode TNT packets, and does not walk object code, but
> +gets the ip from FUP and TIP packets.  The q option can be used with the b and i
> +options but the period is not used.  The q option decodes more quickly, but is
> +useful only if the control flow of interest is represented or indicated by FUP,
> +TIP, TIP.PGE, or TIP.PGD packets.  However the q option could be used to find
> +time ranges that could then be decoded fully using the --time option.

Ah ok the documentation is here. Ignore previous suggestion.

Can you describe the effect at a high level without referring to packet names?

The user may not be familiar with them.

So two qs will be PSB only decoding I hope?

-Andi



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 11/11] perf intel-pt: Add support for decoding PSB+ only
  2020-07-09 17:36 ` [PATCH 11/11] perf intel-pt: Add support for decoding PSB+ only Adrian Hunter
@ 2020-07-09 17:59   ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2020-07-09 17:59 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

> diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
> index 758295a7e3d6..849474629fe7 100644
> --- a/tools/perf/Documentation/perf-intel-pt.txt
> +++ b/tools/perf/Documentation/perf-intel-pt.txt
> @@ -973,6 +973,9 @@ useful only if the control flow of interest is represented or indicated by FUP,
>  TIP, TIP.PGE, or TIP.PGD packets.  However the q option could be used to find
>  time ranges that could then be decoded fully using the --time option.
>  
> +Repeating the q option (i.e. qq) decodes only PSB+, getting ip if there is a
> +FUP packet within PSB+ (i.e. between PSB and PSBEND).

Also need high level description without PT jargon
(and perhaps also reference how to configure PSB frequency)

Other than that, great feature. I'll be an enthusiastic user :-)

-Andi
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors
  2020-07-09 17:50   ` Andi Kleen
@ 2020-07-09 18:13     ` Adrian Hunter
  2020-07-09 18:22       ` Adrian Hunter
  0 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 18:13 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On 9/07/20 8:50 pm, Andi Kleen wrote:
> On Thu, Jul 09, 2020 at 08:36:22PM +0300, Adrian Hunter wrote:
>> The itrace "e" option may be followed by a number which has the
>> following effect for Intel PT:
>> 	1	Suppress overflow events
>> 	2	Suppress trace data lost events
>> The values may be combined by bitwise OR'ing them.
>>
>> Suppressing those errors can be useful for testing and debugging
>> because they are not due to decoding.
> 
> I suspect it will be useful to more than just decoding and debugging.
> 
> But the number is not a nice user interface.
> 
> How about e[....] 
> 
> like e[ol] 

Do you mean literally square-brackets? If you were really unlucky you might
get pathname expansion with that.

> 
> Also it's a bit unusual that this disables instead of enables, but ok.
> 
> -Andi
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors
  2020-07-09 18:13     ` Adrian Hunter
@ 2020-07-09 18:22       ` Adrian Hunter
  2020-07-20 22:18         ` Andi Kleen
  0 siblings, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2020-07-09 18:22 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

On 9/07/20 9:13 pm, Adrian Hunter wrote:
> On 9/07/20 8:50 pm, Andi Kleen wrote:
>> On Thu, Jul 09, 2020 at 08:36:22PM +0300, Adrian Hunter wrote:
>>> The itrace "e" option may be followed by a number which has the
>>> following effect for Intel PT:
>>> 	1	Suppress overflow events
>>> 	2	Suppress trace data lost events
>>> The values may be combined by bitwise OR'ing them.
>>>
>>> Suppressing those errors can be useful for testing and debugging
>>> because they are not due to decoding.
>>
>> I suspect it will be useful to more than just decoding and debugging.
>>
>> But the number is not a nice user interface.
>>
>> How about e[....] 
>>
>> like e[ol] 
> 
> Do you mean literally square-brackets? If you were really unlucky you might
> get pathname expansion with that.
> 
>>
>> Also it's a bit unusual that this disables instead of enables, but ok.

What about prefixing each flag with - i.e.

e-o
e-l
e-o-l
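
A minimal sketch of parsing that form, assuming 'o' and 'l' map to the
existing values 1 (overflow) and 2 (trace data lost). The macro and
function names are hypothetical, not existing perf code:

```c
#define ERR_SUPPRESS_OVERFLOW	1	/* 'o' */
#define ERR_SUPPRESS_LOST	2	/* 'l' */

/*
 * Parse the text that follows 'e', e.g. "-o-l". Returns the OR of the
 * suppressed-error flags, or -1 on an unknown flag letter. If end is
 * non-NULL, *end is set to the first character that was not consumed.
 */
static int parse_err_flags(const char *s, const char **end)
{
	int flags = 0;

	while (*s == '-') {
		switch (s[1]) {
		case 'o':
			flags |= ERR_SUPPRESS_OVERFLOW;
			break;
		case 'l':
			flags |= ERR_SUPPRESS_LOST;
			break;
		default:
			return -1;	/* includes a trailing lone '-' */
		}
		s += 2;
	}
	if (end)
		*end = s;
	return flags;
}
```

No character in this form is special to the shell, so it avoids the
pathname expansion problem.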


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors
  2020-07-09 18:22       ` Adrian Hunter
@ 2020-07-20 22:18         ` Andi Kleen
  0 siblings, 0 replies; 20+ messages in thread
From: Andi Kleen @ 2020-07-20 22:18 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

> What about prefixing each flag with - i.e.
> 
> e-o
> e-l
> e-o-l

I was thinking square brackets, but yes the shell collision is a fair point.
- should work too.

-andi

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-07-20 22:18 UTC | newest]

Thread overview: 20+ messages
2020-07-09 17:36 [PATCH 00/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
2020-07-09 17:36 ` [PATCH 01/11] perf intel-pt: Fix FUP packet state Adrian Hunter
2020-07-09 17:36 ` [PATCH 02/11] perf intel-pt: Fix duplicate branch after CBR Adrian Hunter
2020-07-09 17:36 ` [PATCH 03/11] perf tools: Improve aux_output not supported error Adrian Hunter
2020-07-09 17:36 ` [PATCH 04/11] perf auxtrace: Add optional error flags to the itrace 'e' option Adrian Hunter
2020-07-09 17:36 ` [PATCH 05/11] perf intel-pt: Use itrace error flags to suppress some errors Adrian Hunter
2020-07-09 17:50   ` Andi Kleen
2020-07-09 18:13     ` Adrian Hunter
2020-07-09 18:22       ` Adrian Hunter
2020-07-20 22:18         ` Andi Kleen
2020-07-09 17:36 ` [PATCH 06/11] perf auxtrace: Add optional log flags to the itrace 'd' option Adrian Hunter
2020-07-09 17:51   ` Andi Kleen
2020-07-09 17:36 ` [PATCH 07/11] perf intel-pt: Use itrace debug log flags to suppress some messages Adrian Hunter
2020-07-09 17:36 ` [PATCH 08/11] perf intel-pt: Time filter logged perf events Adrian Hunter
2020-07-09 17:36 ` [PATCH 09/11] perf auxtrace: Add itrace 'q' option for quicker, less detailed decoding Adrian Hunter
2020-07-09 17:52   ` Andi Kleen
2020-07-09 17:36 ` [PATCH 10/11] perf intel-pt: Add support for decoding FUP/TIP only Adrian Hunter
2020-07-09 17:55   ` Andi Kleen
2020-07-09 17:36 ` [PATCH 11/11] perf intel-pt: Add support for decoding PSB+ only Adrian Hunter
2020-07-09 17:59   ` Andi Kleen
