All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC)
@ 2019-05-20 11:37 Adrian Hunter
  2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
                   ` (21 more replies)
  0 siblings, 22 replies; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Hi

Here are some patches that add support for IPC. There are 3 dependent fixes
to start with.  Display of IPC by perf script is in patch 8 "perf script: Add
output of IPC ratio".  Information about Intel PT and IPC is in patch 12
"perf intel-pt: Document IPC usage".  Then there are patches to export IPC
including un-related patch 14 "perf db-export: Add brief documentation" and
un-related patch 22 "perf scripts python: exported-sql-viewer.py: Select find
text when find bar is activated"


Adrian Hunter (22):
      perf intel-pt: Fix itrace defaults for perf script
      perf auxtrace: Fix itrace defaults for perf script
      perf intel-pt: Fix itrace defaults for perf script intel-pt documentation
      perf intel-pt: Factor out intel_pt_update_sample_time
      perf intel-pt: Accumulate cycle count from CYC packets
      perf tools: Add IPC information to perf_sample
      perf intel-pt: Add support for samples to contain IPC ratio
      perf script: Add output of IPC ratio
      perf intel-pt: Record when decoding PSB+ packets
      perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip
      perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets
      perf intel-pt: Document IPC usage
      perf thread-stack: Accumulate IPC information
      perf db-export: Add brief documentation
      perf db-export: Export IPC information
      perf scripts python: export-to-sqlite.py: Export IPC information
      perf scripts python: export-to-postgresql.py: Export IPC information
      perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports
      perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams
      perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph
      perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree
      perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated

 tools/perf/Documentation/db-export.txt             |  41 +++
 tools/perf/Documentation/intel-pt.txt              |  40 ++-
 tools/perf/Documentation/perf-script.txt           |   5 +-
 tools/perf/builtin-script.c                        |  23 +-
 tools/perf/scripts/python/export-to-postgresql.py  |  36 ++-
 tools/perf/scripts/python/export-to-sqlite.py      |  36 ++-
 tools/perf/scripts/python/exported-sql-viewer.py   | 294 ++++++++++++++++-----
 tools/perf/util/auxtrace.c                         |   3 +-
 tools/perf/util/event.h                            |   2 +
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 141 ++++++++--
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |   1 +
 tools/perf/util/intel-pt.c                         |  32 ++-
 .../util/scripting-engines/trace-event-python.c    |   8 +-
 tools/perf/util/thread-stack.c                     |  14 +
 tools/perf/util/thread-stack.h                     |   4 +
 15 files changed, 556 insertions(+), 124 deletions(-)
 create mode 100644 tools/perf/Documentation/db-export.txt


Regards
Adrian

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-05-20 13:58   ` Sasha Levin
                     ` (2 more replies)
  2019-05-20 11:37 ` [PATCH 02/22] perf auxtrace: " Adrian Hunter
                   ` (20 subsequent siblings)
  21 siblings, 3 replies; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Commit 4eb068157121 ("perf script: Make itrace script default to all
calls") does not work because 'use_browser' is being used to determine
whether to default to periodic sampling (i.e. better for perf report).
The result is that nothing but CBR events display for perf script
when no --itrace option is specified.

Fix by using 'default_no_sample' and 'inject' instead.

Example:

 Before:

  $ perf record -e intel_pt/cyc/u ls
  $ perf script > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt differ

 After:

  $ perf script > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt are identical

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 90e457f7be08 ("perf tools: Add Intel PT support")
Cc: stable@vger.kernel.org # v4.20+
---
 tools/perf/util/intel-pt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 551ae10d1c7b..7a70693c1b91 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2626,7 +2626,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	} else {
 		itrace_synth_opts__set_default(&pt->synth_opts,
 				session->itrace_synth_opts->default_no_sample);
-		if (use_browser != -1) {
+		if (!session->itrace_synth_opts->default_no_sample &&
+		    !session->itrace_synth_opts->inject) {
 			pt->synth_opts.branches = false;
 			pt->synth_opts.callchain = true;
 		}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 02/22] perf auxtrace: Fix itrace defaults for perf script
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
  2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-05-30  7:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 03/22] perf intel-pt: Fix itrace defaults for perf script intel-pt documentation Adrian Hunter
                   ` (19 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Commit 4eb068157121 ("perf script: Make itrace script default to all
calls") does not work for the case when '--itrace' only is used, because
default_no_sample is not being passed.

Example:

 Before:

  $ perf record -e intel_pt/cyc/u ls
  $ perf script --itrace > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt differ

 After:

  $ perf script --itrace > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt are identical

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 4eb068157121 ("perf script: Make itrace script default to all calls")
Cc: stable@vger.kernel.org
---
 tools/perf/util/auxtrace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index fb76b6b232d4..5dd9d1893b89 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1010,7 +1010,8 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 	}
 
 	if (!str) {
-		itrace_synth_opts__set_default(synth_opts, false);
+		itrace_synth_opts__set_default(synth_opts,
+					       synth_opts->default_no_sample);
 		return 0;
 	}
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 03/22] perf intel-pt: Fix itrace defaults for perf script intel-pt documentation
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
  2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
  2019-05-20 11:37 ` [PATCH 02/22] perf auxtrace: " Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-05-30  7:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 04/22] perf intel-pt: Factor out intel_pt_update_sample_time Adrian Hunter
                   ` (18 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Fix intel-pt documentation to reflect the change of itrace defaults for
perf script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 4eb068157121 ("perf script: Make itrace script default to all calls")
Cc: stable@vger.kernel.org
---
 tools/perf/Documentation/intel-pt.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 115eaacc455f..60d99e5e7921 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -88,16 +88,16 @@ smaller.
 
 To represent software control flow, "branches" samples are produced.  By default
 a branch sample is synthesized for every single branch.  To get an idea what
-data is available you can use the 'perf script' tool with no parameters, which
-will list all the samples.
+data is available you can use the 'perf script' tool with all itrace sampling
+options, which will list all the samples.
 
 	perf record -e intel_pt//u ls
-	perf script
+	perf script --itrace=ibxwpe
 
 An interesting field that is not printed by default is 'flags' which can be
 displayed as follows:
 
-	perf script -Fcomm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,flags
+	perf script --itrace=ibxwpe -F+flags
 
 The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
 system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
@@ -713,7 +713,7 @@ Having no option is the same as
 
 which, in turn, is the same as
 
-	--itrace=ibxwpe
+	--itrace=cepwx
 
 The letters are:
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 04/22] perf intel-pt: Factor out intel_pt_update_sample_time
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (2 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 03/22] perf intel-pt: Fix itrace defaults for perf script intel-pt documentation Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 18:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 05/22] perf intel-pt: Accumulate cycle count from CYC packets Adrian Hunter
                   ` (17 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

To eliminate some duplication and make the code more understandable, factor
out intel_pt_update_sample_time.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c   | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index f4c3c84b090f..1ab4070b5633 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -479,6 +479,12 @@ static int intel_pt_bad_packet(struct intel_pt_decoder *decoder)
 	return -EBADMSG;
 }
 
+static inline void intel_pt_update_sample_time(struct intel_pt_decoder *decoder)
+{
+	decoder->sample_timestamp = decoder->timestamp;
+	decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+}
+
 static int intel_pt_get_data(struct intel_pt_decoder *decoder)
 {
 	struct intel_pt_buffer buffer = { .buf = 0, };
@@ -1319,8 +1325,7 @@ static int intel_pt_walk_tnt(struct intel_pt_decoder *decoder)
 			}
 			decoder->ip += intel_pt_insn.length;
 			if (!decoder->tnt.count) {
-				decoder->sample_timestamp = decoder->timestamp;
-				decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+				intel_pt_update_sample_time(decoder);
 				return -EAGAIN;
 			}
 			decoder->tnt.payload <<= 1;
@@ -2413,8 +2418,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 	if (err) {
 		decoder->state.err = intel_pt_ext_err(err);
 		decoder->state.from_ip = decoder->ip;
-		decoder->sample_timestamp = decoder->timestamp;
-		decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+		intel_pt_update_sample_time(decoder);
 	} else {
 		decoder->state.err = 0;
 		if (decoder->cbr != decoder->cbr_seen && decoder->state.type) {
@@ -2422,10 +2426,8 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			decoder->state.type |= INTEL_PT_CBR_CHG;
 			decoder->state.cbr_payload = decoder->cbr_payload;
 		}
-		if (intel_pt_sample_time(decoder->pkt_state)) {
-			decoder->sample_timestamp = decoder->timestamp;
-			decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
-		}
+		if (intel_pt_sample_time(decoder->pkt_state))
+			intel_pt_update_sample_time(decoder);
 	}
 
 	decoder->state.timestamp = decoder->sample_timestamp;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 05/22] perf intel-pt: Accumulate cycle count from CYC packets
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (3 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 04/22] perf intel-pt: Factor out intel_pt_update_sample_time Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 18:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 06/22] perf tools: Add IPC information to perf_sample Adrian Hunter
                   ` (16 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

In preparation for providing instructions-per-cycle (IPC) information,
accumulate cycle count from CYC packets.

Although CYC packets are optional (requires config term 'cyc' to enable
cycle-accurate mode when recording), the simplest way to count cycles is
with CYC packets. The first complication is that cycles must be counted
only when also counting instructions. That means when control flow packet
generation is enabled i.e. between TIP.PGE and TIP.PGD packets. Also,
sampling the cycle count follows the same rules as sampling the timestamp,
that is, not before the instruction to which the decoder is walking is
reached. In addition, the cycle count is not accurate for any but the first
branch of a TNT packet.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 14 +++++++++++++-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 1ab4070b5633..ef3a1c1cd250 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -160,6 +160,8 @@ struct intel_pt_decoder {
 	uint64_t period_mask;
 	uint64_t period_ticks;
 	uint64_t last_masked_timestamp;
+	uint64_t tot_cyc_cnt;
+	uint64_t sample_tot_cyc_cnt;
 	bool continuous_period;
 	bool overflow;
 	bool set_fup_tx_flags;
@@ -167,6 +169,7 @@ struct intel_pt_decoder {
 	bool set_fup_mwait;
 	bool set_fup_pwre;
 	bool set_fup_exstop;
+	bool sample_cyc;
 	unsigned int fup_tx_flags;
 	unsigned int tx_flags;
 	uint64_t fup_ptw_payload;
@@ -1323,6 +1326,7 @@ static int intel_pt_walk_tnt(struct intel_pt_decoder *decoder)
 				decoder->ip += intel_pt_insn.length;
 				return 0;
 			}
+			decoder->sample_cyc = false;
 			decoder->ip += intel_pt_insn.length;
 			if (!decoder->tnt.count) {
 				intel_pt_update_sample_time(decoder);
@@ -1515,6 +1519,9 @@ static void intel_pt_calc_cyc_timestamp(struct intel_pt_decoder *decoder)
 	decoder->have_cyc = true;
 
 	decoder->cycle_cnt += decoder->packet.payload;
+	if (decoder->pge)
+		decoder->tot_cyc_cnt += decoder->packet.payload;
+	decoder->sample_cyc = true;
 
 	if (!decoder->cyc_ref_timestamp)
 		return;
@@ -2419,6 +2426,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 		decoder->state.err = intel_pt_ext_err(err);
 		decoder->state.from_ip = decoder->ip;
 		intel_pt_update_sample_time(decoder);
+		decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt;
 	} else {
 		decoder->state.err = 0;
 		if (decoder->cbr != decoder->cbr_seen && decoder->state.type) {
@@ -2426,14 +2434,18 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			decoder->state.type |= INTEL_PT_CBR_CHG;
 			decoder->state.cbr_payload = decoder->cbr_payload;
 		}
-		if (intel_pt_sample_time(decoder->pkt_state))
+		if (intel_pt_sample_time(decoder->pkt_state)) {
 			intel_pt_update_sample_time(decoder);
+			if (decoder->sample_cyc)
+				decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt;
+		}
 	}
 
 	decoder->state.timestamp = decoder->sample_timestamp;
 	decoder->state.est_timestamp = intel_pt_est_timestamp(decoder);
 	decoder->state.cr3 = decoder->cr3;
 	decoder->state.tot_insn_cnt = decoder->tot_insn_cnt;
+	decoder->state.tot_cyc_cnt = decoder->sample_tot_cyc_cnt;
 
 	return &decoder->state;
 }
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index ed088d4726ba..6a61773dc44b 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -77,6 +77,7 @@ struct intel_pt_state {
 	uint64_t to_ip;
 	uint64_t cr3;
 	uint64_t tot_insn_cnt;
+	uint64_t tot_cyc_cnt;
 	uint64_t timestamp;
 	uint64_t est_timestamp;
 	uint64_t trace_nr;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 06/22] perf tools: Add IPC information to perf_sample
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (4 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 05/22] perf intel-pt: Accumulate cycle count from CYC packets Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:00   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 07/22] perf intel-pt: Add support for samples to contain IPC ratio Adrian Hunter
                   ` (15 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Add counts of instructions and cycles, in order to represent
instructions-per-cycle (IPC).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/event.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 9e999550f247..1f1da6082806 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -204,6 +204,8 @@ struct perf_sample {
 	u64 period;
 	u64 weight;
 	u64 transaction;
+	u64 insn_cnt;
+	u64 cyc_cnt;
 	u32 cpu;
 	u32 raw_size;
 	u64 data_src;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 07/22] perf intel-pt: Add support for samples to contain IPC ratio
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (5 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 06/22] perf tools: Add IPC information to perf_sample Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:01   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 08/22] perf script: Add output of " Adrian Hunter
                   ` (14 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Copy the incremental instruction count and cycle count onto 'instructions'
and 'branches' samples. Because Intel PT does not update the cycle count on
every branch or instruction, the incremental values will often be zero.
When there are values, they will be the number of instructions and number
of cycles since the last update, and thus represent the average IPC since
the last IPC value.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/intel-pt.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 7a70693c1b91..3cff8fe2eaa0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -157,6 +157,12 @@ struct intel_pt_queue {
 	u32 flags;
 	u16 insn_len;
 	u64 last_insn_cnt;
+	u64 ipc_insn_cnt;
+	u64 ipc_cyc_cnt;
+	u64 last_in_insn_cnt;
+	u64 last_in_cyc_cnt;
+	u64 last_br_insn_cnt;
+	u64 last_br_cyc_cnt;
 	char insn[INTEL_PT_INSN_BUF_SZ];
 };
 
@@ -1162,6 +1168,13 @@ static int intel_pt_synth_branch_sample(struct intel_pt_queue *ptq)
 		sample.branch_stack = (struct branch_stack *)&dummy_bs;
 	}
 
+	sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_br_cyc_cnt;
+	if (sample.cyc_cnt) {
+		sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_br_insn_cnt;
+		ptq->last_br_insn_cnt = ptq->ipc_insn_cnt;
+		ptq->last_br_cyc_cnt = ptq->ipc_cyc_cnt;
+	}
+
 	return intel_pt_deliver_synth_b_event(pt, event, &sample,
 					      pt->branches_sample_type);
 }
@@ -1217,6 +1230,13 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
 	sample.stream_id = ptq->pt->instructions_id;
 	sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt;
 
+	sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt;
+	if (sample.cyc_cnt) {
+		sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_in_insn_cnt;
+		ptq->last_in_insn_cnt = ptq->ipc_insn_cnt;
+		ptq->last_in_cyc_cnt = ptq->ipc_cyc_cnt;
+	}
+
 	ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
 
 	return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
@@ -1488,6 +1508,15 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 
 	ptq->have_sample = false;
 
+	if (ptq->state->tot_cyc_cnt > ptq->ipc_cyc_cnt) {
+		/*
+		 * Cycle count and instruction count only go together to create
+		 * a valid IPC ratio when the cycle count changes.
+		 */
+		ptq->ipc_insn_cnt = ptq->state->tot_insn_cnt;
+		ptq->ipc_cyc_cnt = ptq->state->tot_cyc_cnt;
+	}
+
 	if (pt->sample_pwr_events && (state->type & INTEL_PT_PWR_EVT)) {
 		if (state->type & INTEL_PT_CBR_CHG) {
 			err = intel_pt_synth_cbr_sample(ptq);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 08/22] perf script: Add output of IPC ratio
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (6 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 07/22] perf intel-pt: Add support for samples to contain IPC ratio Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:01   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 09/22] perf intel-pt: Record when decoding PSB+ packets Adrian Hunter
                   ` (13 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Add field 'ipc' to display instructions-per-cycle.

Example:

 perf record -e intel_pt/cyc/u ls
 perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid

 ls  2670177.697113434:  7f0dfdbcd090 _start+0x0      mov %rsp, %rdi   IPC: 0.00 (1/877)
 ls  2670177.697113434:  7f0dfdbcd093 _start+0x3      callq  0x7f0dfdbce030
 ls  2670177.697113434:  7f0dfdbce030 _dl_start+0x0   pushq  %rbp
 ls  2670177.697113434:  7f0dfdbce031 _dl_start+0x1   mov %rsp, %rbp
 ls  2670177.697113434:  7f0dfdbce034 _dl_start+0x4   pushq  %r15
 ls  2670177.697113434:  7f0dfdbce036 _dl_start+0x6   pushq  %r14
 ls  2670177.697113434:  7f0dfdbce038 _dl_start+0x8   pushq  %r13
 ls  2670177.697113434:  7f0dfdbce03a _dl_start+0xa   pushq  %r12
 ls  2670177.697113434:  7f0dfdbce03c _dl_start+0xc   mov %rdi, %r12
 ls  2670177.697113434:  7f0dfdbce03f _dl_start+0xf   pushq  %rbx
 ls  2670177.697113434:  7f0dfdbce040 _dl_start+0x10  sub $0x38, %rsp
 ls  2670177.697113434:  7f0dfdbce044 _dl_start+0x14  rdtsc
 ls  2670177.697113434:  7f0dfdbce046 _dl_start+0x16  mov %eax, %eax
 ls  2670177.697113434:  7f0dfdbce048 _dl_start+0x18  shl $0x20, %rdx
 ls  2670177.697113434:  7f0dfdbce04c _dl_start+0x1c  or %rax, %rdx
 ls  2670177.697114471:  7f0dfdbce04f _dl_start+0x1f  movq  0x27e22(%rip), %rax        IPC: 0.00 (15/1685)
 ls  2670177.697116177:  7f0dfdbce056 _dl_start+0x26  movq  %rdx, 0x27683(%rip)        IPC: 0.00 (1/881)

Note, the IPC values are low due to page faults at the beginning of
execution. The additional cycles are due to the time to enter the kernel,
not the actual kernel page fault handler.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/perf-script.txt |  5 ++++-
 tools/perf/builtin-script.c              | 23 ++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 9b0d04dd2a61..3a630dd64916 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-        brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode.
+        brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode, ipc.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
@@ -203,6 +203,9 @@ OPTIONS
 	The synth field is used by synthesized events which may be created when
 	Instruction Trace decoding.
 
+	The ipc (instructions per cycle) field is synthesized and may have a value when
+	Instruction Trace decoding.
+
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 61cfd8f70989..331309485368 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -102,6 +102,7 @@ enum perf_output_field {
 	PERF_OUTPUT_METRIC	    = 1U << 28,
 	PERF_OUTPUT_MISC            = 1U << 29,
 	PERF_OUTPUT_SRCCODE	    = 1U << 30,
+	PERF_OUTPUT_IPC             = 1U << 31,
 };
 
 struct output_option {
@@ -139,6 +140,7 @@ struct output_option {
 	{.str = "metric", .field = PERF_OUTPUT_METRIC},
 	{.str = "misc", .field = PERF_OUTPUT_MISC},
 	{.str = "srccode", .field = PERF_OUTPUT_SRCCODE},
+	{.str = "ipc", .field = PERF_OUTPUT_IPC},
 };
 
 enum {
@@ -1268,6 +1270,20 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
 	return printed;
 }
 
+static int perf_sample__fprintf_ipc(struct perf_sample *sample,
+				    struct perf_event_attr *attr, FILE *fp)
+{
+	unsigned int ipc;
+
+	if (!PRINT_FIELD(IPC) || !sample->cyc_cnt || !sample->insn_cnt)
+		return 0;
+
+	ipc = (sample->insn_cnt * 100) / sample->cyc_cnt;
+
+	return fprintf(fp, " \t IPC: %u.%02u (%" PRIu64 "/%" PRIu64 ") ",
+		       ipc / 100, ipc % 100, sample->insn_cnt, sample->cyc_cnt);
+}
+
 static int perf_sample__fprintf_bts(struct perf_sample *sample,
 				    struct perf_evsel *evsel,
 				    struct thread *thread,
@@ -1312,6 +1328,8 @@ static int perf_sample__fprintf_bts(struct perf_sample *sample,
 		printed += perf_sample__fprintf_addr(sample, thread, attr, fp);
 	}
 
+	printed += perf_sample__fprintf_ipc(sample, attr, fp);
+
 	if (print_srcline_last)
 		printed += map__fprintf_srcline(al->map, al->addr, "\n  ", fp);
 
@@ -1858,6 +1876,9 @@ static void process_event(struct perf_script *script,
 
 	if (PRINT_FIELD(PHYS_ADDR))
 		fprintf(fp, "%16" PRIx64, sample->phys_addr);
+
+	perf_sample__fprintf_ipc(sample, attr, fp);
+
 	fprintf(fp, "\n");
 
 	if (PRINT_FIELD(SRCCODE)) {
@@ -3392,7 +3413,7 @@ int cmd_script(int argc, const char **argv)
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,srcline,period,iregs,uregs,brstack,"
 		     "brstacksym,flags,bpf-output,brstackinsn,brstackoff,"
-		     "callindent,insn,insnlen,synth,phys_addr,metric,misc",
+		     "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 09/22] perf intel-pt: Record when decoding PSB+ packets
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (7 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 08/22] perf script: Add output of " Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:02   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 10/22] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip Adrian Hunter
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

In preparation for using MTC packets to count cycles, record whether
decoding is between a PSB and PSBEND packets.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 41 ++++++++++++++-----
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index ef3a1c1cd250..a2384a314990 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -116,6 +116,7 @@ struct intel_pt_decoder {
 	bool have_cyc;
 	bool fixup_last_mtc;
 	bool have_last_ip;
+	bool in_psb;
 	enum intel_pt_param_flags flags;
 	uint64_t pos;
 	uint64_t last_ip;
@@ -1549,14 +1550,17 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 {
 	int err;
 
+	decoder->in_psb = true;
+
 	while (1) {
 		err = intel_pt_get_next_packet(decoder);
 		if (err)
-			return err;
+			goto out;
 
 		switch (decoder->packet.type) {
 		case INTEL_PT_PSBEND:
-			return 0;
+			err = 0;
+			goto out;
 
 		case INTEL_PT_TIP_PGD:
 		case INTEL_PT_TIP_PGE:
@@ -1574,10 +1578,12 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 		case INTEL_PT_PWRX:
 			decoder->have_tma = false;
 			intel_pt_log("ERROR: Unexpected packet\n");
-			return -EAGAIN;
+			err = -EAGAIN;
+			goto out;
 
 		case INTEL_PT_OVF:
-			return intel_pt_overflow(decoder);
+			err = intel_pt_overflow(decoder);
+			goto out;
 
 		case INTEL_PT_TSC:
 			intel_pt_calc_tsc_timestamp(decoder);
@@ -1623,6 +1629,10 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 			break;
 		}
 	}
+out:
+	decoder->in_psb = false;
+
+	return err;
 }
 
 static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
@@ -1996,10 +2006,12 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 {
 	int err;
 
+	decoder->in_psb = true;
+
 	while (1) {
 		err = intel_pt_get_next_packet(decoder);
 		if (err)
-			return err;
+			goto out;
 
 		switch (decoder->packet.type) {
 		case INTEL_PT_TIP_PGD:
@@ -2015,7 +2027,8 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 		case INTEL_PT_PWRE:
 		case INTEL_PT_PWRX:
 			intel_pt_log("ERROR: Unexpected packet\n");
-			return -ENOENT;
+			err = -ENOENT;
+			goto out;
 
 		case INTEL_PT_FUP:
 			decoder->pge = true;
@@ -2074,16 +2087,20 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 				decoder->pkt_state = INTEL_PT_STATE_ERR4;
 			else
 				decoder->pkt_state = INTEL_PT_STATE_ERR3;
-			return -ENOENT;
+			err = -ENOENT;
+			goto out;
 
 		case INTEL_PT_BAD: /* Does not happen */
-			return intel_pt_bug(decoder);
+			err = intel_pt_bug(decoder);
+			goto out;
 
 		case INTEL_PT_OVF:
-			return intel_pt_overflow(decoder);
+			err = intel_pt_overflow(decoder);
+			goto out;
 
 		case INTEL_PT_PSBEND:
-			return 0;
+			err = 0;
+			goto out;
 
 		case INTEL_PT_PSB:
 		case INTEL_PT_VMCS:
@@ -2093,6 +2110,10 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 			break;
 		}
 	}
+out:
+	decoder->in_psb = false;
+
+	return err;
 }
 
 static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 10/22] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (8 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 09/22] perf intel-pt: Record when decoding PSB+ packets Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:03   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 11/22] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets Adrian Hunter
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

To make it easier to add new code for different TIP cases, separate each
case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 23 ++++++++++++++-----
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index a2384a314990..99773445872d 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -2128,18 +2128,29 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 		switch (decoder->packet.type) {
 		case INTEL_PT_TIP_PGD:
 			decoder->continuous_period = false;
-			__fallthrough;
+			decoder->pge = false;
+			if (intel_pt_have_ip(decoder))
+				intel_pt_set_ip(decoder);
+			if (!decoder->ip)
+				break;
+			decoder->state.type |= INTEL_PT_TRACE_END;
+			return 0;
+
 		case INTEL_PT_TIP_PGE:
+			decoder->pge = true;
+			if (intel_pt_have_ip(decoder))
+				intel_pt_set_ip(decoder);
+			if (!decoder->ip)
+				break;
+			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			return 0;
+
 		case INTEL_PT_TIP:
-			decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
+			decoder->pge = true;
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
 			if (!decoder->ip)
 				break;
-			if (decoder->packet.type == INTEL_PT_TIP_PGE)
-				decoder->state.type |= INTEL_PT_TRACE_BEGIN;
-			if (decoder->packet.type == INTEL_PT_TIP_PGD)
-				decoder->state.type |= INTEL_PT_TRACE_END;
 			return 0;
 
 		case INTEL_PT_FUP:
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 11/22] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (9 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 10/22] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:04   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 12/22] perf intel-pt: Document IPC usage Adrian Hunter
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

When CYC packets are not available, it is still possible to count cycles
using TSC/TMA/MTC timestamps. As the timestamp increments in TSC ticks,
convert to CPU cycles using the current core-to-bus ratio. Do not
accumulate cycles when control flow packet generation is not enabled, nor
when time has been "lost", typically due to mwait, which is indicated by a
TSC/TMA packet that is not part of PSB+.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 51 +++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 99773445872d..9eb778f9c911 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -163,6 +163,9 @@ struct intel_pt_decoder {
 	uint64_t last_masked_timestamp;
 	uint64_t tot_cyc_cnt;
 	uint64_t sample_tot_cyc_cnt;
+	uint64_t base_cyc_cnt;
+	uint64_t cyc_cnt_timestamp;
+	double tsc_to_cyc;
 	bool continuous_period;
 	bool overflow;
 	bool set_fup_tx_flags;
@@ -1423,6 +1426,42 @@ static int intel_pt_overflow(struct intel_pt_decoder *decoder)
 	return -EOVERFLOW;
 }
 
+static inline void intel_pt_mtc_cyc_cnt_pge(struct intel_pt_decoder *decoder)
+{
+	if (decoder->have_cyc)
+		return;
+
+	decoder->cyc_cnt_timestamp = decoder->timestamp;
+	decoder->base_cyc_cnt = decoder->tot_cyc_cnt;
+}
+
+static inline void intel_pt_mtc_cyc_cnt_cbr(struct intel_pt_decoder *decoder)
+{
+	decoder->tsc_to_cyc = decoder->cbr / decoder->max_non_turbo_ratio_fp;
+
+	if (decoder->pge)
+		intel_pt_mtc_cyc_cnt_pge(decoder);
+}
+
+static inline void intel_pt_mtc_cyc_cnt_upd(struct intel_pt_decoder *decoder)
+{
+	uint64_t tot_cyc_cnt, tsc_delta;
+
+	if (decoder->have_cyc)
+		return;
+
+	decoder->sample_cyc = true;
+
+	if (!decoder->pge || decoder->timestamp <= decoder->cyc_cnt_timestamp)
+		return;
+
+	tsc_delta = decoder->timestamp - decoder->cyc_cnt_timestamp;
+	tot_cyc_cnt = tsc_delta * decoder->tsc_to_cyc + decoder->base_cyc_cnt;
+
+	if (tot_cyc_cnt > decoder->tot_cyc_cnt)
+		decoder->tot_cyc_cnt = tot_cyc_cnt;
+}
+
 static void intel_pt_calc_tma(struct intel_pt_decoder *decoder)
 {
 	uint32_t ctc = decoder->packet.payload;
@@ -1432,6 +1471,11 @@ static void intel_pt_calc_tma(struct intel_pt_decoder *decoder)
 	if (!decoder->tsc_ctc_ratio_d)
 		return;
 
+	if (decoder->pge && !decoder->in_psb)
+		intel_pt_mtc_cyc_cnt_pge(decoder);
+	else
+		intel_pt_mtc_cyc_cnt_upd(decoder);
+
 	decoder->last_mtc = (ctc >> decoder->mtc_shift) & 0xff;
 	decoder->ctc_timestamp = decoder->tsc_timestamp - fc;
 	if (decoder->tsc_ctc_mult) {
@@ -1487,6 +1531,8 @@ static void intel_pt_calc_mtc_timestamp(struct intel_pt_decoder *decoder)
 	else
 		decoder->timestamp = timestamp;
 
+	intel_pt_mtc_cyc_cnt_upd(decoder);
+
 	decoder->timestamp_insn_cnt = 0;
 	decoder->last_mtc = mtc;
 
@@ -1511,6 +1557,8 @@ static void intel_pt_calc_cbr(struct intel_pt_decoder *decoder)
 
 	decoder->cbr = cbr;
 	decoder->cbr_cyc_to_tsc = decoder->max_non_turbo_ratio_fp / cbr;
+
+	intel_pt_mtc_cyc_cnt_cbr(decoder);
 }
 
 static void intel_pt_calc_cyc_timestamp(struct intel_pt_decoder *decoder)
@@ -1706,6 +1754,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 				decoder->state.to_ip = decoder->ip;
 			}
 			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			return 0;
 
 		case INTEL_PT_TIP:
@@ -1776,6 +1825,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGE: {
 			decoder->pge = true;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			if (decoder->packet.count == 0) {
 				intel_pt_log_at("Skipping zero TIP.PGE",
 						decoder->pos);
@@ -2138,6 +2188,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGE:
 			decoder->pge = true;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
 			if (!decoder->ip)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 12/22] perf intel-pt: Document IPC usage
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (10 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 11/22] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:04   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 13/22] perf thread-stack: Accumulate IPC information Adrian Hunter
                   ` (9 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Add brief documentation about instructions-per-cycle (IPC) information
derived from Intel PT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/intel-pt.txt | 30 +++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 60d99e5e7921..50c5b60101bd 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -103,6 +103,36 @@ The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
 system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
 in transaction, respectively.
 
+Another interesting field that is not printed by default is 'ipc' which can be
+displayed as follows:
+
+	perf script --itrace=be -F+ipc
+
+There are two ways that instructions-per-cycle (IPC) can be calculated depending
+on the recording.
+
+If the 'cyc' config term (see config terms section below) was used, then IPC is
+calculated using the cycle count from CYC packets, otherwise MTC packets are
+used - refer to the 'mtc' config term.  When MTC is used, however, the values
+are less accurate because the timing is less accurate.
+
+Because Intel PT does not update the cycle count on every branch or instruction,
+the values will often be zero.  When there are values, they will be the number
+of instructions and number of cycles since the last update, and thus represent
+the average IPC since the last IPC for that event type.  Note IPC for "branches"
+events is calculated separately from IPC for "instructions" events.
+
+Also note that the IPC instruction count may or may not include the current
+instruction.  If the cycle count is associated with an asynchronous branch
+(e.g. page fault or interrupt), then the instruction count does not include the
+current instruction, otherwise it does.  That is consistent with whether or not
+that instruction has retired when the cycle count is updated.
+
+Another note, in the case of "branches" events, non-taken branches are not
+presently sampled, so IPC values for them do not appear e.g. a CYC packet with a
+TNT packet that starts with a non-taken branch.  To see every possible IPC
+value, "instructions" events can be used e.g. --itrace=i0ns
+
 While it is possible to create scripts to analyze the data, an alternative
 approach is available to export the data to a sqlite or postgresql database.
 Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 13/22] perf thread-stack: Accumulate IPC information
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (11 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 12/22] perf intel-pt: Document IPC usage Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:05   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 14/22] perf db-export: Add brief documentation Adrian Hunter
                   ` (8 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Cycle and instruction counts are added to the stack. The IPC of a function
and all functions it calls, is also recorded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/thread-stack.c | 14 ++++++++++++++
 tools/perf/util/thread-stack.h |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 41942c2aaa18..8e390f78486f 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -49,6 +49,8 @@ enum retpoline_state_t {
  * @timestamp: timestamp (if known)
  * @ref: external reference (e.g. db_id of sample)
  * @branch_count: the branch count when the entry was created
+ * @insn_count: the instruction count when the entry was created
+ * @cyc_count the cycle count when the entry was created
  * @db_id: id used for db-export
  * @cp: call path
  * @no_call: a 'call' was not seen
@@ -60,6 +62,8 @@ struct thread_stack_entry {
 	u64 timestamp;
 	u64 ref;
 	u64 branch_count;
+	u64 insn_count;
+	u64 cyc_count;
 	u64 db_id;
 	struct call_path *cp;
 	bool no_call;
@@ -75,6 +79,8 @@ struct thread_stack_entry {
  * @sz: current maximum stack size
  * @trace_nr: current trace number
  * @branch_count: running branch count
+ * @insn_count: running  instruction count
+ * @cyc_count running  cycle count
  * @kernel_start: kernel start address
  * @last_time: last timestamp
  * @crp: call/return processor
@@ -88,6 +94,8 @@ struct thread_stack {
 	size_t sz;
 	u64 trace_nr;
 	u64 branch_count;
+	u64 insn_count;
+	u64 cyc_count;
 	u64 kernel_start;
 	u64 last_time;
 	struct call_return_processor *crp;
@@ -289,6 +297,8 @@ static int thread_stack__call_return(struct thread *thread,
 	cr.call_time = tse->timestamp;
 	cr.return_time = timestamp;
 	cr.branch_count = ts->branch_count - tse->branch_count;
+	cr.insn_count = ts->insn_count - tse->insn_count;
+	cr.cyc_count = ts->cyc_count - tse->cyc_count;
 	cr.db_id = tse->db_id;
 	cr.call_ref = tse->ref;
 	cr.return_ref = ref;
@@ -544,6 +554,8 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
 	tse->timestamp = timestamp;
 	tse->ref = ref;
 	tse->branch_count = ts->branch_count;
+	tse->insn_count = ts->insn_count;
+	tse->cyc_count = ts->cyc_count;
 	tse->cp = cp;
 	tse->no_call = no_call;
 	tse->trace_end = trace_end;
@@ -874,6 +886,8 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 	}
 
 	ts->branch_count += 1;
+	ts->insn_count += sample->insn_cnt;
+	ts->cyc_count += sample->cyc_cnt;
 	ts->last_time = sample->time;
 
 	if (sample->flags & PERF_IP_FLAG_CALL) {
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index 9c45f947f5a9..bddb1daf6453 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -52,6 +52,8 @@ enum {
  * @call_time: timestamp of call (if known)
  * @return_time: timestamp of return (if known)
  * @branch_count: number of branches seen between call and return
+ * @insn_count: approx. number of instructions between call and return
+ * @cyc_count: approx. number of cycles between call and return
  * @call_ref: external reference to 'call' sample (e.g. db_id)
  * @return_ref:  external reference to 'return' sample (e.g. db_id)
  * @db_id: id used for db-export
@@ -65,6 +67,8 @@ struct call_return {
 	u64 call_time;
 	u64 return_time;
 	u64 branch_count;
+	u64 insn_count;
+	u64 cyc_count;
 	u64 call_ref;
 	u64 return_ref;
 	u64 db_id;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 14/22] perf db-export: Add brief documentation
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (12 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 13/22] perf thread-stack: Accumulate IPC information Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:06   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 15/22] perf db-export: Export IPC information Adrian Hunter
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Add brief documentation to explain how the database export maintains
backward and forward compatibility.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/Documentation/db-export.txt | 41 ++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
 create mode 100644 tools/perf/Documentation/db-export.txt

diff --git a/tools/perf/Documentation/db-export.txt b/tools/perf/Documentation/db-export.txt
new file mode 100644
index 000000000000..52ffccb02d55
--- /dev/null
+++ b/tools/perf/Documentation/db-export.txt
@@ -0,0 +1,41 @@
+Database Export
+===============
+
+perf tool's python scripting engine:
+
+	tools/perf/util/scripting-engines/trace-event-python.c
+
+supports scripts:
+
+	tools/perf/scripts/python/export-to-sqlite.py
+	tools/perf/scripts/python/export-to-postgresql.py
+
+which export data to a SQLite3 or PostgreSQL database.
+
+The export process provides records with unique sequential ids which allows the
+data to be imported directly to a database and provides the relationships
+between tables.
+
+Over time it is possible to continue to expand the export while maintaining
+backward and forward compatibility, by following some simple rules:
+
+1. Because of the nature of SQL, existing tables and columns can continue to be
+used so long as the names and meanings (and to some extent data types) remain
+the same.
+
+2. New tables and columns can be added, without affecting existing SQL queries,
+so long as the new names are unique.
+
+3. Scripts that use a database (e.g. exported-sql-viewer.py) can maintain
+backward compatibility by testing for the presence of new tables and columns
+before using them. e.g. function IsSelectable() in exported-sql-viewer.py
+
+4. The export scripts themselves maintain forward compatibility (i.e. an existing
+script will continue to work with new versions of perf) by accepting a variable
+number of arguments (e.g. def call_return_table(*x)) i.e. perf can pass more
+arguments which old scripts will ignore.
+
+5. The scripting engine tests for the existence of script handler functions
+before calling them.  The scripting engine can also test for the support of new
+or optional features by checking for the existence and value of script global
+variables.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 15/22] perf db-export: Export IPC information
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (13 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 14/22] perf db-export: Add brief documentation Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:07   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 16/22] perf scripts python: export-to-sqlite.py: " Adrian Hunter
                   ` (6 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Export cycle and instruction counts on samples and call-returns.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/scripting-engines/trace-event-python.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 22f52b669871..6acb379b53ec 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1111,7 +1111,7 @@ static int python_export_sample(struct db_export *dbe,
 	struct tables *tables = container_of(dbe, struct tables, dbe);
 	PyObject *t;
 
-	t = tuple_new(22);
+	t = tuple_new(24);
 
 	tuple_set_u64(t, 0, es->db_id);
 	tuple_set_u64(t, 1, es->evsel->db_id);
@@ -1135,6 +1135,8 @@ static int python_export_sample(struct db_export *dbe,
 	tuple_set_s32(t, 19, es->sample->flags & PERF_BRANCH_MASK);
 	tuple_set_s32(t, 20, !!(es->sample->flags & PERF_IP_FLAG_IN_TX));
 	tuple_set_u64(t, 21, es->call_path_id);
+	tuple_set_u64(t, 22, es->sample->insn_cnt);
+	tuple_set_u64(t, 23, es->sample->cyc_cnt);
 
 	call_object(tables->sample_handler, t, "sample_table");
 
@@ -1173,7 +1175,7 @@ static int python_export_call_return(struct db_export *dbe,
 	u64 comm_db_id = cr->comm ? cr->comm->db_id : 0;
 	PyObject *t;
 
-	t = tuple_new(12);
+	t = tuple_new(14);
 
 	tuple_set_u64(t, 0, cr->db_id);
 	tuple_set_u64(t, 1, cr->thread->db_id);
@@ -1187,6 +1189,8 @@ static int python_export_call_return(struct db_export *dbe,
 	tuple_set_u64(t, 9, cr->cp->parent->db_id);
 	tuple_set_s32(t, 10, cr->flags);
 	tuple_set_u64(t, 11, cr->parent_db_id);
+	tuple_set_u64(t, 12, cr->insn_count);
+	tuple_set_u64(t, 13, cr->cyc_count);
 
 	call_object(tables->call_return_handler, t, "call_return_table");
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 16/22] perf scripts python: export-to-sqlite.py: Export IPC information
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (14 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 15/22] perf db-export: Export IPC information Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:07   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 17/22] perf scripts python: export-to-postgresql.py: " Adrian Hunter
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Export cycle and instruction counts on samples and calls tables.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/scripts/python/export-to-sqlite.py | 36 ++++++++++++-------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-sqlite.py b/tools/perf/scripts/python/export-to-sqlite.py
index f617e518332f..4542ce89034b 100644
--- a/tools/perf/scripts/python/export-to-sqlite.py
+++ b/tools/perf/scripts/python/export-to-sqlite.py
@@ -218,7 +218,9 @@ if branches:
 		'to_ip		bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 else:
 	do_query(query, 'CREATE TABLE samples ('
 		'id		integer		NOT NULL	PRIMARY KEY,'
@@ -242,7 +244,9 @@ else:
 		'data_src	bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 if perf_db_export_calls or perf_db_export_callchains:
 	do_query(query, 'CREATE TABLE call_paths ('
@@ -263,7 +267,9 @@ if perf_db_export_calls:
 		'return_id	bigint,'
 		'parent_call_path_id	bigint,'
 		'flags		integer,'
-		'parent_id	bigint)')
+		'parent_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 # printf was added to sqlite in version 3.8.3
 sqlite_has_printf = False
@@ -359,6 +365,9 @@ if perf_db_export_calls:
 			'return_time,'
 			'return_time - call_time AS elapsed_time,'
 			'branch_count,'
+			'insn_count,'
+			'cyc_count,'
+			'CASE WHEN cyc_count=0 THEN CAST(0 AS FLOAT) ELSE ROUND(CAST(insn_count AS FLOAT) / cyc_count, 2) END AS IPC,'
 			'call_id,'
 			'return_id,'
 			'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE flags END AS flags,'
@@ -384,7 +393,10 @@ do_query(query, 'CREATE VIEW samples_view AS '
 		'to_sym_offset,'
 		'(SELECT short_name FROM dsos WHERE id = to_dso_id) AS to_dso_short_name,'
 		'(SELECT name FROM branch_types WHERE id = branch_type) AS branch_type_name,'
-		'in_tx'
+		'in_tx,'
+		'insn_count,'
+		'cyc_count,'
+		'CASE WHEN cyc_count=0 THEN CAST(0 AS FLOAT) ELSE ROUND(CAST(insn_count AS FLOAT) / cyc_count, 2) END AS IPC'
 	' FROM samples')
 
 do_query(query, 'END TRANSACTION')
@@ -407,15 +419,15 @@ branch_type_query = QSqlQuery(db)
 branch_type_query.prepare("INSERT INTO branch_types VALUES (?, ?)")
 sample_query = QSqlQuery(db)
 if branches:
-	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
+	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
 else:
-	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
+	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
 if perf_db_export_calls or perf_db_export_callchains:
 	call_path_query = QSqlQuery(db)
 	call_path_query.prepare("INSERT INTO call_paths VALUES (?, ?, ?, ?)")
 if perf_db_export_calls:
 	call_query = QSqlQuery(db)
-	call_query.prepare("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
+	call_query.prepare("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
 
 def trace_begin():
 	printdate("Writing records...")
@@ -427,10 +439,10 @@ def trace_begin():
 	comm_table(0, "unknown")
 	dso_table(0, 0, "unknown", "unknown", "")
 	symbol_table(0, 0, 0, 0, 0, "unknown")
-	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 	if perf_db_export_calls or perf_db_export_callchains:
 		call_path_table(0, 0, 0, 0)
-		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 
 unhandled_count = 0
 
@@ -486,14 +498,14 @@ def sample_table(*x):
 	if branches:
 		for xx in x[0:15]:
 			sample_query.addBindValue(str(xx))
-		for xx in x[19:22]:
+		for xx in x[19:24]:
 			sample_query.addBindValue(str(xx))
 		do_query_(sample_query)
 	else:
-		bind_exec(sample_query, 22, x)
+		bind_exec(sample_query, 24, x)
 
 def call_path_table(*x):
 	bind_exec(call_path_query, 4, x)
 
 def call_return_table(*x):
-	bind_exec(call_query, 12, x)
+	bind_exec(call_query, 14, x)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 17/22] perf scripts python: export-to-postgresql.py: Export IPC information
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (15 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 16/22] perf scripts python: export-to-sqlite.py: " Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:08   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Adrian Hunter
                   ` (4 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Export cycle and instruction counts on samples and calls tables.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../scripts/python/export-to-postgresql.py    | 36 ++++++++++++-------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py b/tools/perf/scripts/python/export-to-postgresql.py
index b2f481b0d28d..93225c02117e 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -394,7 +394,9 @@ if branches:
 		'to_ip		bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 else:
 	do_query(query, 'CREATE TABLE samples ('
 		'id		bigint		NOT NULL,'
@@ -418,7 +420,9 @@ else:
 		'data_src	bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 if perf_db_export_calls or perf_db_export_callchains:
 	do_query(query, 'CREATE TABLE call_paths ('
@@ -439,7 +443,9 @@ if perf_db_export_calls:
 		'return_id	bigint,'
 		'parent_call_path_id	bigint,'
 		'flags		integer,'
-		'parent_id	bigint)')
+		'parent_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 do_query(query, 'CREATE VIEW machines_view AS '
 	'SELECT '
@@ -521,6 +527,9 @@ if perf_db_export_calls:
 			'return_time,'
 			'return_time - call_time AS elapsed_time,'
 			'branch_count,'
+			'insn_count,'
+			'cyc_count,'
+			'CASE WHEN cyc_count=0 THEN CAST(0 AS NUMERIC(20, 2)) ELSE CAST((CAST(insn_count AS FLOAT) / cyc_count) AS NUMERIC(20, 2)) END AS IPC,'
 			'call_id,'
 			'return_id,'
 			'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE CAST ( flags AS VARCHAR(6) ) END AS flags,'
@@ -546,7 +555,10 @@ do_query(query, 'CREATE VIEW samples_view AS '
 		'to_sym_offset,'
 		'(SELECT short_name FROM dsos WHERE id = to_dso_id) AS to_dso_short_name,'
 		'(SELECT name FROM branch_types WHERE id = branch_type) AS branch_type_name,'
-		'in_tx'
+		'in_tx,'
+		'insn_count,'
+		'cyc_count,'
+		'CASE WHEN cyc_count=0 THEN CAST(0 AS NUMERIC(20, 2)) ELSE CAST((CAST(insn_count AS FLOAT) / cyc_count) AS NUMERIC(20, 2)) END AS IPC'
 	' FROM samples')
 
 
@@ -618,10 +630,10 @@ def trace_begin():
 	comm_table(0, "unknown")
 	dso_table(0, 0, "unknown", "unknown", "")
 	symbol_table(0, 0, 0, 0, 0, "unknown")
-	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 	if perf_db_export_calls or perf_db_export_callchains:
 		call_path_table(0, 0, 0, 0)
-		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 
 unhandled_count = 0
 
@@ -772,11 +784,11 @@ def branch_type_table(branch_type, name, *x):
 	value = struct.pack(fmt, 2, 4, branch_type, n, name)
 	branch_type_file.write(value)
 
-def sample_table(sample_id, evsel_id, machine_id, thread_id, comm_id, dso_id, symbol_id, sym_offset, ip, time, cpu, to_dso_id, to_symbol_id, to_sym_offset, to_ip, period, weight, transaction, data_src, branch_type, in_tx, call_path_id, *x):
+def sample_table(sample_id, evsel_id, machine_id, thread_id, comm_id, dso_id, symbol_id, sym_offset, ip, time, cpu, to_dso_id, to_symbol_id, to_sym_offset, to_ip, period, weight, transaction, data_src, branch_type, in_tx, call_path_id, insn_cnt, cyc_cnt, *x):
 	if branches:
-		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiiiBiq", 18, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 4, branch_type, 1, in_tx, 8, call_path_id)
+		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiiiBiqiqiq", 20, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 4, branch_type, 1, in_tx, 8, call_path_id, 8, insn_cnt, 8, cyc_cnt)
 	else:
-		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiqiqiqiqiiiBiq", 22, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 8, period, 8, weight, 8, transaction, 8, data_src, 4, branch_type, 1, in_tx, 8, call_path_id)
+		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiqiqiqiqiiiBiqiqiq", 24, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 8, period, 8, weight, 8, transaction, 8, data_src, 4, branch_type, 1, in_tx, 8, call_path_id, 8, insn_cnt, 8, cyc_cnt)
 	sample_file.write(value)
 
 def call_path_table(cp_id, parent_id, symbol_id, ip, *x):
@@ -784,7 +796,7 @@ def call_path_table(cp_id, parent_id, symbol_id, ip, *x):
 	value = struct.pack(fmt, 4, 8, cp_id, 8, parent_id, 8, symbol_id, 8, ip)
 	call_path_file.write(value)
 
-def call_return_table(cr_id, thread_id, comm_id, call_path_id, call_time, return_time, branch_count, call_id, return_id, parent_call_path_id, flags, parent_id, *x):
-	fmt = "!hiqiqiqiqiqiqiqiqiqiqiiiq"
-	value = struct.pack(fmt, 12, 8, cr_id, 8, thread_id, 8, comm_id, 8, call_path_id, 8, call_time, 8, return_time, 8, branch_count, 8, call_id, 8, return_id, 8, parent_call_path_id, 4, flags, 8, parent_id)
+def call_return_table(cr_id, thread_id, comm_id, call_path_id, call_time, return_time, branch_count, call_id, return_id, parent_call_path_id, flags, parent_id, insn_cnt, cyc_cnt, *x):
+	fmt = "!hiqiqiqiqiqiqiqiqiqiqiiiqiqiq"
+	value = struct.pack(fmt, 14, 8, cr_id, 8, thread_id, 8, comm_id, 8, call_path_id, 8, call_time, 8, return_time, 8, branch_count, 8, call_id, 8, return_id, 8, parent_call_path_id, 4, flags, 8, parent_id, 8, insn_cnt, 8, cyc_cnt)
 	call_file.write(value)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (16 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 17/22] perf scripts python: export-to-postgresql.py: " Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-05-31 16:44   ` Arnaldo Carvalho de Melo
  2019-06-17 19:09   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 19/22] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams Adrian Hunter
                   ` (3 subsequent siblings)
  21 siblings, 2 replies; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Enhance the "All branches" and "Selected branches" reports to display IPC
information if it is available.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../scripts/python/exported-sql-viewer.py     | 102 ++++++++++++++----
 1 file changed, 83 insertions(+), 19 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index 6fe553258ce5..a607235c8cd9 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -1369,11 +1369,11 @@ class FetchMoreRecordsBar():
 
 class BranchLevelTwoItem():
 
-	def __init__(self, row, text, parent_item):
+	def __init__(self, row, col, text, parent_item):
 		self.row = row
 		self.parent_item = parent_item
-		self.data = [""] * 8
-		self.data[7] = text
+		self.data = [""] * (col + 1)
+		self.data[col] = text
 		self.level = 2
 
 	def getParentItem(self):
@@ -1405,6 +1405,7 @@ class BranchLevelOneItem():
 		self.dbid = data[0]
 		self.level = 1
 		self.query_done = False
+		self.br_col = len(self.data) - 1
 
 	def getChildItem(self, row):
 		return self.child_items[row]
@@ -1485,7 +1486,7 @@ class BranchLevelOneItem():
 				while k < 15:
 					byte_str += "   "
 					k += 1
-				self.child_items.append(BranchLevelTwoItem(0, byte_str + " " + text, self))
+				self.child_items.append(BranchLevelTwoItem(0, self.br_col, byte_str + " " + text, self))
 				self.child_count += 1
 			else:
 				return
@@ -1536,16 +1537,37 @@ class BranchRootItem():
 	def getData(self, column):
 		return ""
 
+# Calculate instructions per cycle
+
+def CalcIPC(cyc_cnt, insn_cnt):
+	if cyc_cnt and insn_cnt:
+		ipc = Decimal(float(insn_cnt) / cyc_cnt)
+		ipc = str(ipc.quantize(Decimal(".01"), rounding=ROUND_HALF_UP))
+	else:
+		ipc = "0"
+	return ipc
+
 # Branch data preparation
 
-def BranchDataPrep(query):
-	data = []
-	for i in xrange(0, 8):
-		data.append(query.value(i))
+def BranchDataPrepBr(query, data):
 	data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
 			" (" + dsoname(query.value(11)) + ")" + " -> " +
 			tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
 			" (" + dsoname(query.value(15)) + ")")
+
+def BranchDataPrepIPC(query, data):
+	insn_cnt = query.value(16)
+	cyc_cnt = query.value(17)
+	ipc = CalcIPC(cyc_cnt, insn_cnt)
+	data.append(insn_cnt)
+	data.append(cyc_cnt)
+	data.append(ipc)
+
+def BranchDataPrep(query):
+	data = []
+	for i in xrange(0, 8):
+		data.append(query.value(i))
+	BranchDataPrepBr(query, data)
 	return data
 
 def BranchDataPrepWA(query):
@@ -1555,10 +1577,26 @@ def BranchDataPrepWA(query):
 	data.append("{:>19}".format(query.value(1)))
 	for i in xrange(2, 8):
 		data.append(query.value(i))
-	data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
-			" (" + dsoname(query.value(11)) + ")" + " -> " +
-			tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
-			" (" + dsoname(query.value(15)) + ")")
+	BranchDataPrepBr(query, data)
+	return data
+
+def BranchDataWithIPCPrep(query):
+	data = []
+	for i in xrange(0, 8):
+		data.append(query.value(i))
+	BranchDataPrepIPC(query, data)
+	BranchDataPrepBr(query, data)
+	return data
+
+def BranchDataWithIPCPrepWA(query):
+	data = []
+	data.append(query.value(0))
+	# Workaround pyside failing to handle large integers (i.e. time) in python3 by converting to a string
+	data.append("{:>19}".format(query.value(1)))
+	for i in xrange(2, 8):
+		data.append(query.value(i))
+	BranchDataPrepIPC(query, data)
+	BranchDataPrepBr(query, data)
 	return data
 
 # Branch data model
@@ -1572,10 +1610,20 @@ class BranchModel(TreeModel):
 		self.event_id = event_id
 		self.more = True
 		self.populated = 0
+		self.have_ipc = IsSelectable(glb.db, "samples", columns = "insn_count, cyc_count")
+		if self.have_ipc:
+			select_ipc = ", insn_count, cyc_count"
+			prep_fn = BranchDataWithIPCPrep
+			prep_wa_fn = BranchDataWithIPCPrepWA
+		else:
+			select_ipc = ""
+			prep_fn = BranchDataPrep
+			prep_wa_fn = BranchDataPrepWA
 		sql = ("SELECT samples.id, time, cpu, comm, pid, tid, branch_types.name,"
 			" CASE WHEN in_tx = '0' THEN 'No' ELSE 'Yes' END,"
 			" ip, symbols.name, sym_offset, dsos.short_name,"
 			" to_ip, to_symbols.name, to_sym_offset, to_dsos.short_name"
+			+ select_ipc +
 			" FROM samples"
 			" INNER JOIN comms ON comm_id = comms.id"
 			" INNER JOIN threads ON thread_id = threads.id"
@@ -1589,9 +1637,9 @@ class BranchModel(TreeModel):
 			" ORDER BY samples.id"
 			" LIMIT " + str(glb_chunk_sz))
 		if pyside_version_1 and sys.version_info[0] == 3:
-			prep = BranchDataPrepWA
+			prep = prep_fn
 		else:
-			prep = BranchDataPrep
+			prep = prep_wa_fn
 		self.fetcher = SQLFetcher(glb, sql, prep, self.AddSample)
 		self.fetcher.done.connect(self.Update)
 		self.fetcher.Fetch(glb_chunk_sz)
@@ -1600,13 +1648,23 @@ class BranchModel(TreeModel):
 		return BranchRootItem()
 
 	def columnCount(self, parent=None):
-		return 8
+		if self.have_ipc:
+			return 11
+		else:
+			return 8
 
 	def columnHeader(self, column):
-		return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
+		if self.have_ipc:
+			return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Insn Cnt", "Cyc Cnt", "IPC", "Branch")[column]
+		else:
+			return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
 
 	def columnFont(self, column):
-		if column != 7:
+		if self.have_ipc:
+			br_col = 10
+		else:
+			br_col = 7
+		if column != br_col:
 			return None
 		return QFont("Monospace")
 
@@ -2114,10 +2172,10 @@ def GetEventList(db):
 
 # Is a table selectable
 
-def IsSelectable(db, table, sql = ""):
+def IsSelectable(db, table, sql = "", columns = "*"):
 	query = QSqlQuery(db)
 	try:
-		QueryExec(query, "SELECT * FROM " + table + " " + sql + " LIMIT 1")
+		QueryExec(query, "SELECT " + columns + " FROM " + table + " " + sql + " LIMIT 1")
 	except:
 		return False
 	return True
@@ -2854,6 +2912,12 @@ cd xed
 sudo ./mfile.py --prefix=/usr/local install
 sudo ldconfig
 </pre>
+<h3>Instructions per Cycle (IPC)</h3>
+If available, IPC information is displayed in columns 'insn_cnt', 'cyc_cnt' and 'IPC'.
+<p><b>Intel PT note:</b> The information applies to the blocks of code ending with, and including, that branch.
+Due to the granularity of timing information, the number of cycles for some code blocks will not be known.
+In that case, 'insn_cnt', 'cyc_cnt' and 'IPC' are zero, but when 'IPC' is displayed it covers the period
+since the previous displayed 'IPC'.
 <h3>Find</h3>
 Ctrl-F displays a Find bar which finds substrings by either an exact match or a regular expression match.
 Refer to Python documentation for the regular expression syntax.
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 19/22] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (17 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:09   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 20/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph Adrian Hunter
                   ` (2 subsequent siblings)
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Add a parameter to call graph and call tree, to determine whether IPC
information is available.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../scripts/python/exported-sql-viewer.py     | 73 +++++++++++--------
 1 file changed, 41 insertions(+), 32 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index a607235c8cd9..b3508bd4eb00 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -200,9 +200,10 @@ class Thread(QThread):
 
 class TreeModel(QAbstractItemModel):
 
-	def __init__(self, glb, parent=None):
+	def __init__(self, glb, params, parent=None):
 		super(TreeModel, self).__init__(parent)
 		self.glb = glb
+		self.params = params
 		self.root = self.GetRoot()
 		self.last_row_read = 0
 
@@ -463,8 +464,9 @@ class FindBar():
 
 class CallGraphLevelItemBase(object):
 
-	def __init__(self, glb, row, parent_item):
+	def __init__(self, glb, params, row, parent_item):
 		self.glb = glb
+		self.params = params
 		self.row = row
 		self.parent_item = parent_item
 		self.query_done = False;
@@ -503,8 +505,8 @@ class CallGraphLevelItemBase(object):
 
 class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
-		super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
+		super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.call_path_id = call_path_id
@@ -525,7 +527,7 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" GROUP BY call_path_id, name, short_name"
 					" ORDER BY call_path_id")
 		while query.next():
-			child_item = CallGraphLevelThreeItem(self.glb, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			child_item = CallGraphLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -533,8 +535,8 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
-		super(CallGraphLevelThreeItem, self).__init__(glb, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
+		super(CallGraphLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
 		dso = dsoname(dso)
 		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = call_path_id
@@ -543,8 +545,8 @@ class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 
 class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallGraphLevelTwoItem, self).__init__(glb, row, comm_id, thread_id, 1, 0, 0, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
+		super(CallGraphLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 1, 0, 0, parent_item)
 		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
@@ -561,8 +563,8 @@ class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
 
 class CallGraphLevelOneItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, comm, parent_item):
-		super(CallGraphLevelOneItem, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, comm, parent_item):
+		super(CallGraphLevelOneItem, self).__init__(glb, params, row, parent_item)
 		self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
@@ -574,7 +576,7 @@ class CallGraphLevelOneItem(CallGraphLevelItemBase):
 					" INNER JOIN threads ON thread_id = threads.id"
 					" WHERE comm_id = " + str(self.dbid))
 		while query.next():
-			child_item = CallGraphLevelTwoItem(self.glb, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
+			child_item = CallGraphLevelTwoItem(self.glb, self.params, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -582,8 +584,8 @@ class CallGraphLevelOneItem(CallGraphLevelItemBase):
 
 class CallGraphRootItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb):
-		super(CallGraphRootItem, self).__init__(glb, 0, None)
+	def __init__(self, glb, params):
+		super(CallGraphRootItem, self).__init__(glb, params, 0, None)
 		self.dbid = 0
 		self.query_done = True;
 		query = QSqlQuery(glb.db)
@@ -591,16 +593,23 @@ class CallGraphRootItem(CallGraphLevelItemBase):
 		while query.next():
 			if not query.value(0):
 				continue
-			child_item = CallGraphLevelOneItem(glb, self.child_count, query.value(0), query.value(1), self)
+			child_item = CallGraphLevelOneItem(glb, params, self.child_count, query.value(0), query.value(1), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
+# Call graph model parameters
+
+class CallGraphModelParams():
+
+	def __init__(self, glb, parent=None):
+		self.have_ipc = IsSelectable(glb.db, "calls", columns = "insn_count, cyc_count")
+
 # Context-sensitive call graph data model base
 
 class CallGraphModelBase(TreeModel):
 
 	def __init__(self, glb, parent=None):
-		super(CallGraphModelBase, self).__init__(glb, parent)
+		super(CallGraphModelBase, self).__init__(glb, CallGraphModelParams(glb), parent)
 
 	def FindSelect(self, value, pattern, query):
 		if pattern:
@@ -682,7 +691,7 @@ class CallGraphModel(CallGraphModelBase):
 		super(CallGraphModel, self).__init__(glb, parent)
 
 	def GetRoot(self):
-		return CallGraphRootItem(self.glb)
+		return CallGraphRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
 		return 7
@@ -729,8 +738,8 @@ class CallGraphModel(CallGraphModelBase):
 
 class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, calls_id, time, branch_count, parent_item):
-		super(CallTreeLevelTwoPlusItemBase, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item):
+		super(CallTreeLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.calls_id = calls_id
@@ -752,7 +761,7 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" WHERE calls.parent_id = " + str(self.calls_id) + comm_thread +
 					" ORDER BY call_time, calls.id")
 		while query.next():
-			child_item = CallTreeLevelThreeItem(self.glb, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			child_item = CallTreeLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -760,8 +769,8 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, calls_id, name, dso, count, time, branch_count, parent_item):
-		super(CallTreeLevelThreeItem, self).__init__(glb, row, comm_id, thread_id, calls_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, name, dso, count, time, branch_count, parent_item):
+		super(CallTreeLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item)
 		dso = dsoname(dso)
 		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = calls_id
@@ -770,8 +779,8 @@ class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 
 class CallTreeLevelTwoItem(CallTreeLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallTreeLevelTwoItem, self).__init__(glb, row, comm_id, thread_id, 0, 0, 0, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
+		super(CallTreeLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 0, 0, 0, parent_item)
 		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
@@ -788,8 +797,8 @@ class CallTreeLevelTwoItem(CallTreeLevelTwoPlusItemBase):
 
 class CallTreeLevelOneItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, comm, parent_item):
-		super(CallTreeLevelOneItem, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, comm, parent_item):
+		super(CallTreeLevelOneItem, self).__init__(glb, params, row, parent_item)
 		self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
@@ -801,7 +810,7 @@ class CallTreeLevelOneItem(CallGraphLevelItemBase):
 					" INNER JOIN threads ON thread_id = threads.id"
 					" WHERE comm_id = " + str(self.dbid))
 		while query.next():
-			child_item = CallTreeLevelTwoItem(self.glb, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
+			child_item = CallTreeLevelTwoItem(self.glb, self.params, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -809,8 +818,8 @@ class CallTreeLevelOneItem(CallGraphLevelItemBase):
 
 class CallTreeRootItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb):
-		super(CallTreeRootItem, self).__init__(glb, 0, None)
+	def __init__(self, glb, params):
+		super(CallTreeRootItem, self).__init__(glb, params, 0, None)
 		self.dbid = 0
 		self.query_done = True;
 		query = QSqlQuery(glb.db)
@@ -818,7 +827,7 @@ class CallTreeRootItem(CallGraphLevelItemBase):
 		while query.next():
 			if not query.value(0):
 				continue
-			child_item = CallTreeLevelOneItem(glb, self.child_count, query.value(0), query.value(1), self)
+			child_item = CallTreeLevelOneItem(glb, params, self.child_count, query.value(0), query.value(1), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -830,7 +839,7 @@ class CallTreeModel(CallGraphModelBase):
 		super(CallTreeModel, self).__init__(glb, parent)
 
 	def GetRoot(self):
-		return CallTreeRootItem(self.glb)
+		return CallTreeRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
 		return 7
@@ -1606,7 +1615,7 @@ class BranchModel(TreeModel):
 	progress = Signal(object)
 
 	def __init__(self, glb, event_id, where_clause, parent=None):
-		super(BranchModel, self).__init__(glb, parent)
+		super(BranchModel, self).__init__(glb, None, parent)
 		self.event_id = event_id
 		self.more = True
 		self.populated = 0
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 20/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (18 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 19/22] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:10   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 21/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree Adrian Hunter
  2019-05-20 11:37 ` [PATCH 22/22] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated Adrian Hunter
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Enhance the call graph to display IPC information if it is available.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../scripts/python/exported-sql-viewer.py     | 69 +++++++++++++++----
 1 file changed, 56 insertions(+), 13 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index b3508bd4eb00..f5b1b63995b0 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -505,18 +505,24 @@ class CallGraphLevelItemBase(object):
 
 class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, time, insn_cnt, cyc_cnt, branch_count, parent_item):
 		super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.call_path_id = call_path_id
+		self.insn_cnt = insn_cnt
+		self.cyc_cnt = cyc_cnt
 		self.branch_count = branch_count
 		self.time = time
 
 	def Select(self):
 		self.query_done = True;
 		query = QSqlQuery(self.glb.db)
-		QueryExec(query, "SELECT call_path_id, name, short_name, COUNT(calls.id), SUM(return_time - call_time), SUM(branch_count)"
+		if self.params.have_ipc:
+			ipc_str = ", SUM(insn_count), SUM(cyc_count)"
+		else:
+			ipc_str = ""
+		QueryExec(query, "SELECT call_path_id, name, short_name, COUNT(calls.id), SUM(return_time - call_time)" + ipc_str + ", SUM(branch_count)"
 					" FROM calls"
 					" INNER JOIN call_paths ON calls.call_path_id = call_paths.id"
 					" INNER JOIN symbols ON call_paths.symbol_id = symbols.id"
@@ -527,7 +533,15 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" GROUP BY call_path_id, name, short_name"
 					" ORDER BY call_path_id")
 		while query.next():
-			child_item = CallGraphLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			if self.params.have_ipc:
+				insn_cnt = int(query.value(5))
+				cyc_cnt = int(query.value(6))
+				branch_count = int(query.value(7))
+			else:
+				insn_cnt = 0
+				cyc_cnt = 0
+				branch_count = int(query.value(5))
+			child_item = CallGraphLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), insn_cnt, cyc_cnt, branch_count, self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -535,10 +549,17 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
-		super(CallGraphLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, name, dso, count, time, insn_cnt, cyc_cnt, branch_count, parent_item):
+		super(CallGraphLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, call_path_id, time, insn_cnt, cyc_cnt, branch_count, parent_item)
 		dso = dsoname(dso)
-		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
+		if self.params.have_ipc:
+			insn_pcnt = PercentToOneDP(insn_cnt, parent_item.insn_cnt)
+			cyc_pcnt = PercentToOneDP(cyc_cnt, parent_item.cyc_cnt)
+			br_pcnt = PercentToOneDP(branch_count, parent_item.branch_count)
+			ipc = CalcIPC(cyc_cnt, insn_cnt)
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(insn_cnt), insn_pcnt, str(cyc_cnt), cyc_pcnt, ipc, str(branch_count), br_pcnt ]
+		else:
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = call_path_id
 
 # Context-sensitive call graph data model level two item
@@ -546,18 +567,28 @@ class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
 
 	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallGraphLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 1, 0, 0, parent_item)
-		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
+		super(CallGraphLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 1, 0, 0, 0, 0, parent_item)
+		if self.params.have_ipc:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
 	def Select(self):
 		super(CallGraphLevelTwoItem, self).Select()
 		for child_item in self.child_items:
 			self.time += child_item.time
+			self.insn_cnt += child_item.insn_cnt
+			self.cyc_cnt += child_item.cyc_cnt
 			self.branch_count += child_item.branch_count
 		for child_item in self.child_items:
 			child_item.data[4] = PercentToOneDP(child_item.time, self.time)
-			child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			if self.params.have_ipc:
+				child_item.data[6] = PercentToOneDP(child_item.insn_cnt, self.insn_cnt)
+				child_item.data[8] = PercentToOneDP(child_item.cyc_cnt, self.cyc_cnt)
+				child_item.data[11] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			else:
+				child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
 
 # Context-sensitive call graph data model level one item
 
@@ -565,7 +596,10 @@ class CallGraphLevelOneItem(CallGraphLevelItemBase):
 
 	def __init__(self, glb, params, row, comm_id, comm, parent_item):
 		super(CallGraphLevelOneItem, self).__init__(glb, params, row, parent_item)
-		self.data = [comm, "", "", "", "", "", ""]
+		if self.params.have_ipc:
+			self.data = [comm, "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
 	def Select(self):
@@ -694,14 +728,23 @@ class CallGraphModel(CallGraphModelBase):
 		return CallGraphRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
-		return 7
+		if self.params.have_ipc:
+			return 12
+		else:
+			return 7
 
 	def columnHeader(self, column):
-		headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
+		if self.params.have_ipc:
+			headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Insn Cnt", "Insn Cnt (%)", "Cyc Cnt", "Cyc Cnt (%)", "IPC", "Branch Count ", "Branch Count (%) "]
+		else:
+			headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
 		return headers[column]
 
 	def columnAlignment(self, column):
-		alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		if self.params.have_ipc:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		else:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
 		return alignment[column]
 
 	def DoFindSelect(self, query, match):
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 21/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (19 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 20/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:11   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2019-05-20 11:37 ` [PATCH 22/22] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated Adrian Hunter
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

Enhance the call tree to display IPC information if it is available.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 .../scripts/python/exported-sql-viewer.py     | 69 +++++++++++++++----
 1 file changed, 56 insertions(+), 13 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index f5b1b63995b0..94489cf2ce0e 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -781,11 +781,13 @@ class CallGraphModel(CallGraphModelBase):
 
 class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item):
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, time, insn_cnt, cyc_cnt, branch_count, parent_item):
 		super(CallTreeLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.calls_id = calls_id
+		self.insn_cnt = insn_cnt
+		self.cyc_cnt = cyc_cnt
 		self.branch_count = branch_count
 		self.time = time
 
@@ -795,8 +797,12 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 			comm_thread = " AND comm_id = " + str(self.comm_id) + " AND thread_id = " + str(self.thread_id)
 		else:
 			comm_thread = ""
+		if self.params.have_ipc:
+			ipc_str = ", insn_count, cyc_count"
+		else:
+			ipc_str = ""
 		query = QSqlQuery(self.glb.db)
-		QueryExec(query, "SELECT calls.id, name, short_name, call_time, return_time - call_time, branch_count"
+		QueryExec(query, "SELECT calls.id, name, short_name, call_time, return_time - call_time" + ipc_str + ", branch_count"
 					" FROM calls"
 					" INNER JOIN call_paths ON calls.call_path_id = call_paths.id"
 					" INNER JOIN symbols ON call_paths.symbol_id = symbols.id"
@@ -804,7 +810,15 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" WHERE calls.parent_id = " + str(self.calls_id) + comm_thread +
 					" ORDER BY call_time, calls.id")
 		while query.next():
-			child_item = CallTreeLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			if self.params.have_ipc:
+				insn_cnt = int(query.value(5))
+				cyc_cnt = int(query.value(6))
+				branch_count = int(query.value(7))
+			else:
+				insn_cnt = 0
+				cyc_cnt = 0
+				branch_count = int(query.value(5))
+			child_item = CallTreeLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), insn_cnt, cyc_cnt, branch_count, self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -812,10 +826,17 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, name, dso, count, time, branch_count, parent_item):
-		super(CallTreeLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, name, dso, count, time, insn_cnt, cyc_cnt, branch_count, parent_item):
+		super(CallTreeLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, calls_id, time, insn_cnt, cyc_cnt, branch_count, parent_item)
 		dso = dsoname(dso)
-		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
+		if self.params.have_ipc:
+			insn_pcnt = PercentToOneDP(insn_cnt, parent_item.insn_cnt)
+			cyc_pcnt = PercentToOneDP(cyc_cnt, parent_item.cyc_cnt)
+			br_pcnt = PercentToOneDP(branch_count, parent_item.branch_count)
+			ipc = CalcIPC(cyc_cnt, insn_cnt)
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(insn_cnt), insn_pcnt, str(cyc_cnt), cyc_pcnt, ipc, str(branch_count), br_pcnt ]
+		else:
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = calls_id
 
 # Call tree data model level two item
@@ -823,18 +844,28 @@ class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 class CallTreeLevelTwoItem(CallTreeLevelTwoPlusItemBase):
 
 	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallTreeLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 0, 0, 0, parent_item)
-		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
+		super(CallTreeLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 0, 0, 0, 0, 0, parent_item)
+		if self.params.have_ipc:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
 	def Select(self):
 		super(CallTreeLevelTwoItem, self).Select()
 		for child_item in self.child_items:
 			self.time += child_item.time
+			self.insn_cnt += child_item.insn_cnt
+			self.cyc_cnt += child_item.cyc_cnt
 			self.branch_count += child_item.branch_count
 		for child_item in self.child_items:
 			child_item.data[4] = PercentToOneDP(child_item.time, self.time)
-			child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			if self.params.have_ipc:
+				child_item.data[6] = PercentToOneDP(child_item.insn_cnt, self.insn_cnt)
+				child_item.data[8] = PercentToOneDP(child_item.cyc_cnt, self.cyc_cnt)
+				child_item.data[11] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			else:
+				child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
 
 # Call tree data model level one item
 
@@ -842,7 +873,10 @@ class CallTreeLevelOneItem(CallGraphLevelItemBase):
 
 	def __init__(self, glb, params, row, comm_id, comm, parent_item):
 		super(CallTreeLevelOneItem, self).__init__(glb, params, row, parent_item)
-		self.data = [comm, "", "", "", "", "", ""]
+		if self.params.have_ipc:
+			self.data = [comm, "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
 	def Select(self):
@@ -885,14 +919,23 @@ class CallTreeModel(CallGraphModelBase):
 		return CallTreeRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
-		return 7
+		if self.params.have_ipc:
+			return 12
+		else:
+			return 7
 
 	def columnHeader(self, column):
-		headers = ["Call Path", "Object", "Call Time", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
+		if self.params.have_ipc:
+			headers = ["Call Path", "Object", "Call Time", "Time (ns) ", "Time (%) ", "Insn Cnt", "Insn Cnt (%)", "Cyc Cnt", "Cyc Cnt (%)", "IPC", "Branch Count ", "Branch Count (%) "]
+		else:
+			headers = ["Call Path", "Object", "Call Time", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
 		return headers[column]
 
 	def columnAlignment(self, column):
-		alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		if self.params.have_ipc:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		else:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
 		return alignment[column]
 
 	def DoFindSelect(self, query, match):
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 22/22] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated
  2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
                   ` (20 preceding siblings ...)
  2019-05-20 11:37 ` [PATCH 21/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree Adrian Hunter
@ 2019-05-20 11:37 ` Adrian Hunter
  2019-06-17 19:11   ` [tip:perf/core] " tip-bot for Adrian Hunter
  21 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-05-20 11:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

The user probably wants to replace the find text, so select the find
text when the find bar is activated. That is fairly standard behaviour
for search text entry. Entering text will replace the current text, but
using edit keys (arrows, home, end etc) cancels the selection and
enables editing.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/scripts/python/exported-sql-viewer.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index 94489cf2ce0e..6e7934f2ac9a 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -400,6 +400,7 @@ class FindBar():
 
 	def Activate(self):
 		self.bar.show()
+		self.textbox.lineEdit().selectAll()
 		self.textbox.setFocus()
 
 	def Deactivate(self):
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script
  2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
@ 2019-05-20 13:58   ` Sasha Levin
  2019-05-20 14:45   ` Arnaldo Carvalho de Melo
  2019-05-30  7:53   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2 siblings, 0 replies; 52+ messages in thread
From: Sasha Levin @ 2019-05-20 13:58 UTC (permalink / raw)
  To: Sasha Levin, Adrian Hunter, Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, linux-kernel, stable

Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 90e457f7be08 perf tools: Add Intel PT support.

The bot has tested the following trees: v5.1.3, v5.0.17, v4.19.44, v4.14.120, v4.9.177, v4.4.180.

v5.1.3: Build OK!
v5.0.17: Build OK!
v4.19.44: Failed to apply! Possible dependencies:
    4eb068157121 ("perf script: Make itrace script default to all calls")

v4.14.120: Failed to apply! Possible dependencies:
    20d9c478b01a ("pert tools: Add queue management functionality")
    440a23b34c06 ("perf tools: Add initial entry point for decoder CoreSight traces")
    4eb068157121 ("perf script: Make itrace script default to all calls")
    68ffe3902898 ("perf tools: Add decoder mechanic to support dumping trace data")
    9f878b29da96 ("perf tools: Add full support for CoreSight trace decoding")
    b12235b113cf ("perf tools: Add mechanic to synthesise CoreSight trace packets")
    ffd3d18c20b8 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")

v4.9.177: Failed to apply! Possible dependencies:
    20d9c478b01a ("pert tools: Add queue management functionality")
    3bdafdffa9ba ("perf auxtrace: Add itrace option to output ptwrite events")
    440a23b34c06 ("perf tools: Add initial entry point for decoder CoreSight traces")
    4eb068157121 ("perf script: Make itrace script default to all calls")
    68ffe3902898 ("perf tools: Add decoder mechanic to support dumping trace data")
    70d110d77599 ("perf auxtrace: Add itrace option to output power events")
    9f878b29da96 ("perf tools: Add full support for CoreSight trace decoding")
    b12235b113cf ("perf tools: Add mechanic to synthesise CoreSight trace packets")
    ffd3d18c20b8 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")

v4.4.180: Failed to apply! Possible dependencies:
    3bdafdffa9ba ("perf auxtrace: Add itrace option to output ptwrite events")
    3becf4525d9c ("perf tools: Add sink configuration for cs_etm PMU")
    403cacb8a25e ("perf unwind: Don't mix LIBUNWIND_LIBS into LIBUNWIND_LDFLAGS")
    440a23b34c06 ("perf tools: Add initial entry point for decoder CoreSight traces")
    4eb068157121 ("perf script: Make itrace script default to all calls")
    5a155bb77a67 ("perf build: Remove all condition feature check {C,LD}FLAGS")
    68ffe3902898 ("perf tools: Add decoder mechanic to support dumping trace data")
    70d110d77599 ("perf auxtrace: Add itrace option to output power events")
    7e21b0d579a4 ("perf tools: Make coresight PMU listable")
    9d8e14d306ef ("perf unwind: Separate local/remote libunwind config")
    9f878b29da96 ("perf tools: Add full support for CoreSight trace decoding")
    a818c563ae16 ("perf tools: Add coresight etm PMU record capabilities")
    b12235b113cf ("perf tools: Add mechanic to synthesise CoreSight trace packets")
    d1706b39f0af ("perf tools: Add support for skipping itrace instructions")
    ffd3d18c20b8 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")


How should we proceed with this patch?

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script
  2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
  2019-05-20 13:58   ` Sasha Levin
@ 2019-05-20 14:45   ` Arnaldo Carvalho de Melo
  2019-05-31 16:45     ` Arnaldo Carvalho de Melo
  2019-05-30  7:53   ` [tip:perf/core] " tip-bot for Adrian Hunter
  2 siblings, 1 reply; 52+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-20 14:45 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, linux-kernel

Em Mon, May 20, 2019 at 02:37:07PM +0300, Adrian Hunter escreveu:
> Commit 4eb068157121 ("perf script: Make itrace script default to all
> calls") does not work because 'use_browser' is being used to determine
> whether to default to periodic sampling (i.e. better for perf report).
> The result is that nothing but CBR events display for perf script
> when no --itrace option is specified.
> 
> Fix by using 'default_no_sample' and 'inject' instead.

Applied 1-3 for now, concentrating on fixes, will process 4-22 later.

- Arnaldo
 
> Example:
> 
>  Before:
> 
>   $ perf record -e intel_pt/cyc/u ls
>   $ perf script > cmp1.txt
>   $ perf script --itrace=cepwx > cmp2.txt
>   $ diff -sq cmp1.txt cmp2.txt
>   Files cmp1.txt and cmp2.txt differ
> 
>  After:
> 
>   $ perf script > cmp1.txt
>   $ perf script --itrace=cepwx > cmp2.txt
>   $ diff -sq cmp1.txt cmp2.txt
>   Files cmp1.txt and cmp2.txt are identical
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Fixes: 90e457f7be08 ("perf tools: Add Intel PT support")
> Cc: stable@vger.kernel.org # v4.20+
> ---
>  tools/perf/util/intel-pt.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
> index 551ae10d1c7b..7a70693c1b91 100644
> --- a/tools/perf/util/intel-pt.c
> +++ b/tools/perf/util/intel-pt.c
> @@ -2626,7 +2626,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
>  	} else {
>  		itrace_synth_opts__set_default(&pt->synth_opts,
>  				session->itrace_synth_opts->default_no_sample);
> -		if (use_browser != -1) {
> +		if (!session->itrace_synth_opts->default_no_sample &&
> +		    !session->itrace_synth_opts->inject) {
>  			pt->synth_opts.branches = false;
>  			pt->synth_opts.callchain = true;
>  		}
> -- 
> 2.17.1

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Fix itrace defaults for perf script
  2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
  2019-05-20 13:58   ` Sasha Levin
  2019-05-20 14:45   ` Arnaldo Carvalho de Melo
@ 2019-05-30  7:53   ` tip-bot for Adrian Hunter
  2 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-05-30  7:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, jolsa, mingo, tglx, acme, hpa, adrian.hunter

Commit-ID:  26f19c2eb7e54015564ff133b91983a74e84541b
Gitweb:     https://git.kernel.org/tip/26f19c2eb7e54015564ff133b91983a74e84541b
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:07 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 28 May 2019 18:37:42 -0300

perf intel-pt: Fix itrace defaults for perf script

Commit 4eb068157121 ("perf script: Make itrace script default to all
calls") does not work because 'use_browser' is being used to determine
whether to default to periodic sampling (i.e. better for perf report).
The result is that nothing but CBR events display for perf script when
no --itrace option is specified.

Fix by using 'default_no_sample' and 'inject' instead.

Example:

 Before:

  $ perf record -e intel_pt/cyc/u ls
  $ perf script > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt differ

 After:

  $ perf script > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt are identical

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.20+
Fixes: 90e457f7be08 ("perf tools: Add Intel PT support")
Link: http://lkml.kernel.org/r/20190520113728.14389-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 6d288237887b..03b1da6d1da4 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2588,7 +2588,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
 	} else {
 		itrace_synth_opts__set_default(&pt->synth_opts,
 				session->itrace_synth_opts->default_no_sample);
-		if (use_browser != -1) {
+		if (!session->itrace_synth_opts->default_no_sample &&
+		    !session->itrace_synth_opts->inject) {
 			pt->synth_opts.branches = false;
 			pt->synth_opts.callchain = true;
 		}

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf auxtrace: Fix itrace defaults for perf script
  2019-05-20 11:37 ` [PATCH 02/22] perf auxtrace: " Adrian Hunter
@ 2019-05-30  7:54   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-05-30  7:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, adrian.hunter, acme, hpa, mingo, tglx, jolsa

Commit-ID:  355200e0f6a9ce14771625014aa469f5ecbd8977
Gitweb:     https://git.kernel.org/tip/355200e0f6a9ce14771625014aa469f5ecbd8977
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:08 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 28 May 2019 18:37:42 -0300

perf auxtrace: Fix itrace defaults for perf script

Commit 4eb068157121 ("perf script: Make itrace script default to all
calls") does not work for the case when '--itrace' only is used, because
default_no_sample is not being passed.

Example:

 Before:

  $ perf record -e intel_pt/cyc/u ls
  $ perf script --itrace > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt differ

 After:

  $ perf script --itrace > cmp1.txt
  $ perf script --itrace=cepwx > cmp2.txt
  $ diff -sq cmp1.txt cmp2.txt
  Files cmp1.txt and cmp2.txt are identical

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 4eb068157121 ("perf script: Make itrace script default to all calls")
Link: http://lkml.kernel.org/r/20190520113728.14389-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/auxtrace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index fb76b6b232d4..5dd9d1893b89 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1010,7 +1010,8 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
 	}
 
 	if (!str) {
-		itrace_synth_opts__set_default(synth_opts, false);
+		itrace_synth_opts__set_default(synth_opts,
+					       synth_opts->default_no_sample);
 		return 0;
 	}
 

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Fix itrace defaults for perf script intel-pt documentation
  2019-05-20 11:37 ` [PATCH 03/22] perf intel-pt: Fix itrace defaults for perf script intel-pt documentation Adrian Hunter
@ 2019-05-30  7:54   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-05-30  7:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, linux-kernel, tglx, acme, jolsa, hpa, mingo

Commit-ID:  a2d8a1585e35444789c1c8cf7e2e51fb15589880
Gitweb:     https://git.kernel.org/tip/a2d8a1585e35444789c1c8cf7e2e51fb15589880
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:09 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 28 May 2019 18:37:42 -0300

perf intel-pt: Fix itrace defaults for perf script intel-pt documentation

Fix intel-pt documentation to reflect the change of itrace defaults for
perf script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 4eb068157121 ("perf script: Make itrace script default to all calls")
Link: http://lkml.kernel.org/r/20190520113728.14389-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/intel-pt.txt | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 115eaacc455f..60d99e5e7921 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -88,16 +88,16 @@ smaller.
 
 To represent software control flow, "branches" samples are produced.  By default
 a branch sample is synthesized for every single branch.  To get an idea what
-data is available you can use the 'perf script' tool with no parameters, which
-will list all the samples.
+data is available you can use the 'perf script' tool with all itrace sampling
+options, which will list all the samples.
 
 	perf record -e intel_pt//u ls
-	perf script
+	perf script --itrace=ibxwpe
 
 An interesting field that is not printed by default is 'flags' which can be
 displayed as follows:
 
-	perf script -Fcomm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,flags
+	perf script --itrace=ibxwpe -F+flags
 
 The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
 system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
@@ -713,7 +713,7 @@ Having no option is the same as
 
 which, in turn, is the same as
 
-	--itrace=ibxwpe
+	--itrace=cepwx
 
 The letters are:
 

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports
  2019-05-20 11:37 ` [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Adrian Hunter
@ 2019-05-31 16:44   ` Arnaldo Carvalho de Melo
  2019-06-03  5:50     ` Adrian Hunter
  2019-06-17 19:09   ` [tip:perf/core] " tip-bot for Adrian Hunter
  1 sibling, 1 reply; 52+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-31 16:44 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, linux-kernel

Em Mon, May 20, 2019 at 02:37:24PM +0300, Adrian Hunter escreveu:
> Enhance the "All branches" and "Selected branches" reports to display IPC
> information if it is available.

So, testing this I noticed that it all starts with the left arrow in every
line, that should mean there is some tree there, i.e. look at all those ▶
symbols:

Time              CPU Command         PID   TID   Branch Type  In Tx  Insn Cnt  Cyc Cnt  IPC  Branch
▶ 187836112195670 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
▶ 187836112195987 7   simple-retpolin 23003 23003 trace end    No     0         883      0    7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112199189 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
▶ 187836112199189 7   simple-retpolin 23003 23003 call         No     0         0        0    7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so)
▶ 187836112199544 7   simple-retpolin 23003 23003 trace end    No     17        996      0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112200939 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so)
▶ 187836112201229 7   simple-retpolin 23003 23003 trace end    No     1         816      0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112203500 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so)

But if you click on it, that ▶ disappears and a new click doesn't make it
reappear, looks buggy, but seems like a minor oddity that will not prevent me
from applying it now, please check and provide a fix on top of this,

Thanks,

- Arnaldo
 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  .../scripts/python/exported-sql-viewer.py     | 102 ++++++++++++++----
>  1 file changed, 83 insertions(+), 19 deletions(-)
> 
> diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
> index 6fe553258ce5..a607235c8cd9 100755
> --- a/tools/perf/scripts/python/exported-sql-viewer.py
> +++ b/tools/perf/scripts/python/exported-sql-viewer.py
> @@ -1369,11 +1369,11 @@ class FetchMoreRecordsBar():
>  
>  class BranchLevelTwoItem():
>  
> -	def __init__(self, row, text, parent_item):
> +	def __init__(self, row, col, text, parent_item):
>  		self.row = row
>  		self.parent_item = parent_item
> -		self.data = [""] * 8
> -		self.data[7] = text
> +		self.data = [""] * (col + 1)
> +		self.data[col] = text
>  		self.level = 2
>  
>  	def getParentItem(self):
> @@ -1405,6 +1405,7 @@ class BranchLevelOneItem():
>  		self.dbid = data[0]
>  		self.level = 1
>  		self.query_done = False
> +		self.br_col = len(self.data) - 1
>  
>  	def getChildItem(self, row):
>  		return self.child_items[row]
> @@ -1485,7 +1486,7 @@ class BranchLevelOneItem():
>  				while k < 15:
>  					byte_str += "   "
>  					k += 1
> -				self.child_items.append(BranchLevelTwoItem(0, byte_str + " " + text, self))
> +				self.child_items.append(BranchLevelTwoItem(0, self.br_col, byte_str + " " + text, self))
>  				self.child_count += 1
>  			else:
>  				return
> @@ -1536,16 +1537,37 @@ class BranchRootItem():
>  	def getData(self, column):
>  		return ""
>  
> +# Calculate instructions per cycle
> +
> +def CalcIPC(cyc_cnt, insn_cnt):
> +	if cyc_cnt and insn_cnt:
> +		ipc = Decimal(float(insn_cnt) / cyc_cnt)
> +		ipc = str(ipc.quantize(Decimal(".01"), rounding=ROUND_HALF_UP))
> +	else:
> +		ipc = "0"
> +	return ipc
> +
>  # Branch data preparation
>  
> -def BranchDataPrep(query):
> -	data = []
> -	for i in xrange(0, 8):
> -		data.append(query.value(i))
> +def BranchDataPrepBr(query, data):
>  	data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
>  			" (" + dsoname(query.value(11)) + ")" + " -> " +
>  			tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
>  			" (" + dsoname(query.value(15)) + ")")
> +
> +def BranchDataPrepIPC(query, data):
> +	insn_cnt = query.value(16)
> +	cyc_cnt = query.value(17)
> +	ipc = CalcIPC(cyc_cnt, insn_cnt)
> +	data.append(insn_cnt)
> +	data.append(cyc_cnt)
> +	data.append(ipc)
> +
> +def BranchDataPrep(query):
> +	data = []
> +	for i in xrange(0, 8):
> +		data.append(query.value(i))
> +	BranchDataPrepBr(query, data)
>  	return data
>  
>  def BranchDataPrepWA(query):
> @@ -1555,10 +1577,26 @@ def BranchDataPrepWA(query):
>  	data.append("{:>19}".format(query.value(1)))
>  	for i in xrange(2, 8):
>  		data.append(query.value(i))
> -	data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
> -			" (" + dsoname(query.value(11)) + ")" + " -> " +
> -			tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
> -			" (" + dsoname(query.value(15)) + ")")
> +	BranchDataPrepBr(query, data)
> +	return data
> +
> +def BranchDataWithIPCPrep(query):
> +	data = []
> +	for i in xrange(0, 8):
> +		data.append(query.value(i))
> +	BranchDataPrepIPC(query, data)
> +	BranchDataPrepBr(query, data)
> +	return data
> +
> +def BranchDataWithIPCPrepWA(query):
> +	data = []
> +	data.append(query.value(0))
> +	# Workaround pyside failing to handle large integers (i.e. time) in python3 by converting to a string
> +	data.append("{:>19}".format(query.value(1)))
> +	for i in xrange(2, 8):
> +		data.append(query.value(i))
> +	BranchDataPrepIPC(query, data)
> +	BranchDataPrepBr(query, data)
>  	return data
>  
>  # Branch data model
> @@ -1572,10 +1610,20 @@ class BranchModel(TreeModel):
>  		self.event_id = event_id
>  		self.more = True
>  		self.populated = 0
> +		self.have_ipc = IsSelectable(glb.db, "samples", columns = "insn_count, cyc_count")
> +		if self.have_ipc:
> +			select_ipc = ", insn_count, cyc_count"
> +			prep_fn = BranchDataWithIPCPrep
> +			prep_wa_fn = BranchDataWithIPCPrepWA
> +		else:
> +			select_ipc = ""
> +			prep_fn = BranchDataPrep
> +			prep_wa_fn = BranchDataPrepWA
>  		sql = ("SELECT samples.id, time, cpu, comm, pid, tid, branch_types.name,"
>  			" CASE WHEN in_tx = '0' THEN 'No' ELSE 'Yes' END,"
>  			" ip, symbols.name, sym_offset, dsos.short_name,"
>  			" to_ip, to_symbols.name, to_sym_offset, to_dsos.short_name"
> +			+ select_ipc +
>  			" FROM samples"
>  			" INNER JOIN comms ON comm_id = comms.id"
>  			" INNER JOIN threads ON thread_id = threads.id"
> @@ -1589,9 +1637,9 @@ class BranchModel(TreeModel):
>  			" ORDER BY samples.id"
>  			" LIMIT " + str(glb_chunk_sz))
>  		if pyside_version_1 and sys.version_info[0] == 3:
> -			prep = BranchDataPrepWA
> +			prep = prep_fn
>  		else:
> -			prep = BranchDataPrep
> +			prep = prep_wa_fn
>  		self.fetcher = SQLFetcher(glb, sql, prep, self.AddSample)
>  		self.fetcher.done.connect(self.Update)
>  		self.fetcher.Fetch(glb_chunk_sz)
> @@ -1600,13 +1648,23 @@ class BranchModel(TreeModel):
>  		return BranchRootItem()
>  
>  	def columnCount(self, parent=None):
> -		return 8
> +		if self.have_ipc:
> +			return 11
> +		else:
> +			return 8
>  
>  	def columnHeader(self, column):
> -		return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
> +		if self.have_ipc:
> +			return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Insn Cnt", "Cyc Cnt", "IPC", "Branch")[column]
> +		else:
> +			return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
>  
>  	def columnFont(self, column):
> -		if column != 7:
> +		if self.have_ipc:
> +			br_col = 10
> +		else:
> +			br_col = 7
> +		if column != br_col:
>  			return None
>  		return QFont("Monospace")
>  
> @@ -2114,10 +2172,10 @@ def GetEventList(db):
>  
>  # Is a table selectable
>  
> -def IsSelectable(db, table, sql = ""):
> +def IsSelectable(db, table, sql = "", columns = "*"):
>  	query = QSqlQuery(db)
>  	try:
> -		QueryExec(query, "SELECT * FROM " + table + " " + sql + " LIMIT 1")
> +		QueryExec(query, "SELECT " + columns + " FROM " + table + " " + sql + " LIMIT 1")
>  	except:
>  		return False
>  	return True
> @@ -2854,6 +2912,12 @@ cd xed
>  sudo ./mfile.py --prefix=/usr/local install
>  sudo ldconfig
>  </pre>
> +<h3>Instructions per Cycle (IPC)</h3>
> +If available, IPC information is displayed in columns 'insn_cnt', 'cyc_cnt' and 'IPC'.
> +<p><b>Intel PT note:</b> The information applies to the blocks of code ending with, and including, that branch.
> +Due to the granularity of timing information, the number of cycles for some code blocks will not be known.
> +In that case, 'insn_cnt', 'cyc_cnt' and 'IPC' are zero, but when 'IPC' is displayed it covers the period
> +since the previous displayed 'IPC'.
>  <h3>Find</h3>
>  Ctrl-F displays a Find bar which finds substrings by either an exact match or a regular expression match.
>  Refer to Python documentation for the regular expression syntax.
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script
  2019-05-20 14:45   ` Arnaldo Carvalho de Melo
@ 2019-05-31 16:45     ` Arnaldo Carvalho de Melo
  2019-06-04 11:32       ` Adrian Hunter
  0 siblings, 1 reply; 52+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-31 16:45 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Jiri Olsa, linux-kernel

Em Mon, May 20, 2019 at 11:45:16AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, May 20, 2019 at 02:37:07PM +0300, Adrian Hunter escreveu:
> > Commit 4eb068157121 ("perf script: Make itrace script default to all
> > calls") does not work because 'use_browser' is being used to determine
> > whether to default to periodic sampling (i.e. better for perf report).
> > The result is that nothing but CBR events display for perf script
> > when no --itrace option is specified.
> > 
> > Fix by using 'default_no_sample' and 'inject' instead.
> 
> Applied 1-3 for now, concentrating on fixes, will process 4-22 later.

Tested the ones that affect sqlite databases an the GUI, all look good,
added committer notes and text screenshots as committer notes, thanks!

- Arnaldo

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports
  2019-05-31 16:44   ` Arnaldo Carvalho de Melo
@ 2019-06-03  5:50     ` Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: Adrian Hunter @ 2019-06-03  5:50 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

On 31/05/19 7:44 PM, Arnaldo Carvalho de Melo wrote:
> Em Mon, May 20, 2019 at 02:37:24PM +0300, Adrian Hunter escreveu:
>> Enhance the "All branches" and "Selected branches" reports to display IPC
>> information if it is available.
> 
> So, testing this I noticed that it all starts with the left arrow in every
> line, that should mean there is some tree there, i.e. look at all those ▶
> symbols:
> 
> Time              CPU Command         PID   TID   Branch Type  In Tx  Insn Cnt  Cyc Cnt  IPC  Branch
> ▶ 187836112195670 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
> ▶ 187836112195987 7   simple-retpolin 23003 23003 trace end    No     0         883      0    7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown)
> ▶ 187836112199189 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
> ▶ 187836112199189 7   simple-retpolin 23003 23003 call         No     0         0        0    7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so)
> ▶ 187836112199544 7   simple-retpolin 23003 23003 trace end    No     17        996      0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown)
> ▶ 187836112200939 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so)
> ▶ 187836112201229 7   simple-retpolin 23003 23003 trace end    No     1         816      0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown)
> ▶ 187836112203500 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so)
> 
> But if you click on it, that ▶ disappears and a new click doesn't make it
> reappear, looks buggy, but seems like a minor oddity that will not prevent me
> from applying it now, please check and provide a fix on top of this,

The arrow is to display disssassembly, but only if xed is installed and the
object is in the buildid cache.  Unfortunately, it is not efficient to
determine if there is anything to expand before the user clicks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script
  2019-05-31 16:45     ` Arnaldo Carvalho de Melo
@ 2019-06-04 11:32       ` Adrian Hunter
  2019-06-04 13:20         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 52+ messages in thread
From: Adrian Hunter @ 2019-06-04 11:32 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: Jiri Olsa, linux-kernel

On 31/05/19 7:45 PM, Arnaldo Carvalho de Melo wrote:
> Em Mon, May 20, 2019 at 11:45:16AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, May 20, 2019 at 02:37:07PM +0300, Adrian Hunter escreveu:
>>> Commit 4eb068157121 ("perf script: Make itrace script default to all
>>> calls") does not work because 'use_browser' is being used to determine
>>> whether to default to periodic sampling (i.e. better for perf report).
>>> The result is that nothing but CBR events display for perf script
>>> when no --itrace option is specified.
>>>
>>> Fix by using 'default_no_sample' and 'inject' instead.
>>
>> Applied 1-3 for now, concentrating on fixes, will process 4-22 later.
> 
> Tested the ones that affect sqlite databases an the GUI, all look good,
> added committer notes and text screenshots as committer notes, thanks!

Thanks for applying!  Are patches 11-22 in your tree?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script
  2019-06-04 11:32       ` Adrian Hunter
@ 2019-06-04 13:20         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 52+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-06-04 13:20 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa, linux-kernel

Em Tue, Jun 04, 2019 at 02:32:15PM +0300, Adrian Hunter escreveu:
> On 31/05/19 7:45 PM, Arnaldo Carvalho de Melo wrote:
> > Em Mon, May 20, 2019 at 11:45:16AM -0300, Arnaldo Carvalho de Melo escreveu:
> >> Em Mon, May 20, 2019 at 02:37:07PM +0300, Adrian Hunter escreveu:
> >>> Commit 4eb068157121 ("perf script: Make itrace script default to all
> >>> calls") does not work because 'use_browser' is being used to determine
> >>> whether to default to periodic sampling (i.e. better for perf report).
> >>> The result is that nothing but CBR events display for perf script
> >>> when no --itrace option is specified.

> >>> Fix by using 'default_no_sample' and 'inject' instead.

> >> Applied 1-3 for now, concentrating on fixes, will process 4-22 later.

> > Tested the ones that affect sqlite databases an the GUI, all look good,
> > added committer notes and text screenshots as committer notes, thanks!
 
> Thanks for applying!  Are patches 11-22 in your tree?

Yeah, delayed a bit push out as I was doing some automated tests, see
below what is there right now, if there is something missing, please let
me know, I also added commiter notes to most of them as I tested to see
if the comments matched the code and end GUI results.

- Arnaldo

ed1108bd8107 (HEAD -> perf/core, quaco/perf/core, acme/perf/core) perf augmented_raw_syscalls: Tell which args are filenames and how many bytes to copy
095b72ea8bc6 perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated
08810ed07c90 perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree
96234e319dbe perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph
18d0b3fa8f07 perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams
a3f0fbba872e perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports
44bd2140868f perf scripts python: export-to-postgresql.py: Export IPC information
ea76f3ba4a25 perf scripts python: export-to-sqlite.py: Export IPC information
75302e535b23 perf db-export: Export IPC information
a99d75fa022c perf db-export: Add brief documentation
0f9eea5e6974 perf thread-stack: Accumulate IPC information
1e80115f84a3 perf intel-pt: Document IPC usage
990917373744 perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets
17fa778da83e perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip
dbb8dfd91a18 perf intel-pt: Record when decoding PSB+ packets
03386978a23b perf script: Add output of IPC ratio
f7c15a499e28 perf intel-pt: Add support for samples to contain IPC ratio
885ae6d43172 perf tools: Add IPC information to perf_sample
db3e60dd79a4 perf intel-pt: Accumulate cycle count from CYC packets
73a5a91c762d perf intel-pt: Factor out intel_pt_update_sample_time
3bbac0d75719 perf record: Allow mixing --user-regs with --call-graph=dwarf
7122f4bdb56d perf symbols: Remove unused variable 'err'
eea796aa2d6f perf data: Document directory format header: HEADER_DIR_FORMAT
cad909dc61fb perf data: Document clockid header: HEADER_CLOCKID
e1b3ce6a6898 perf data: Document memory topology header: HEADER_MEM_TOPOLOGY
8ab1ebfbb696 perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF
b33fb3cf6f5e Merge tag 'perf-core-for-mingo-5.3-20190529' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
14f1cfd4f7b4 (tag: perf-core-for-mingo-5.3-20190529) perf intel-pt: Rationalize intel_pt_sync_switch()'s use of next_tid
<SNIP>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Factor out intel_pt_update_sample_time
  2019-05-20 11:37 ` [PATCH 04/22] perf intel-pt: Factor out intel_pt_update_sample_time Adrian Hunter
@ 2019-06-17 18:59   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 18:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, linux-kernel, adrian.hunter, hpa, tglx, jolsa, acme

Commit-ID:  948e9dc8bb266649a618ac974010292bf36fb213
Gitweb:     https://git.kernel.org/tip/948e9dc8bb266649a618ac974010292bf36fb213
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:10 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:54 -0300

perf intel-pt: Factor out intel_pt_update_sample_time

To eliminate some duplication and make the code more understandable,
factor out intel_pt_update_sample_time.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index f4c3c84b090f..1ab4070b5633 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -479,6 +479,12 @@ static int intel_pt_bad_packet(struct intel_pt_decoder *decoder)
 	return -EBADMSG;
 }
 
+static inline void intel_pt_update_sample_time(struct intel_pt_decoder *decoder)
+{
+	decoder->sample_timestamp = decoder->timestamp;
+	decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+}
+
 static int intel_pt_get_data(struct intel_pt_decoder *decoder)
 {
 	struct intel_pt_buffer buffer = { .buf = 0, };
@@ -1319,8 +1325,7 @@ static int intel_pt_walk_tnt(struct intel_pt_decoder *decoder)
 			}
 			decoder->ip += intel_pt_insn.length;
 			if (!decoder->tnt.count) {
-				decoder->sample_timestamp = decoder->timestamp;
-				decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+				intel_pt_update_sample_time(decoder);
 				return -EAGAIN;
 			}
 			decoder->tnt.payload <<= 1;
@@ -2413,8 +2418,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 	if (err) {
 		decoder->state.err = intel_pt_ext_err(err);
 		decoder->state.from_ip = decoder->ip;
-		decoder->sample_timestamp = decoder->timestamp;
-		decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
+		intel_pt_update_sample_time(decoder);
 	} else {
 		decoder->state.err = 0;
 		if (decoder->cbr != decoder->cbr_seen && decoder->state.type) {
@@ -2422,10 +2426,8 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			decoder->state.type |= INTEL_PT_CBR_CHG;
 			decoder->state.cbr_payload = decoder->cbr_payload;
 		}
-		if (intel_pt_sample_time(decoder->pkt_state)) {
-			decoder->sample_timestamp = decoder->timestamp;
-			decoder->sample_insn_cnt = decoder->timestamp_insn_cnt;
-		}
+		if (intel_pt_sample_time(decoder->pkt_state))
+			intel_pt_update_sample_time(decoder);
 	}
 
 	decoder->state.timestamp = decoder->sample_timestamp;

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Accumulate cycle count from CYC packets
  2019-05-20 11:37 ` [PATCH 05/22] perf intel-pt: Accumulate cycle count from CYC packets Adrian Hunter
@ 2019-06-17 18:59   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 18:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, jolsa, hpa, linux-kernel, acme, tglx, mingo

Commit-ID:  7b4b4f83881e11b1fe5d8743953f81addb0871de
Gitweb:     https://git.kernel.org/tip/7b4b4f83881e11b1fe5d8743953f81addb0871de
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:11 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:54 -0300

perf intel-pt: Accumulate cycle count from CYC packets

In preparation for providing instructions-per-cycle (IPC) information,
accumulate cycle count from CYC packets.

Although CYC packets are optional (requires config term 'cyc' to enable
cycle-accurate mode when recording), the simplest way to count cycles is
with CYC packets.

The first complication is that cycles must be counted only when also
counting instructions.

That means when control flow packet generation is enabled i.e. between
TIP.PGE and TIP.PGD packets.

Also, sampling the cycle count follows the same rules as sampling the
timestamp, that is, not before the instruction to which the decoder is
walking is reached.

In addition, the cycle count is not accurate for any but the first
branch of a TNT packet.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 14 +++++++++++++-
 tools/perf/util/intel-pt-decoder/intel-pt-decoder.h |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 1ab4070b5633..ef3a1c1cd250 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -160,6 +160,8 @@ struct intel_pt_decoder {
 	uint64_t period_mask;
 	uint64_t period_ticks;
 	uint64_t last_masked_timestamp;
+	uint64_t tot_cyc_cnt;
+	uint64_t sample_tot_cyc_cnt;
 	bool continuous_period;
 	bool overflow;
 	bool set_fup_tx_flags;
@@ -167,6 +169,7 @@ struct intel_pt_decoder {
 	bool set_fup_mwait;
 	bool set_fup_pwre;
 	bool set_fup_exstop;
+	bool sample_cyc;
 	unsigned int fup_tx_flags;
 	unsigned int tx_flags;
 	uint64_t fup_ptw_payload;
@@ -1323,6 +1326,7 @@ static int intel_pt_walk_tnt(struct intel_pt_decoder *decoder)
 				decoder->ip += intel_pt_insn.length;
 				return 0;
 			}
+			decoder->sample_cyc = false;
 			decoder->ip += intel_pt_insn.length;
 			if (!decoder->tnt.count) {
 				intel_pt_update_sample_time(decoder);
@@ -1515,6 +1519,9 @@ static void intel_pt_calc_cyc_timestamp(struct intel_pt_decoder *decoder)
 	decoder->have_cyc = true;
 
 	decoder->cycle_cnt += decoder->packet.payload;
+	if (decoder->pge)
+		decoder->tot_cyc_cnt += decoder->packet.payload;
+	decoder->sample_cyc = true;
 
 	if (!decoder->cyc_ref_timestamp)
 		return;
@@ -2419,6 +2426,7 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 		decoder->state.err = intel_pt_ext_err(err);
 		decoder->state.from_ip = decoder->ip;
 		intel_pt_update_sample_time(decoder);
+		decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt;
 	} else {
 		decoder->state.err = 0;
 		if (decoder->cbr != decoder->cbr_seen && decoder->state.type) {
@@ -2426,14 +2434,18 @@ const struct intel_pt_state *intel_pt_decode(struct intel_pt_decoder *decoder)
 			decoder->state.type |= INTEL_PT_CBR_CHG;
 			decoder->state.cbr_payload = decoder->cbr_payload;
 		}
-		if (intel_pt_sample_time(decoder->pkt_state))
+		if (intel_pt_sample_time(decoder->pkt_state)) {
 			intel_pt_update_sample_time(decoder);
+			if (decoder->sample_cyc)
+				decoder->sample_tot_cyc_cnt = decoder->tot_cyc_cnt;
+		}
 	}
 
 	decoder->state.timestamp = decoder->sample_timestamp;
 	decoder->state.est_timestamp = intel_pt_est_timestamp(decoder);
 	decoder->state.cr3 = decoder->cr3;
 	decoder->state.tot_insn_cnt = decoder->tot_insn_cnt;
+	decoder->state.tot_cyc_cnt = decoder->sample_tot_cyc_cnt;
 
 	return &decoder->state;
 }
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
index ed088d4726ba..6a61773dc44b 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h
@@ -77,6 +77,7 @@ struct intel_pt_state {
 	uint64_t to_ip;
 	uint64_t cr3;
 	uint64_t tot_insn_cnt;
+	uint64_t tot_cyc_cnt;
 	uint64_t timestamp;
 	uint64_t est_timestamp;
 	uint64_t trace_nr;

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf tools: Add IPC information to perf_sample
  2019-05-20 11:37 ` [PATCH 06/22] perf tools: Add IPC information to perf_sample Adrian Hunter
@ 2019-06-17 19:00   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, jolsa, adrian.hunter, acme, linux-kernel, hpa, mingo

Commit-ID:  61d276f428a11f0e4ce5203462fa488e6570684f
Gitweb:     https://git.kernel.org/tip/61d276f428a11f0e4ce5203462fa488e6570684f
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:12 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:55 -0300

perf tools: Add IPC information to perf_sample

Add counts of instructions and cycles, in order to represent
instructions-per-cycle (IPC).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 9e999550f247..1f1da6082806 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -204,6 +204,8 @@ struct perf_sample {
 	u64 period;
 	u64 weight;
 	u64 transaction;
+	u64 insn_cnt;
+	u64 cyc_cnt;
 	u32 cpu;
 	u32 raw_size;
 	u64 data_src;

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Add support for samples to contain IPC ratio
  2019-05-20 11:37 ` [PATCH 07/22] perf intel-pt: Add support for samples to contain IPC ratio Adrian Hunter
@ 2019-06-17 19:01   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, acme, mingo, jolsa, hpa, tglx, adrian.hunter

Commit-ID:  5b1dc0fd1da06d6e89f1ca8736cfe0ee84e34cc7
Gitweb:     https://git.kernel.org/tip/5b1dc0fd1da06d6e89f1ca8736cfe0ee84e34cc7
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:13 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:56 -0300

perf intel-pt: Add support for samples to contain IPC ratio

Copy the incremental instruction count and cycle count onto 'instructions'
and 'branches' samples.

Because Intel PT does not update the cycle count on every branch or
instruction, the incremental values will often be zero.

When there are values, they will be the number of instructions and
number of cycles since the last update, and thus represent the average
IPC since the last IPC value.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/intel-pt.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 7a70693c1b91..3cff8fe2eaa0 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -157,6 +157,12 @@ struct intel_pt_queue {
 	u32 flags;
 	u16 insn_len;
 	u64 last_insn_cnt;
+	u64 ipc_insn_cnt;
+	u64 ipc_cyc_cnt;
+	u64 last_in_insn_cnt;
+	u64 last_in_cyc_cnt;
+	u64 last_br_insn_cnt;
+	u64 last_br_cyc_cnt;
 	char insn[INTEL_PT_INSN_BUF_SZ];
 };
 
@@ -1162,6 +1168,13 @@ static int intel_pt_synth_branch_sample(struct intel_pt_queue *ptq)
 		sample.branch_stack = (struct branch_stack *)&dummy_bs;
 	}
 
+	sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_br_cyc_cnt;
+	if (sample.cyc_cnt) {
+		sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_br_insn_cnt;
+		ptq->last_br_insn_cnt = ptq->ipc_insn_cnt;
+		ptq->last_br_cyc_cnt = ptq->ipc_cyc_cnt;
+	}
+
 	return intel_pt_deliver_synth_b_event(pt, event, &sample,
 					      pt->branches_sample_type);
 }
@@ -1217,6 +1230,13 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
 	sample.stream_id = ptq->pt->instructions_id;
 	sample.period = ptq->state->tot_insn_cnt - ptq->last_insn_cnt;
 
+	sample.cyc_cnt = ptq->ipc_cyc_cnt - ptq->last_in_cyc_cnt;
+	if (sample.cyc_cnt) {
+		sample.insn_cnt = ptq->ipc_insn_cnt - ptq->last_in_insn_cnt;
+		ptq->last_in_insn_cnt = ptq->ipc_insn_cnt;
+		ptq->last_in_cyc_cnt = ptq->ipc_cyc_cnt;
+	}
+
 	ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
 
 	return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
@@ -1488,6 +1508,15 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
 
 	ptq->have_sample = false;
 
+	if (ptq->state->tot_cyc_cnt > ptq->ipc_cyc_cnt) {
+		/*
+		 * Cycle count and instruction count only go together to create
+		 * a valid IPC ratio when the cycle count changes.
+		 */
+		ptq->ipc_insn_cnt = ptq->state->tot_insn_cnt;
+		ptq->ipc_cyc_cnt = ptq->state->tot_cyc_cnt;
+	}
+
 	if (pt->sample_pwr_events && (state->type & INTEL_PT_PWR_EVT)) {
 		if (state->type & INTEL_PT_CBR_CHG) {
 			err = intel_pt_synth_cbr_sample(ptq);

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf script: Add output of IPC ratio
  2019-05-20 11:37 ` [PATCH 08/22] perf script: Add output of " Adrian Hunter
@ 2019-06-17 19:01   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, adrian.hunter, mingo, hpa, jolsa, tglx, linux-kernel

Commit-ID:  68fb45bf175e702aec6668c776050e5dbd2a6f1f
Gitweb:     https://git.kernel.org/tip/68fb45bf175e702aec6668c776050e5dbd2a6f1f
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:14 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:56 -0300

perf script: Add output of IPC ratio

Add field 'ipc' to display instructions-per-cycle.

Example:

 perf record -e intel_pt/cyc/u ls
 perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid

 ls  2670177.697113434:  7f0dfdbcd090 _start+0x0      mov %rsp, %rdi   IPC: 0.00 (1/877)
 ls  2670177.697113434:  7f0dfdbcd093 _start+0x3      callq  0x7f0dfdbce030
 ls  2670177.697113434:  7f0dfdbce030 _dl_start+0x0   pushq  %rbp
 ls  2670177.697113434:  7f0dfdbce031 _dl_start+0x1   mov %rsp, %rbp
 ls  2670177.697113434:  7f0dfdbce034 _dl_start+0x4   pushq  %r15
 ls  2670177.697113434:  7f0dfdbce036 _dl_start+0x6   pushq  %r14
 ls  2670177.697113434:  7f0dfdbce038 _dl_start+0x8   pushq  %r13
 ls  2670177.697113434:  7f0dfdbce03a _dl_start+0xa   pushq  %r12
 ls  2670177.697113434:  7f0dfdbce03c _dl_start+0xc   mov %rdi, %r12
 ls  2670177.697113434:  7f0dfdbce03f _dl_start+0xf   pushq  %rbx
 ls  2670177.697113434:  7f0dfdbce040 _dl_start+0x10  sub $0x38, %rsp
 ls  2670177.697113434:  7f0dfdbce044 _dl_start+0x14  rdtsc
 ls  2670177.697113434:  7f0dfdbce046 _dl_start+0x16  mov %eax, %eax
 ls  2670177.697113434:  7f0dfdbce048 _dl_start+0x18  shl $0x20, %rdx
 ls  2670177.697113434:  7f0dfdbce04c _dl_start+0x1c  or %rax, %rdx
 ls  2670177.697114471:  7f0dfdbce04f _dl_start+0x1f  movq  0x27e22(%rip), %rax        IPC: 0.00 (15/1685)
 ls  2670177.697116177:  7f0dfdbce056 _dl_start+0x26  movq  %rdx, 0x27683(%rip)        IPC: 0.00 (1/881)

Note, the IPC values are low due to page faults at the beginning of
execution. The additional cycles are due to the time to enter the
kernel, not the actual kernel page fault handler.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  5 ++++-
 tools/perf/builtin-script.c              | 23 ++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index af8282782911..c59fd52e9e91 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-        brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode.
+        brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode, ipc.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
@@ -203,6 +203,9 @@ OPTIONS
 	The synth field is used by synthesized events which may be created when
 	Instruction Trace decoding.
 
+	The ipc (instructions per cycle) field is synthesized and may have a value when
+	Instruction Trace decoding.
+
 	Finally, a user may not set fields to none for all event types.
 	i.e., -F "" is not allowed.
 
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 3a48a2627670..80c722ade852 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -102,6 +102,7 @@ enum perf_output_field {
 	PERF_OUTPUT_METRIC	    = 1U << 28,
 	PERF_OUTPUT_MISC            = 1U << 29,
 	PERF_OUTPUT_SRCCODE	    = 1U << 30,
+	PERF_OUTPUT_IPC             = 1U << 31,
 };
 
 struct output_option {
@@ -139,6 +140,7 @@ struct output_option {
 	{.str = "metric", .field = PERF_OUTPUT_METRIC},
 	{.str = "misc", .field = PERF_OUTPUT_MISC},
 	{.str = "srccode", .field = PERF_OUTPUT_SRCCODE},
+	{.str = "ipc", .field = PERF_OUTPUT_IPC},
 };
 
 enum {
@@ -1268,6 +1270,20 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
 	return printed;
 }
 
+static int perf_sample__fprintf_ipc(struct perf_sample *sample,
+				    struct perf_event_attr *attr, FILE *fp)
+{
+	unsigned int ipc;
+
+	if (!PRINT_FIELD(IPC) || !sample->cyc_cnt || !sample->insn_cnt)
+		return 0;
+
+	ipc = (sample->insn_cnt * 100) / sample->cyc_cnt;
+
+	return fprintf(fp, " \t IPC: %u.%02u (%" PRIu64 "/%" PRIu64 ") ",
+		       ipc / 100, ipc % 100, sample->insn_cnt, sample->cyc_cnt);
+}
+
 static int perf_sample__fprintf_bts(struct perf_sample *sample,
 				    struct perf_evsel *evsel,
 				    struct thread *thread,
@@ -1312,6 +1328,8 @@ static int perf_sample__fprintf_bts(struct perf_sample *sample,
 		printed += perf_sample__fprintf_addr(sample, thread, attr, fp);
 	}
 
+	printed += perf_sample__fprintf_ipc(sample, attr, fp);
+
 	if (print_srcline_last)
 		printed += map__fprintf_srcline(al->map, al->addr, "\n  ", fp);
 
@@ -1859,6 +1877,9 @@ static void process_event(struct perf_script *script,
 
 	if (PRINT_FIELD(PHYS_ADDR))
 		fprintf(fp, "%16" PRIx64, sample->phys_addr);
+
+	perf_sample__fprintf_ipc(sample, attr, fp);
+
 	fprintf(fp, "\n");
 
 	if (PRINT_FIELD(SRCCODE)) {
@@ -3433,7 +3454,7 @@ int cmd_script(int argc, const char **argv)
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,srcline,period,iregs,uregs,brstack,"
 		     "brstacksym,flags,bpf-output,brstackinsn,brstackoff,"
-		     "callindent,insn,insnlen,synth,phys_addr,metric,misc",
+		     "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Record when decoding PSB+ packets
  2019-05-20 11:37 ` [PATCH 09/22] perf intel-pt: Record when decoding PSB+ packets Adrian Hunter
@ 2019-06-17 19:02   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, acme, mingo, hpa, adrian.hunter, jolsa, tglx

Commit-ID:  9bc668e3bca8fadc50d5a121a1992a72ada0d3f4
Gitweb:     https://git.kernel.org/tip/9bc668e3bca8fadc50d5a121a1992a72ada0d3f4
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:15 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:56 -0300

perf intel-pt: Record when decoding PSB+ packets

In preparation for using MTC packets to count cycles, record whether
decoding is between a PSB and PSBEND packets.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 41 ++++++++++++++++------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index ef3a1c1cd250..a2384a314990 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -116,6 +116,7 @@ struct intel_pt_decoder {
 	bool have_cyc;
 	bool fixup_last_mtc;
 	bool have_last_ip;
+	bool in_psb;
 	enum intel_pt_param_flags flags;
 	uint64_t pos;
 	uint64_t last_ip;
@@ -1549,14 +1550,17 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 {
 	int err;
 
+	decoder->in_psb = true;
+
 	while (1) {
 		err = intel_pt_get_next_packet(decoder);
 		if (err)
-			return err;
+			goto out;
 
 		switch (decoder->packet.type) {
 		case INTEL_PT_PSBEND:
-			return 0;
+			err = 0;
+			goto out;
 
 		case INTEL_PT_TIP_PGD:
 		case INTEL_PT_TIP_PGE:
@@ -1574,10 +1578,12 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 		case INTEL_PT_PWRX:
 			decoder->have_tma = false;
 			intel_pt_log("ERROR: Unexpected packet\n");
-			return -EAGAIN;
+			err = -EAGAIN;
+			goto out;
 
 		case INTEL_PT_OVF:
-			return intel_pt_overflow(decoder);
+			err = intel_pt_overflow(decoder);
+			goto out;
 
 		case INTEL_PT_TSC:
 			intel_pt_calc_tsc_timestamp(decoder);
@@ -1623,6 +1629,10 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder)
 			break;
 		}
 	}
+out:
+	decoder->in_psb = false;
+
+	return err;
 }
 
 static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
@@ -1996,10 +2006,12 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 {
 	int err;
 
+	decoder->in_psb = true;
+
 	while (1) {
 		err = intel_pt_get_next_packet(decoder);
 		if (err)
-			return err;
+			goto out;
 
 		switch (decoder->packet.type) {
 		case INTEL_PT_TIP_PGD:
@@ -2015,7 +2027,8 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 		case INTEL_PT_PWRE:
 		case INTEL_PT_PWRX:
 			intel_pt_log("ERROR: Unexpected packet\n");
-			return -ENOENT;
+			err = -ENOENT;
+			goto out;
 
 		case INTEL_PT_FUP:
 			decoder->pge = true;
@@ -2074,16 +2087,20 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 				decoder->pkt_state = INTEL_PT_STATE_ERR4;
 			else
 				decoder->pkt_state = INTEL_PT_STATE_ERR3;
-			return -ENOENT;
+			err = -ENOENT;
+			goto out;
 
 		case INTEL_PT_BAD: /* Does not happen */
-			return intel_pt_bug(decoder);
+			err = intel_pt_bug(decoder);
+			goto out;
 
 		case INTEL_PT_OVF:
-			return intel_pt_overflow(decoder);
+			err = intel_pt_overflow(decoder);
+			goto out;
 
 		case INTEL_PT_PSBEND:
-			return 0;
+			err = 0;
+			goto out;
 
 		case INTEL_PT_PSB:
 		case INTEL_PT_VMCS:
@@ -2093,6 +2110,10 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder)
 			break;
 		}
 	}
+out:
+	decoder->in_psb = false;
+
+	return err;
 }
 
 static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip
  2019-05-20 11:37 ` [PATCH 10/22] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip Adrian Hunter
@ 2019-06-17 19:03   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, hpa, linux-kernel, adrian.hunter, mingo, tglx, jolsa

Commit-ID:  f3c98c4b5ac831f29b1cc19fa84d3c8401f846d6
Gitweb:     https://git.kernel.org/tip/f3c98c4b5ac831f29b1cc19fa84d3c8401f846d6
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:16 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:56 -0300

perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip

To make it easier to add new code for different TIP cases, separate each
case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 23 ++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index a2384a314990..99773445872d 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -2128,18 +2128,29 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 		switch (decoder->packet.type) {
 		case INTEL_PT_TIP_PGD:
 			decoder->continuous_period = false;
-			__fallthrough;
+			decoder->pge = false;
+			if (intel_pt_have_ip(decoder))
+				intel_pt_set_ip(decoder);
+			if (!decoder->ip)
+				break;
+			decoder->state.type |= INTEL_PT_TRACE_END;
+			return 0;
+
 		case INTEL_PT_TIP_PGE:
+			decoder->pge = true;
+			if (intel_pt_have_ip(decoder))
+				intel_pt_set_ip(decoder);
+			if (!decoder->ip)
+				break;
+			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			return 0;
+
 		case INTEL_PT_TIP:
-			decoder->pge = decoder->packet.type != INTEL_PT_TIP_PGD;
+			decoder->pge = true;
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
 			if (!decoder->ip)
 				break;
-			if (decoder->packet.type == INTEL_PT_TIP_PGE)
-				decoder->state.type |= INTEL_PT_TRACE_BEGIN;
-			if (decoder->packet.type == INTEL_PT_TIP_PGD)
-				decoder->state.type |= INTEL_PT_TRACE_END;
 			return 0;
 
 		case INTEL_PT_FUP:

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets
  2019-05-20 11:37 ` [PATCH 11/22] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets Adrian Hunter
@ 2019-06-17 19:04   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, tglx, linux-kernel, hpa, adrian.hunter, jolsa, acme

Commit-ID:  3f05516758bef438cef7adc47599f8b8faad7c3a
Gitweb:     https://git.kernel.org/tip/3f05516758bef438cef7adc47599f8b8faad7c3a
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:17 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:56 -0300

perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets

When CYC packets are not available, it is still possible to count cycles
using TSC/TMA/MTC timestamps.

As the timestamp increments in TSC ticks, convert to CPU cycles using
the current core-to-bus ratio.

Do not accumulate cycles when control flow packet generation is not
enabled, nor when time has been "lost", typically due to mwait, which is
indicated by a TSC/TMA packet that is not part of PSB+.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  | 51 ++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 99773445872d..9eb778f9c911 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -163,6 +163,9 @@ struct intel_pt_decoder {
 	uint64_t last_masked_timestamp;
 	uint64_t tot_cyc_cnt;
 	uint64_t sample_tot_cyc_cnt;
+	uint64_t base_cyc_cnt;
+	uint64_t cyc_cnt_timestamp;
+	double tsc_to_cyc;
 	bool continuous_period;
 	bool overflow;
 	bool set_fup_tx_flags;
@@ -1423,6 +1426,42 @@ static int intel_pt_overflow(struct intel_pt_decoder *decoder)
 	return -EOVERFLOW;
 }
 
+static inline void intel_pt_mtc_cyc_cnt_pge(struct intel_pt_decoder *decoder)
+{
+	if (decoder->have_cyc)
+		return;
+
+	decoder->cyc_cnt_timestamp = decoder->timestamp;
+	decoder->base_cyc_cnt = decoder->tot_cyc_cnt;
+}
+
+static inline void intel_pt_mtc_cyc_cnt_cbr(struct intel_pt_decoder *decoder)
+{
+	decoder->tsc_to_cyc = decoder->cbr / decoder->max_non_turbo_ratio_fp;
+
+	if (decoder->pge)
+		intel_pt_mtc_cyc_cnt_pge(decoder);
+}
+
+static inline void intel_pt_mtc_cyc_cnt_upd(struct intel_pt_decoder *decoder)
+{
+	uint64_t tot_cyc_cnt, tsc_delta;
+
+	if (decoder->have_cyc)
+		return;
+
+	decoder->sample_cyc = true;
+
+	if (!decoder->pge || decoder->timestamp <= decoder->cyc_cnt_timestamp)
+		return;
+
+	tsc_delta = decoder->timestamp - decoder->cyc_cnt_timestamp;
+	tot_cyc_cnt = tsc_delta * decoder->tsc_to_cyc + decoder->base_cyc_cnt;
+
+	if (tot_cyc_cnt > decoder->tot_cyc_cnt)
+		decoder->tot_cyc_cnt = tot_cyc_cnt;
+}
+
 static void intel_pt_calc_tma(struct intel_pt_decoder *decoder)
 {
 	uint32_t ctc = decoder->packet.payload;
@@ -1432,6 +1471,11 @@ static void intel_pt_calc_tma(struct intel_pt_decoder *decoder)
 	if (!decoder->tsc_ctc_ratio_d)
 		return;
 
+	if (decoder->pge && !decoder->in_psb)
+		intel_pt_mtc_cyc_cnt_pge(decoder);
+	else
+		intel_pt_mtc_cyc_cnt_upd(decoder);
+
 	decoder->last_mtc = (ctc >> decoder->mtc_shift) & 0xff;
 	decoder->ctc_timestamp = decoder->tsc_timestamp - fc;
 	if (decoder->tsc_ctc_mult) {
@@ -1487,6 +1531,8 @@ static void intel_pt_calc_mtc_timestamp(struct intel_pt_decoder *decoder)
 	else
 		decoder->timestamp = timestamp;
 
+	intel_pt_mtc_cyc_cnt_upd(decoder);
+
 	decoder->timestamp_insn_cnt = 0;
 	decoder->last_mtc = mtc;
 
@@ -1511,6 +1557,8 @@ static void intel_pt_calc_cbr(struct intel_pt_decoder *decoder)
 
 	decoder->cbr = cbr;
 	decoder->cbr_cyc_to_tsc = decoder->max_non_turbo_ratio_fp / cbr;
+
+	intel_pt_mtc_cyc_cnt_cbr(decoder);
 }
 
 static void intel_pt_calc_cyc_timestamp(struct intel_pt_decoder *decoder)
@@ -1706,6 +1754,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 				decoder->state.to_ip = decoder->ip;
 			}
 			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			return 0;
 
 		case INTEL_PT_TIP:
@@ -1776,6 +1825,7 @@ next:
 
 		case INTEL_PT_TIP_PGE: {
 			decoder->pge = true;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			if (decoder->packet.count == 0) {
 				intel_pt_log_at("Skipping zero TIP.PGE",
 						decoder->pos);
@@ -2138,6 +2188,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGE:
 			decoder->pge = true;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
 			if (!decoder->ip)

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf intel-pt: Document IPC usage
  2019-05-20 11:37 ` [PATCH 12/22] perf intel-pt: Document IPC usage Adrian Hunter
@ 2019-06-17 19:04   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jolsa, mingo, linux-kernel, tglx, adrian.hunter, acme, hpa

Commit-ID:  5db47f43ccbbdee8c48f76ace4c287187a28b87f
Gitweb:     https://git.kernel.org/tip/5db47f43ccbbdee8c48f76ace4c287187a28b87f
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:18 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf intel-pt: Document IPC usage

Add brief documentation about instructions-per-cycle (IPC) information
derived from Intel PT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/intel-pt.txt | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index 60d99e5e7921..50c5b60101bd 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -103,6 +103,36 @@ The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
 system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
 in transaction, respectively.
 
+Another interesting field that is not printed by default is 'ipc' which can be
+displayed as follows:
+
+	perf script --itrace=be -F+ipc
+
+There are two ways that instructions-per-cycle (IPC) can be calculated depending
+on the recording.
+
+If the 'cyc' config term (see config terms section below) was used, then IPC is
+calculated using the cycle count from CYC packets, otherwise MTC packets are
+used - refer to the 'mtc' config term.  When MTC is used, however, the values
+are less accurate because the timing is less accurate.
+
+Because Intel PT does not update the cycle count on every branch or instruction,
+the values will often be zero.  When there are values, they will be the number
+of instructions and number of cycles since the last update, and thus represent
+the average IPC since the last IPC for that event type.  Note IPC for "branches"
+events is calculated separately from IPC for "instructions" events.
+
+Also note that the IPC instruction count may or may not include the current
+instruction.  If the cycle count is associated with an asynchronous branch
+(e.g. page fault or interrupt), then the instruction count does not include the
+current instruction, otherwise it does.  That is consistent with whether or not
+that instruction has retired when the cycle count is updated.
+
+Another note, in the case of "branches" events, non-taken branches are not
+presently sampled, so IPC values for them do not appear e.g. a CYC packet with a
+TNT packet that starts with a non-taken branch.  To see every possible IPC
+value, "instructions" events can be used e.g. --itrace=i0ns
+
 While it is possible to create scripts to analyze the data, an alternative
 approach is available to export the data to a sqlite or postgresql database.
 Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf thread-stack: Accumulate IPC information
  2019-05-20 11:37 ` [PATCH 13/22] perf thread-stack: Accumulate IPC information Adrian Hunter
@ 2019-06-17 19:05   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, linux-kernel, jolsa, tglx, adrian.hunter, acme, mingo

Commit-ID:  003ccdc7165accee073ce261fc670f64cc98d0f7
Gitweb:     https://git.kernel.org/tip/003ccdc7165accee073ce261fc670f64cc98d0f7
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:19 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf thread-stack: Accumulate IPC information

Cycle and instruction counts are added to the stack. The IPC of a
function and all functions it calls, is also recorded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/thread-stack.c | 14 ++++++++++++++
 tools/perf/util/thread-stack.h |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
index 41942c2aaa18..8e390f78486f 100644
--- a/tools/perf/util/thread-stack.c
+++ b/tools/perf/util/thread-stack.c
@@ -49,6 +49,8 @@ enum retpoline_state_t {
  * @timestamp: timestamp (if known)
  * @ref: external reference (e.g. db_id of sample)
  * @branch_count: the branch count when the entry was created
+ * @insn_count: the instruction count when the entry was created
+ * @cyc_count the cycle count when the entry was created
  * @db_id: id used for db-export
  * @cp: call path
  * @no_call: a 'call' was not seen
@@ -60,6 +62,8 @@ struct thread_stack_entry {
 	u64 timestamp;
 	u64 ref;
 	u64 branch_count;
+	u64 insn_count;
+	u64 cyc_count;
 	u64 db_id;
 	struct call_path *cp;
 	bool no_call;
@@ -75,6 +79,8 @@ struct thread_stack_entry {
  * @sz: current maximum stack size
  * @trace_nr: current trace number
  * @branch_count: running branch count
+ * @insn_count: running  instruction count
+ * @cyc_count running  cycle count
  * @kernel_start: kernel start address
  * @last_time: last timestamp
  * @crp: call/return processor
@@ -88,6 +94,8 @@ struct thread_stack {
 	size_t sz;
 	u64 trace_nr;
 	u64 branch_count;
+	u64 insn_count;
+	u64 cyc_count;
 	u64 kernel_start;
 	u64 last_time;
 	struct call_return_processor *crp;
@@ -289,6 +297,8 @@ static int thread_stack__call_return(struct thread *thread,
 	cr.call_time = tse->timestamp;
 	cr.return_time = timestamp;
 	cr.branch_count = ts->branch_count - tse->branch_count;
+	cr.insn_count = ts->insn_count - tse->insn_count;
+	cr.cyc_count = ts->cyc_count - tse->cyc_count;
 	cr.db_id = tse->db_id;
 	cr.call_ref = tse->ref;
 	cr.return_ref = ref;
@@ -544,6 +554,8 @@ static int thread_stack__push_cp(struct thread_stack *ts, u64 ret_addr,
 	tse->timestamp = timestamp;
 	tse->ref = ref;
 	tse->branch_count = ts->branch_count;
+	tse->insn_count = ts->insn_count;
+	tse->cyc_count = ts->cyc_count;
 	tse->cp = cp;
 	tse->no_call = no_call;
 	tse->trace_end = trace_end;
@@ -874,6 +886,8 @@ int thread_stack__process(struct thread *thread, struct comm *comm,
 	}
 
 	ts->branch_count += 1;
+	ts->insn_count += sample->insn_cnt;
+	ts->cyc_count += sample->cyc_cnt;
 	ts->last_time = sample->time;
 
 	if (sample->flags & PERF_IP_FLAG_CALL) {
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index 9c45f947f5a9..bddb1daf6453 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -52,6 +52,8 @@ enum {
  * @call_time: timestamp of call (if known)
  * @return_time: timestamp of return (if known)
  * @branch_count: number of branches seen between call and return
+ * @insn_count: approx. number of instructions between call and return
+ * @cyc_count: approx. number of cycles between call and return
  * @call_ref: external reference to 'call' sample (e.g. db_id)
  * @return_ref:  external reference to 'return' sample (e.g. db_id)
  * @db_id: id used for db-export
@@ -65,6 +67,8 @@ struct call_return {
 	u64 call_time;
 	u64 return_time;
 	u64 branch_count;
+	u64 insn_count;
+	u64 cyc_count;
 	u64 call_ref;
 	u64 return_ref;
 	u64 db_id;

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf db-export: Add brief documentation
  2019-05-20 11:37 ` [PATCH 14/22] perf db-export: Add brief documentation Adrian Hunter
@ 2019-06-17 19:06   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, hpa, tglx, adrian.hunter, jolsa, mingo

Commit-ID:  1159facee97fe184a434db3086604c7572fd7dfa
Gitweb:     https://git.kernel.org/tip/1159facee97fe184a434db3086604c7572fd7dfa
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:20 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf db-export: Add brief documentation

Add brief documentation to explain how the database export maintains
backward and forward compatibility.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/db-export.txt | 41 ++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/tools/perf/Documentation/db-export.txt b/tools/perf/Documentation/db-export.txt
new file mode 100644
index 000000000000..52ffccb02d55
--- /dev/null
+++ b/tools/perf/Documentation/db-export.txt
@@ -0,0 +1,41 @@
+Database Export
+===============
+
+perf tool's python scripting engine:
+
+	tools/perf/util/scripting-engines/trace-event-python.c
+
+supports scripts:
+
+	tools/perf/scripts/python/export-to-sqlite.py
+	tools/perf/scripts/python/export-to-postgresql.py
+
+which export data to a SQLite3 or PostgreSQL database.
+
+The export process provides records with unique sequential ids which allows the
+data to be imported directly to a database and provides the relationships
+between tables.
+
+Over time it is possible to continue to expand the export while maintaining
+backward and forward compatibility, by following some simple rules:
+
+1. Because of the nature of SQL, existing tables and columns can continue to be
+used so long as the names and meanings (and to some extent data types) remain
+the same.
+
+2. New tables and columns can be added, without affecting existing SQL queries,
+so long as the new names are unique.
+
+3. Scripts that use a database (e.g. exported-sql-viewer.py) can maintain
+backward compatibility by testing for the presence of new tables and columns
+before using them. e.g. function IsSelectable() in exported-sql-viewer.py
+
+4. The export scripts themselves maintain forward compatibility (i.e. an existing
+script will continue to work with new versions of perf) by accepting a variable
+number of arguments (e.g. def call_return_table(*x)) i.e. perf can pass more
+arguments which old scripts will ignore.
+
+5. The scripting engine tests for the existence of script handler functions
+before calling them.  The scripting engine can also test for the support of new
+or optional features by checking for the existence and value of script global
+variables.

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf db-export: Export IPC information
  2019-05-20 11:37 ` [PATCH 15/22] perf db-export: Export IPC information Adrian Hunter
@ 2019-06-17 19:07   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, mingo, adrian.hunter, hpa, jolsa, tglx, linux-kernel

Commit-ID:  52a2ab6fa99df9288f2c8c7f461b815550b9b366
Gitweb:     https://git.kernel.org/tip/52a2ab6fa99df9288f2c8c7f461b815550b9b366
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:21 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf db-export: Export IPC information

Export cycle and instruction counts on samples and call-returns.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/scripting-engines/trace-event-python.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 22f52b669871..6acb379b53ec 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -1111,7 +1111,7 @@ static int python_export_sample(struct db_export *dbe,
 	struct tables *tables = container_of(dbe, struct tables, dbe);
 	PyObject *t;
 
-	t = tuple_new(22);
+	t = tuple_new(24);
 
 	tuple_set_u64(t, 0, es->db_id);
 	tuple_set_u64(t, 1, es->evsel->db_id);
@@ -1135,6 +1135,8 @@ static int python_export_sample(struct db_export *dbe,
 	tuple_set_s32(t, 19, es->sample->flags & PERF_BRANCH_MASK);
 	tuple_set_s32(t, 20, !!(es->sample->flags & PERF_IP_FLAG_IN_TX));
 	tuple_set_u64(t, 21, es->call_path_id);
+	tuple_set_u64(t, 22, es->sample->insn_cnt);
+	tuple_set_u64(t, 23, es->sample->cyc_cnt);
 
 	call_object(tables->sample_handler, t, "sample_table");
 
@@ -1173,7 +1175,7 @@ static int python_export_call_return(struct db_export *dbe,
 	u64 comm_db_id = cr->comm ? cr->comm->db_id : 0;
 	PyObject *t;
 
-	t = tuple_new(12);
+	t = tuple_new(14);
 
 	tuple_set_u64(t, 0, cr->db_id);
 	tuple_set_u64(t, 1, cr->thread->db_id);
@@ -1187,6 +1189,8 @@ static int python_export_call_return(struct db_export *dbe,
 	tuple_set_u64(t, 9, cr->cp->parent->db_id);
 	tuple_set_s32(t, 10, cr->flags);
 	tuple_set_u64(t, 11, cr->parent_db_id);
+	tuple_set_u64(t, 12, cr->insn_count);
+	tuple_set_u64(t, 13, cr->cyc_count);
 
 	call_object(tables->call_return_handler, t, "call_return_table");
 

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: export-to-sqlite.py: Export IPC information
  2019-05-20 11:37 ` [PATCH 16/22] perf scripts python: export-to-sqlite.py: " Adrian Hunter
@ 2019-06-17 19:07   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, jolsa, linux-kernel, adrian.hunter, hpa, mingo, acme

Commit-ID:  64adadb3f9dbaaae3d14ea75fa71a3b877cbe82e
Gitweb:     https://git.kernel.org/tip/64adadb3f9dbaaae3d14ea75fa71a3b877cbe82e
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:22 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf scripts python: export-to-sqlite.py: Export IPC information

Export cycle and instruction counts on samples and calls tables.

Committer testing:

First runs some workload collecting intel_pt with the 'cyc' ter just for
userspace:

  [root@quaco adrian.hunter]# perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.035 MB simple-retpoline.perf.data ]
  [root@quaco adrian.hunter]#

Then use the export-to-sqlite.py script to see if the changes in this
cset don't make it to break and if the changes in the db schema are the
ones expected:

  [root@quaco adrian.hunter]# perf script -i simple-retpoline.perf.data --itrace=be -s ~acme/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
  2019-05-31 11:50:46.942710 Creating database ...
  2019-05-31 11:50:46.949663 Writing records...
  2019-05-31 11:50:47.224033 Adding indexes
  2019-05-31 11:50:47.231599 Done
  [root@quaco adrian.hunter]#

Now lets use the db:

  [root@quaco adrian.hunter]# sqlite3 simple-retpoline.db
  SQLite version 3.26.0 2018-12-01 12:34:55
  Enter ".help" for usage hints.
  sqlite> .schema samples
  CREATE TABLE samples (id integer NOT NULL PRIMARY KEY,evsel_id bigint,machine_id bigint,thread_id bigint,comm_id bigint,dso_id bigint,symbol_id bigint,sym_offset bigint,ip bigint,time bigint,cpuinteger,to_dso_id bigint,to_symbol_id bigint,to_sym_offset bigint,to_ip bigint,branch_type integer,in_tx boolean,call_path_id bigint,insn_count bigint,cyc_count bigint);
  sqlite>

Cool, the 'insn_count' and 'cyc_count' are there, now lets see if we can
use them in a query:

  sqlite> select insn_count,cyc_count from samples where cyc_count > 1500 and insn_count < 10;
  6|1507
  sqlite> select insn_count,cyc_count from samples where cyc_count > 1500;
  118|2210
  140|1516
  3783|1861
  132|1521
  6|1507
  sqlite>

Seems to work :-)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/export-to-sqlite.py | 36 ++++++++++++++++++---------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-sqlite.py b/tools/perf/scripts/python/export-to-sqlite.py
index f617e518332f..4542ce89034b 100644
--- a/tools/perf/scripts/python/export-to-sqlite.py
+++ b/tools/perf/scripts/python/export-to-sqlite.py
@@ -218,7 +218,9 @@ if branches:
 		'to_ip		bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 else:
 	do_query(query, 'CREATE TABLE samples ('
 		'id		integer		NOT NULL	PRIMARY KEY,'
@@ -242,7 +244,9 @@ else:
 		'data_src	bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 if perf_db_export_calls or perf_db_export_callchains:
 	do_query(query, 'CREATE TABLE call_paths ('
@@ -263,7 +267,9 @@ if perf_db_export_calls:
 		'return_id	bigint,'
 		'parent_call_path_id	bigint,'
 		'flags		integer,'
-		'parent_id	bigint)')
+		'parent_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 # printf was added to sqlite in version 3.8.3
 sqlite_has_printf = False
@@ -359,6 +365,9 @@ if perf_db_export_calls:
 			'return_time,'
 			'return_time - call_time AS elapsed_time,'
 			'branch_count,'
+			'insn_count,'
+			'cyc_count,'
+			'CASE WHEN cyc_count=0 THEN CAST(0 AS FLOAT) ELSE ROUND(CAST(insn_count AS FLOAT) / cyc_count, 2) END AS IPC,'
 			'call_id,'
 			'return_id,'
 			'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE flags END AS flags,'
@@ -384,7 +393,10 @@ do_query(query, 'CREATE VIEW samples_view AS '
 		'to_sym_offset,'
 		'(SELECT short_name FROM dsos WHERE id = to_dso_id) AS to_dso_short_name,'
 		'(SELECT name FROM branch_types WHERE id = branch_type) AS branch_type_name,'
-		'in_tx'
+		'in_tx,'
+		'insn_count,'
+		'cyc_count,'
+		'CASE WHEN cyc_count=0 THEN CAST(0 AS FLOAT) ELSE ROUND(CAST(insn_count AS FLOAT) / cyc_count, 2) END AS IPC'
 	' FROM samples')
 
 do_query(query, 'END TRANSACTION')
@@ -407,15 +419,15 @@ branch_type_query = QSqlQuery(db)
 branch_type_query.prepare("INSERT INTO branch_types VALUES (?, ?)")
 sample_query = QSqlQuery(db)
 if branches:
-	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
+	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
 else:
-	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
+	sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
 if perf_db_export_calls or perf_db_export_callchains:
 	call_path_query = QSqlQuery(db)
 	call_path_query.prepare("INSERT INTO call_paths VALUES (?, ?, ?, ?)")
 if perf_db_export_calls:
 	call_query = QSqlQuery(db)
-	call_query.prepare("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
+	call_query.prepare("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
 
 def trace_begin():
 	printdate("Writing records...")
@@ -427,10 +439,10 @@ def trace_begin():
 	comm_table(0, "unknown")
 	dso_table(0, 0, "unknown", "unknown", "")
 	symbol_table(0, 0, 0, 0, 0, "unknown")
-	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 	if perf_db_export_calls or perf_db_export_callchains:
 		call_path_table(0, 0, 0, 0)
-		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 
 unhandled_count = 0
 
@@ -486,14 +498,14 @@ def sample_table(*x):
 	if branches:
 		for xx in x[0:15]:
 			sample_query.addBindValue(str(xx))
-		for xx in x[19:22]:
+		for xx in x[19:24]:
 			sample_query.addBindValue(str(xx))
 		do_query_(sample_query)
 	else:
-		bind_exec(sample_query, 22, x)
+		bind_exec(sample_query, 24, x)
 
 def call_path_table(*x):
 	bind_exec(call_path_query, 4, x)
 
 def call_return_table(*x):
-	bind_exec(call_query, 12, x)
+	bind_exec(call_query, 14, x)

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: export-to-postgresql.py: Export IPC information
  2019-05-20 11:37 ` [PATCH 17/22] perf scripts python: export-to-postgresql.py: " Adrian Hunter
@ 2019-06-17 19:08   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, acme, mingo, hpa, tglx, adrian.hunter, jolsa

Commit-ID:  ec7f448e2b2e13d1629300c5881cb3b5e0a99c2f
Gitweb:     https://git.kernel.org/tip/ec7f448e2b2e13d1629300c5881cb3b5e0a99c2f
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:23 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf scripts python: export-to-postgresql.py: Export IPC information

Export cycle and instruction counts on samples and calls tables.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/export-to-postgresql.py | 36 +++++++++++++++--------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/tools/perf/scripts/python/export-to-postgresql.py b/tools/perf/scripts/python/export-to-postgresql.py
index b2f481b0d28d..93225c02117e 100644
--- a/tools/perf/scripts/python/export-to-postgresql.py
+++ b/tools/perf/scripts/python/export-to-postgresql.py
@@ -394,7 +394,9 @@ if branches:
 		'to_ip		bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 else:
 	do_query(query, 'CREATE TABLE samples ('
 		'id		bigint		NOT NULL,'
@@ -418,7 +420,9 @@ else:
 		'data_src	bigint,'
 		'branch_type	integer,'
 		'in_tx		boolean,'
-		'call_path_id	bigint)')
+		'call_path_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 if perf_db_export_calls or perf_db_export_callchains:
 	do_query(query, 'CREATE TABLE call_paths ('
@@ -439,7 +443,9 @@ if perf_db_export_calls:
 		'return_id	bigint,'
 		'parent_call_path_id	bigint,'
 		'flags		integer,'
-		'parent_id	bigint)')
+		'parent_id	bigint,'
+		'insn_count	bigint,'
+		'cyc_count	bigint)')
 
 do_query(query, 'CREATE VIEW machines_view AS '
 	'SELECT '
@@ -521,6 +527,9 @@ if perf_db_export_calls:
 			'return_time,'
 			'return_time - call_time AS elapsed_time,'
 			'branch_count,'
+			'insn_count,'
+			'cyc_count,'
+			'CASE WHEN cyc_count=0 THEN CAST(0 AS NUMERIC(20, 2)) ELSE CAST((CAST(insn_count AS FLOAT) / cyc_count) AS NUMERIC(20, 2)) END AS IPC,'
 			'call_id,'
 			'return_id,'
 			'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE CAST ( flags AS VARCHAR(6) ) END AS flags,'
@@ -546,7 +555,10 @@ do_query(query, 'CREATE VIEW samples_view AS '
 		'to_sym_offset,'
 		'(SELECT short_name FROM dsos WHERE id = to_dso_id) AS to_dso_short_name,'
 		'(SELECT name FROM branch_types WHERE id = branch_type) AS branch_type_name,'
-		'in_tx'
+		'in_tx,'
+		'insn_count,'
+		'cyc_count,'
+		'CASE WHEN cyc_count=0 THEN CAST(0 AS NUMERIC(20, 2)) ELSE CAST((CAST(insn_count AS FLOAT) / cyc_count) AS NUMERIC(20, 2)) END AS IPC'
 	' FROM samples')
 
 
@@ -618,10 +630,10 @@ def trace_begin():
 	comm_table(0, "unknown")
 	dso_table(0, 0, "unknown", "unknown", "")
 	symbol_table(0, 0, 0, 0, 0, "unknown")
-	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+	sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 	if perf_db_export_calls or perf_db_export_callchains:
 		call_path_table(0, 0, 0, 0)
-		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
+		call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 
 unhandled_count = 0
 
@@ -772,11 +784,11 @@ def branch_type_table(branch_type, name, *x):
 	value = struct.pack(fmt, 2, 4, branch_type, n, name)
 	branch_type_file.write(value)
 
-def sample_table(sample_id, evsel_id, machine_id, thread_id, comm_id, dso_id, symbol_id, sym_offset, ip, time, cpu, to_dso_id, to_symbol_id, to_sym_offset, to_ip, period, weight, transaction, data_src, branch_type, in_tx, call_path_id, *x):
+def sample_table(sample_id, evsel_id, machine_id, thread_id, comm_id, dso_id, symbol_id, sym_offset, ip, time, cpu, to_dso_id, to_symbol_id, to_sym_offset, to_ip, period, weight, transaction, data_src, branch_type, in_tx, call_path_id, insn_cnt, cyc_cnt, *x):
 	if branches:
-		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiiiBiq", 18, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 4, branch_type, 1, in_tx, 8, call_path_id)
+		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiiiBiqiqiq", 20, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 4, branch_type, 1, in_tx, 8, call_path_id, 8, insn_cnt, 8, cyc_cnt)
 	else:
-		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiqiqiqiqiiiBiq", 22, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 8, period, 8, weight, 8, transaction, 8, data_src, 4, branch_type, 1, in_tx, 8, call_path_id)
+		value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiqiqiqiqiiiBiqiqiq", 24, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 8, period, 8, weight, 8, transaction, 8, data_src, 4, branch_type, 1, in_tx, 8, call_path_id, 8, insn_cnt, 8, cyc_cnt)
 	sample_file.write(value)
 
 def call_path_table(cp_id, parent_id, symbol_id, ip, *x):
@@ -784,7 +796,7 @@ def call_path_table(cp_id, parent_id, symbol_id, ip, *x):
 	value = struct.pack(fmt, 4, 8, cp_id, 8, parent_id, 8, symbol_id, 8, ip)
 	call_path_file.write(value)
 
-def call_return_table(cr_id, thread_id, comm_id, call_path_id, call_time, return_time, branch_count, call_id, return_id, parent_call_path_id, flags, parent_id, *x):
-	fmt = "!hiqiqiqiqiqiqiqiqiqiqiiiq"
-	value = struct.pack(fmt, 12, 8, cr_id, 8, thread_id, 8, comm_id, 8, call_path_id, 8, call_time, 8, return_time, 8, branch_count, 8, call_id, 8, return_id, 8, parent_call_path_id, 4, flags, 8, parent_id)
+def call_return_table(cr_id, thread_id, comm_id, call_path_id, call_time, return_time, branch_count, call_id, return_id, parent_call_path_id, flags, parent_id, insn_cnt, cyc_cnt, *x):
+	fmt = "!hiqiqiqiqiqiqiqiqiqiqiiiqiqiq"
+	value = struct.pack(fmt, 14, 8, cr_id, 8, thread_id, 8, comm_id, 8, call_path_id, 8, call_time, 8, return_time, 8, branch_count, 8, call_id, 8, return_id, 8, parent_call_path_id, 4, flags, 8, parent_id, 8, insn_cnt, 8, cyc_cnt)
 	call_file.write(value)

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports
  2019-05-20 11:37 ` [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Adrian Hunter
  2019-05-31 16:44   ` Arnaldo Carvalho de Melo
@ 2019-06-17 19:09   ` tip-bot for Adrian Hunter
  1 sibling, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, mingo, tglx, jolsa, hpa, linux-kernel, acme

Commit-ID:  530e22fd5c6d2c572b1bbdda23eafa01a005fce0
Gitweb:     https://git.kernel.org/tip/530e22fd5c6d2c572b1bbdda23eafa01a005fce0
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:24 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports

Enhance the "All branches" and "Selected branches" reports to display IPC
information if it is available.

Committer testing:

So, testing this I noticed that it all starts with the left arrow in every
line, that should mean there is some tree there, i.e. look at all those ▶
symbols:

Reports -> All Branches:

Time              CPU Command         PID   TID   Branch Type  In Tx  Insn Cnt  Cyc Cnt  IPC  Branch
▶ 187836112195670 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110
+_start (ld-2.28.so)
▶ 187836112195987 7   simple-retpolin 23003 23003 trace end    No     0         883      0    7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown
+(unknown)
▶ 187836112199189 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4f110
+_start (ld-2.28.so)
▶ 187836112199189 7   simple-retpolin 23003 23003 call         No     0         0        0    7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50
+_dl_start (ld-2.28.so)
▶ 187836112199544 7   simple-retpolin 23003 23003 trace end    No     17        996      0.02 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0
+unknown (unknown)
▶ 187836112200939 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff73
+_dl_start+0x23 (ld-2.28.so)
▶ 187836112201229 7   simple-retpolin 23003 23003 trace end    No     1         816      0.00 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0
+unknown (unknown)
▶ 187836112203500 7   simple-retpolin 23003 23003 trace begin  No     0         0        0               0 unknown (unknown) -> 7f6f33d4ff7a
+_dl_start+0x2a (ld-2.28.so)

But if you click on it, that ▶ disappears and a new click doesn't make
it reappear, looks buggy, minor oddity, reported to Adrian.

Reports -> Selected Branches, then ask for branches in the ld-2.28.so
DSO:

Time               CPU  Command          PID    TID    Branch Type        In Tx  Insn Cnt  Cyc Cnt  IPC   Branch
▶ 187836112195987  7    simple-retpolin  23003  23003  trace end          No     0         883      0     7f6f33d4f110 _start (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112199189  7    simple-retpolin  23003  23003  trace begin        No     0         0        0                0 unknown (unknown) -> 7f6f33d4f110 _start (ld-2.28.so)
▶ 187836112199189  7    simple-retpolin  23003  23003  call               No     0         0        0     7f6f33d4f113 _start+0x3 (ld-2.28.so) -> 7f6f33d4ff50 _dl_start (ld-2.28.so)
▶ 187836112199544  7    simple-retpolin  23003  23003  trace end          No     17        996      0.02  7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112200939  7    simple-retpolin  23003  23003  trace begin        No     0         0        0                0 unknown (unknown) -> 7f6f33d4ff73 _dl_start+0x23 (ld-2.28.so)
▶ 187836112201229  7    simple-retpolin  23003  23003  trace end          No     1         816      0.00  7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so) -> 0 unknown (unknown)
▶ 187836112203500  7    simple-retpolin  23003  23003  trace begin        No     0         0        0                0 unknown (unknown) -> 7f6f33d4ff7a _dl_start+0x2a (ld-2.28.so)
▶ 187836112203528  7    simple-retpolin  23003  23003  unconditional jump No     0         0        0     7f6f33d4ffe7 _dl_start+0x97 (ld-2.28.so) -> 7f6f33d5000b _dl_start+0xbb (ld-2.28.so)
▶ 187836112203528  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
▶ 187836112203528  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
▶ 187836112203539  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d50025 _dl_start+0xd5 (ld-2.28.so) -> 7f6f33d50210 _dl_start+0x2c0 (ld-2.28.so)
▶ 187836112203539  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5021a _dl_start+0x2ca (ld-2.28.so) -> 7f6f33d50360 _dl_start+0x410 (ld-2.28.so)
▶ 187836112203539  7    simple-retpolin  23003  23003  unconditional jump No     0         0        0     7f6f33d50377 _dl_start+0x427 (ld-2.28.so) -> 7f6f33d4ffff _dl_start+0xaf (ld-2.28.so)
▶ 187836112203539  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)
▶ 187836112203562  7    simple-retpolin  23003  23003  conditional jump   No     0         0        0     7f6f33d5000f _dl_start+0xbf (ld-2.28.so) -> 7f6f33d4fffb _dl_start+0xab (ld-2.28.so)

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/exported-sql-viewer.py | 102 ++++++++++++++++++-----
 1 file changed, 83 insertions(+), 19 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index 6fe553258ce5..a607235c8cd9 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -1369,11 +1369,11 @@ class FetchMoreRecordsBar():
 
 class BranchLevelTwoItem():
 
-	def __init__(self, row, text, parent_item):
+	def __init__(self, row, col, text, parent_item):
 		self.row = row
 		self.parent_item = parent_item
-		self.data = [""] * 8
-		self.data[7] = text
+		self.data = [""] * (col + 1)
+		self.data[col] = text
 		self.level = 2
 
 	def getParentItem(self):
@@ -1405,6 +1405,7 @@ class BranchLevelOneItem():
 		self.dbid = data[0]
 		self.level = 1
 		self.query_done = False
+		self.br_col = len(self.data) - 1
 
 	def getChildItem(self, row):
 		return self.child_items[row]
@@ -1485,7 +1486,7 @@ class BranchLevelOneItem():
 				while k < 15:
 					byte_str += "   "
 					k += 1
-				self.child_items.append(BranchLevelTwoItem(0, byte_str + " " + text, self))
+				self.child_items.append(BranchLevelTwoItem(0, self.br_col, byte_str + " " + text, self))
 				self.child_count += 1
 			else:
 				return
@@ -1536,16 +1537,37 @@ class BranchRootItem():
 	def getData(self, column):
 		return ""
 
+# Calculate instructions per cycle
+
+def CalcIPC(cyc_cnt, insn_cnt):
+	if cyc_cnt and insn_cnt:
+		ipc = Decimal(float(insn_cnt) / cyc_cnt)
+		ipc = str(ipc.quantize(Decimal(".01"), rounding=ROUND_HALF_UP))
+	else:
+		ipc = "0"
+	return ipc
+
 # Branch data preparation
 
-def BranchDataPrep(query):
-	data = []
-	for i in xrange(0, 8):
-		data.append(query.value(i))
+def BranchDataPrepBr(query, data):
 	data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
 			" (" + dsoname(query.value(11)) + ")" + " -> " +
 			tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
 			" (" + dsoname(query.value(15)) + ")")
+
+def BranchDataPrepIPC(query, data):
+	insn_cnt = query.value(16)
+	cyc_cnt = query.value(17)
+	ipc = CalcIPC(cyc_cnt, insn_cnt)
+	data.append(insn_cnt)
+	data.append(cyc_cnt)
+	data.append(ipc)
+
+def BranchDataPrep(query):
+	data = []
+	for i in xrange(0, 8):
+		data.append(query.value(i))
+	BranchDataPrepBr(query, data)
 	return data
 
 def BranchDataPrepWA(query):
@@ -1555,10 +1577,26 @@ def BranchDataPrepWA(query):
 	data.append("{:>19}".format(query.value(1)))
 	for i in xrange(2, 8):
 		data.append(query.value(i))
-	data.append(tohex(query.value(8)).rjust(16) + " " + query.value(9) + offstr(query.value(10)) +
-			" (" + dsoname(query.value(11)) + ")" + " -> " +
-			tohex(query.value(12)) + " " + query.value(13) + offstr(query.value(14)) +
-			" (" + dsoname(query.value(15)) + ")")
+	BranchDataPrepBr(query, data)
+	return data
+
+def BranchDataWithIPCPrep(query):
+	data = []
+	for i in xrange(0, 8):
+		data.append(query.value(i))
+	BranchDataPrepIPC(query, data)
+	BranchDataPrepBr(query, data)
+	return data
+
+def BranchDataWithIPCPrepWA(query):
+	data = []
+	data.append(query.value(0))
+	# Workaround pyside failing to handle large integers (i.e. time) in python3 by converting to a string
+	data.append("{:>19}".format(query.value(1)))
+	for i in xrange(2, 8):
+		data.append(query.value(i))
+	BranchDataPrepIPC(query, data)
+	BranchDataPrepBr(query, data)
 	return data
 
 # Branch data model
@@ -1572,10 +1610,20 @@ class BranchModel(TreeModel):
 		self.event_id = event_id
 		self.more = True
 		self.populated = 0
+		self.have_ipc = IsSelectable(glb.db, "samples", columns = "insn_count, cyc_count")
+		if self.have_ipc:
+			select_ipc = ", insn_count, cyc_count"
+			prep_fn = BranchDataWithIPCPrep
+			prep_wa_fn = BranchDataWithIPCPrepWA
+		else:
+			select_ipc = ""
+			prep_fn = BranchDataPrep
+			prep_wa_fn = BranchDataPrepWA
 		sql = ("SELECT samples.id, time, cpu, comm, pid, tid, branch_types.name,"
 			" CASE WHEN in_tx = '0' THEN 'No' ELSE 'Yes' END,"
 			" ip, symbols.name, sym_offset, dsos.short_name,"
 			" to_ip, to_symbols.name, to_sym_offset, to_dsos.short_name"
+			+ select_ipc +
 			" FROM samples"
 			" INNER JOIN comms ON comm_id = comms.id"
 			" INNER JOIN threads ON thread_id = threads.id"
@@ -1589,9 +1637,9 @@ class BranchModel(TreeModel):
 			" ORDER BY samples.id"
 			" LIMIT " + str(glb_chunk_sz))
 		if pyside_version_1 and sys.version_info[0] == 3:
-			prep = BranchDataPrepWA
+			prep = prep_fn
 		else:
-			prep = BranchDataPrep
+			prep = prep_wa_fn
 		self.fetcher = SQLFetcher(glb, sql, prep, self.AddSample)
 		self.fetcher.done.connect(self.Update)
 		self.fetcher.Fetch(glb_chunk_sz)
@@ -1600,13 +1648,23 @@ class BranchModel(TreeModel):
 		return BranchRootItem()
 
 	def columnCount(self, parent=None):
-		return 8
+		if self.have_ipc:
+			return 11
+		else:
+			return 8
 
 	def columnHeader(self, column):
-		return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
+		if self.have_ipc:
+			return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Insn Cnt", "Cyc Cnt", "IPC", "Branch")[column]
+		else:
+			return ("Time", "CPU", "Command", "PID", "TID", "Branch Type", "In Tx", "Branch")[column]
 
 	def columnFont(self, column):
-		if column != 7:
+		if self.have_ipc:
+			br_col = 10
+		else:
+			br_col = 7
+		if column != br_col:
 			return None
 		return QFont("Monospace")
 
@@ -2114,10 +2172,10 @@ def GetEventList(db):
 
 # Is a table selectable
 
-def IsSelectable(db, table, sql = ""):
+def IsSelectable(db, table, sql = "", columns = "*"):
 	query = QSqlQuery(db)
 	try:
-		QueryExec(query, "SELECT * FROM " + table + " " + sql + " LIMIT 1")
+		QueryExec(query, "SELECT " + columns + " FROM " + table + " " + sql + " LIMIT 1")
 	except:
 		return False
 	return True
@@ -2854,6 +2912,12 @@ cd xed
 sudo ./mfile.py --prefix=/usr/local install
 sudo ldconfig
 </pre>
+<h3>Instructions per Cycle (IPC)</h3>
+If available, IPC information is displayed in columns 'insn_cnt', 'cyc_cnt' and 'IPC'.
+<p><b>Intel PT note:</b> The information applies to the blocks of code ending with, and including, that branch.
+Due to the granularity of timing information, the number of cycles for some code blocks will not be known.
+In that case, 'insn_cnt', 'cyc_cnt' and 'IPC' are zero, but when 'IPC' is displayed it covers the period
+since the previous displayed 'IPC'.
 <h3>Find</h3>
 Ctrl-F displays a Find bar which finds substrings by either an exact match or a regular expression match.
 Refer to Python documentation for the regular expression syntax.

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams
  2019-05-20 11:37 ` [PATCH 19/22] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams Adrian Hunter
@ 2019-06-17 19:09   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, tglx, linux-kernel, jolsa, hpa, acme, adrian.hunter

Commit-ID:  4a0979d4b4feee67a7f9a5605b5bfae3b0a2b6a9
Gitweb:     https://git.kernel.org/tip/4a0979d4b4feee67a7f9a5605b5bfae3b0a2b6a9
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:25 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams

Add a parameter to call graph and call tree, to determine whether IPC
information is available.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/exported-sql-viewer.py | 73 +++++++++++++-----------
 1 file changed, 41 insertions(+), 32 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index a607235c8cd9..b3508bd4eb00 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -200,9 +200,10 @@ class Thread(QThread):
 
 class TreeModel(QAbstractItemModel):
 
-	def __init__(self, glb, parent=None):
+	def __init__(self, glb, params, parent=None):
 		super(TreeModel, self).__init__(parent)
 		self.glb = glb
+		self.params = params
 		self.root = self.GetRoot()
 		self.last_row_read = 0
 
@@ -463,8 +464,9 @@ class FindBar():
 
 class CallGraphLevelItemBase(object):
 
-	def __init__(self, glb, row, parent_item):
+	def __init__(self, glb, params, row, parent_item):
 		self.glb = glb
+		self.params = params
 		self.row = row
 		self.parent_item = parent_item
 		self.query_done = False;
@@ -503,8 +505,8 @@ class CallGraphLevelItemBase(object):
 
 class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
-		super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
+		super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.call_path_id = call_path_id
@@ -525,7 +527,7 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" GROUP BY call_path_id, name, short_name"
 					" ORDER BY call_path_id")
 		while query.next():
-			child_item = CallGraphLevelThreeItem(self.glb, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			child_item = CallGraphLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -533,8 +535,8 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
-		super(CallGraphLevelThreeItem, self).__init__(glb, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
+		super(CallGraphLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
 		dso = dsoname(dso)
 		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = call_path_id
@@ -543,8 +545,8 @@ class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 
 class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallGraphLevelTwoItem, self).__init__(glb, row, comm_id, thread_id, 1, 0, 0, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
+		super(CallGraphLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 1, 0, 0, parent_item)
 		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
@@ -561,8 +563,8 @@ class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
 
 class CallGraphLevelOneItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, comm, parent_item):
-		super(CallGraphLevelOneItem, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, comm, parent_item):
+		super(CallGraphLevelOneItem, self).__init__(glb, params, row, parent_item)
 		self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
@@ -574,7 +576,7 @@ class CallGraphLevelOneItem(CallGraphLevelItemBase):
 					" INNER JOIN threads ON thread_id = threads.id"
 					" WHERE comm_id = " + str(self.dbid))
 		while query.next():
-			child_item = CallGraphLevelTwoItem(self.glb, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
+			child_item = CallGraphLevelTwoItem(self.glb, self.params, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -582,8 +584,8 @@ class CallGraphLevelOneItem(CallGraphLevelItemBase):
 
 class CallGraphRootItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb):
-		super(CallGraphRootItem, self).__init__(glb, 0, None)
+	def __init__(self, glb, params):
+		super(CallGraphRootItem, self).__init__(glb, params, 0, None)
 		self.dbid = 0
 		self.query_done = True;
 		query = QSqlQuery(glb.db)
@@ -591,16 +593,23 @@ class CallGraphRootItem(CallGraphLevelItemBase):
 		while query.next():
 			if not query.value(0):
 				continue
-			child_item = CallGraphLevelOneItem(glb, self.child_count, query.value(0), query.value(1), self)
+			child_item = CallGraphLevelOneItem(glb, params, self.child_count, query.value(0), query.value(1), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
+# Call graph model parameters
+
+class CallGraphModelParams():
+
+	def __init__(self, glb, parent=None):
+		self.have_ipc = IsSelectable(glb.db, "calls", columns = "insn_count, cyc_count")
+
 # Context-sensitive call graph data model base
 
 class CallGraphModelBase(TreeModel):
 
 	def __init__(self, glb, parent=None):
-		super(CallGraphModelBase, self).__init__(glb, parent)
+		super(CallGraphModelBase, self).__init__(glb, CallGraphModelParams(glb), parent)
 
 	def FindSelect(self, value, pattern, query):
 		if pattern:
@@ -682,7 +691,7 @@ class CallGraphModel(CallGraphModelBase):
 		super(CallGraphModel, self).__init__(glb, parent)
 
 	def GetRoot(self):
-		return CallGraphRootItem(self.glb)
+		return CallGraphRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
 		return 7
@@ -729,8 +738,8 @@ class CallGraphModel(CallGraphModelBase):
 
 class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, calls_id, time, branch_count, parent_item):
-		super(CallTreeLevelTwoPlusItemBase, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item):
+		super(CallTreeLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.calls_id = calls_id
@@ -752,7 +761,7 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" WHERE calls.parent_id = " + str(self.calls_id) + comm_thread +
 					" ORDER BY call_time, calls.id")
 		while query.next():
-			child_item = CallTreeLevelThreeItem(self.glb, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			child_item = CallTreeLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -760,8 +769,8 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, calls_id, name, dso, count, time, branch_count, parent_item):
-		super(CallTreeLevelThreeItem, self).__init__(glb, row, comm_id, thread_id, calls_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, name, dso, count, time, branch_count, parent_item):
+		super(CallTreeLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item)
 		dso = dsoname(dso)
 		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = calls_id
@@ -770,8 +779,8 @@ class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 
 class CallTreeLevelTwoItem(CallTreeLevelTwoPlusItemBase):
 
-	def __init__(self, glb, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallTreeLevelTwoItem, self).__init__(glb, row, comm_id, thread_id, 0, 0, 0, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
+		super(CallTreeLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 0, 0, 0, parent_item)
 		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
@@ -788,8 +797,8 @@ class CallTreeLevelTwoItem(CallTreeLevelTwoPlusItemBase):
 
 class CallTreeLevelOneItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb, row, comm_id, comm, parent_item):
-		super(CallTreeLevelOneItem, self).__init__(glb, row, parent_item)
+	def __init__(self, glb, params, row, comm_id, comm, parent_item):
+		super(CallTreeLevelOneItem, self).__init__(glb, params, row, parent_item)
 		self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
@@ -801,7 +810,7 @@ class CallTreeLevelOneItem(CallGraphLevelItemBase):
 					" INNER JOIN threads ON thread_id = threads.id"
 					" WHERE comm_id = " + str(self.dbid))
 		while query.next():
-			child_item = CallTreeLevelTwoItem(self.glb, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
+			child_item = CallTreeLevelTwoItem(self.glb, self.params, self.child_count, self.dbid, query.value(0), query.value(1), query.value(2), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -809,8 +818,8 @@ class CallTreeLevelOneItem(CallGraphLevelItemBase):
 
 class CallTreeRootItem(CallGraphLevelItemBase):
 
-	def __init__(self, glb):
-		super(CallTreeRootItem, self).__init__(glb, 0, None)
+	def __init__(self, glb, params):
+		super(CallTreeRootItem, self).__init__(glb, params, 0, None)
 		self.dbid = 0
 		self.query_done = True;
 		query = QSqlQuery(glb.db)
@@ -818,7 +827,7 @@ class CallTreeRootItem(CallGraphLevelItemBase):
 		while query.next():
 			if not query.value(0):
 				continue
-			child_item = CallTreeLevelOneItem(glb, self.child_count, query.value(0), query.value(1), self)
+			child_item = CallTreeLevelOneItem(glb, params, self.child_count, query.value(0), query.value(1), self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -830,7 +839,7 @@ class CallTreeModel(CallGraphModelBase):
 		super(CallTreeModel, self).__init__(glb, parent)
 
 	def GetRoot(self):
-		return CallTreeRootItem(self.glb)
+		return CallTreeRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
 		return 7
@@ -1606,7 +1615,7 @@ class BranchModel(TreeModel):
 	progress = Signal(object)
 
 	def __init__(self, glb, event_id, where_clause, parent=None):
-		super(BranchModel, self).__init__(glb, parent)
+		super(BranchModel, self).__init__(glb, None, parent)
 		self.event_id = event_id
 		self.more = True
 		self.populated = 0

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph
  2019-05-20 11:37 ` [PATCH 20/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph Adrian Hunter
@ 2019-06-17 19:10   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, adrian.hunter, mingo, jolsa, hpa, tglx

Commit-ID:  38a846d47f3d2fe6783e2df7bc5c2415239e6a63
Gitweb:     https://git.kernel.org/tip/38a846d47f3d2fe6783e2df7bc5c2415239e6a63
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:26 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph

Enhance the call graph to display IPC information if it is available.

Committer testing:

[acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db

Reports -> Context Sensitive Callgraph, then expand a few trees, then
select with the mouse and press control+C:

Call Path                     Object          Count Time(ns) Time(%) Insn Insn   Cyc   Cyc    IPC Branch Branch
▼ simple-retpolin                                                    Cnt  Cnt(%) Cnt   Cnt(%)     Cnt    Cnt(%)
  ▼ 23003:23003
    ▼ _start                  ld-2.28.so         1 218295   100.0  127746 100.0 207320 100.0 0.62 13046  100.0
      ▶ unknown               unknown            1   3202     1.5       0   0.0      0   0.0    0     1    0.0
      ▶ _dl_start             ld-2.28.so         1 188471    86.3  123394  96.6 180007  86.8 0.69 12529   96.0
      ▶ _dl_init              ld-2.28.so         1  13406     6.1    3207   2.5  14868   7.2 0.22   327    2.5
      ▼ _start                simple-retpoline   1  12899     5.9    1142   0.9  11561   5.6 0.10   184    1.4
        ▶ unknown             unknown            1    846     6.6       0   0.0      0   0.0    0     1    0.5
        ▼ __libc_start_main   libc-2.28.so       1  11621    90.1    1129  98.9  10350  89.5 0.11   181   98.4
          ▶ __cxa_atexit      libc-2.28.so       1   2302    19.8     101   8.9   1817  17.6 0.06    13    7.2
          ▶ __libc_csu_init   simple-retpoline   1    121     1.0      43   3.8    340   3.3 0.13     8    4.4
          ▼ _setjmp           libc-2.28.so       1     74     0.6      46   4.1    206   2.0 0.22     4    2.2
            ▼ __sigsetjmp     libc-2.28.so       1     74   100.0      46 100.0    206 100.0 0.22     3   75.0
              ▶ __sigjmp_save libc-2.28.so       1      0     0.0       0   0.0      0   0.0    0     1   33.3
          ▼ main              simple-retpoline   1     44     0.4      23   2.0    126   1.2 0.18    12    6.6
            ▼ foo             simple-retpoline   2     44   100.0      23 100.0    126 100.0 0.18    10   83.3
                bar           simple-retpoline   2     22    50.0       6  26.1     61  48.4 0.10     2   20.0
          ▶ exit              libc-2.28.so       1   9029    77.7     878  77.8   7765  75.0 0.11   139   76.8

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/exported-sql-viewer.py | 69 +++++++++++++++++++-----
 1 file changed, 56 insertions(+), 13 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index b3508bd4eb00..f5b1b63995b0 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -505,18 +505,24 @@ class CallGraphLevelItemBase(object):
 
 class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item):
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, time, insn_cnt, cyc_cnt, branch_count, parent_item):
 		super(CallGraphLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.call_path_id = call_path_id
+		self.insn_cnt = insn_cnt
+		self.cyc_cnt = cyc_cnt
 		self.branch_count = branch_count
 		self.time = time
 
 	def Select(self):
 		self.query_done = True;
 		query = QSqlQuery(self.glb.db)
-		QueryExec(query, "SELECT call_path_id, name, short_name, COUNT(calls.id), SUM(return_time - call_time), SUM(branch_count)"
+		if self.params.have_ipc:
+			ipc_str = ", SUM(insn_count), SUM(cyc_count)"
+		else:
+			ipc_str = ""
+		QueryExec(query, "SELECT call_path_id, name, short_name, COUNT(calls.id), SUM(return_time - call_time)" + ipc_str + ", SUM(branch_count)"
 					" FROM calls"
 					" INNER JOIN call_paths ON calls.call_path_id = call_paths.id"
 					" INNER JOIN symbols ON call_paths.symbol_id = symbols.id"
@@ -527,7 +533,15 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" GROUP BY call_path_id, name, short_name"
 					" ORDER BY call_path_id")
 		while query.next():
-			child_item = CallGraphLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			if self.params.have_ipc:
+				insn_cnt = int(query.value(5))
+				cyc_cnt = int(query.value(6))
+				branch_count = int(query.value(7))
+			else:
+				insn_cnt = 0
+				cyc_cnt = 0
+				branch_count = int(query.value(5))
+			child_item = CallGraphLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), insn_cnt, cyc_cnt, branch_count, self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -535,10 +549,17 @@ class CallGraphLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, name, dso, count, time, branch_count, parent_item):
-		super(CallGraphLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, call_path_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, call_path_id, name, dso, count, time, insn_cnt, cyc_cnt, branch_count, parent_item):
+		super(CallGraphLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, call_path_id, time, insn_cnt, cyc_cnt, branch_count, parent_item)
 		dso = dsoname(dso)
-		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
+		if self.params.have_ipc:
+			insn_pcnt = PercentToOneDP(insn_cnt, parent_item.insn_cnt)
+			cyc_pcnt = PercentToOneDP(cyc_cnt, parent_item.cyc_cnt)
+			br_pcnt = PercentToOneDP(branch_count, parent_item.branch_count)
+			ipc = CalcIPC(cyc_cnt, insn_cnt)
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(insn_cnt), insn_pcnt, str(cyc_cnt), cyc_pcnt, ipc, str(branch_count), br_pcnt ]
+		else:
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = call_path_id
 
 # Context-sensitive call graph data model level two item
@@ -546,18 +567,28 @@ class CallGraphLevelThreeItem(CallGraphLevelTwoPlusItemBase):
 class CallGraphLevelTwoItem(CallGraphLevelTwoPlusItemBase):
 
 	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallGraphLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 1, 0, 0, parent_item)
-		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
+		super(CallGraphLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 1, 0, 0, 0, 0, parent_item)
+		if self.params.have_ipc:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
 	def Select(self):
 		super(CallGraphLevelTwoItem, self).Select()
 		for child_item in self.child_items:
 			self.time += child_item.time
+			self.insn_cnt += child_item.insn_cnt
+			self.cyc_cnt += child_item.cyc_cnt
 			self.branch_count += child_item.branch_count
 		for child_item in self.child_items:
 			child_item.data[4] = PercentToOneDP(child_item.time, self.time)
-			child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			if self.params.have_ipc:
+				child_item.data[6] = PercentToOneDP(child_item.insn_cnt, self.insn_cnt)
+				child_item.data[8] = PercentToOneDP(child_item.cyc_cnt, self.cyc_cnt)
+				child_item.data[11] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			else:
+				child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
 
 # Context-sensitive call graph data model level one item
 
@@ -565,7 +596,10 @@ class CallGraphLevelOneItem(CallGraphLevelItemBase):
 
 	def __init__(self, glb, params, row, comm_id, comm, parent_item):
 		super(CallGraphLevelOneItem, self).__init__(glb, params, row, parent_item)
-		self.data = [comm, "", "", "", "", "", ""]
+		if self.params.have_ipc:
+			self.data = [comm, "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
 	def Select(self):
@@ -694,14 +728,23 @@ class CallGraphModel(CallGraphModelBase):
 		return CallGraphRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
-		return 7
+		if self.params.have_ipc:
+			return 12
+		else:
+			return 7
 
 	def columnHeader(self, column):
-		headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
+		if self.params.have_ipc:
+			headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Insn Cnt", "Insn Cnt (%)", "Cyc Cnt", "Cyc Cnt (%)", "IPC", "Branch Count ", "Branch Count (%) "]
+		else:
+			headers = ["Call Path", "Object", "Count ", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
 		return headers[column]
 
 	def columnAlignment(self, column):
-		alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		if self.params.have_ipc:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		else:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
 		return alignment[column]
 
 	def DoFindSelect(self, query, match):

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree
  2019-05-20 11:37 ` [PATCH 21/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree Adrian Hunter
@ 2019-06-17 19:11   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, jolsa, acme, mingo, tglx, hpa, adrian.hunter

Commit-ID:  b3b660792e049c7ef4a40c4caa7008efd4777b3c
Gitweb:     https://git.kernel.org/tip/b3b660792e049c7ef4a40c4caa7008efd4777b3c
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:27 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:57 -0300

perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree

Enhance the call tree to display IPC information if it is available.

Committer testing:

[acme@quaco adrian.hunter]$ python ~acme/libexec/perf-core/scripts/python/exported-sql-viewer.py ~/c/adrian.hunter/simple-retpoline.db

Reports -> Call Tree, then expand a few trees, then select with the
mouse and press control+C (copy):

Call Path                   Object        Call Time Time  Time(%) Insn  Insn   Cyc   Cyc   IPC Branch Branch
▼ simple-retpolin                                   (ns)          Cnt   Cnt(%) Cnt   Cnt(%)     Count Count(%)
  ▼ 23003:23003
    ▼ _start                ld-2.28.so    112195670 218295 100.0 127746 100.0 207320 100.0 0.62 13046 100.0
      ▶ unknown             unknown       112195987   3202   1.5      0   0.0      0   0.0    0     1   0.0
      ▶ _dl_start           ld-2.28.so    112199189 188471  86.3 123394  96.6 180007  86.8 0.69 12529  96.0
      ▼ _dl_init            ld-2.28.so    112387660  13406   6.1   3207   2.5  14868   7.2 0.22   327   2.5
        ▶ call_init.part.0  ld-2.28.so    112387773    117   0.9     70   2.2    639   4.3 0.11     3   0.9
        ▶ call_init.part.0  ld-2.28.so    112387890  13129  97.9   3103  96.8  14100  94.8 0.22   315  96.3
        ▶ call_init.part.0  ld-2.28.so    112401020      0   0.0      0   0.0      0   0.0    0     2   0.6
      ▼ _start              simple-retpol 112401066  12899   5.9   1142   0.9  11561   5.6 0.10   184   1.4
        ▶ unknown           unknown       112401388    846   6.6      0   0.0      0   0.0    0     1   0.5
        ▼ __libc_start_main libc-2.28.so  112402344  11621  90.1   1129  98.9  10350  89.5 0.11   181  98.4
          ▶ __cxa_atexit    libc-2.28.so  112402360   2302  19.8    101   8.9   1817  17.6 0.06    13   7.2
          ▶ __libc_csu_init simple-retpol 112404673    121   1.0     43   3.8    340   3.3 0.13     8   4.4
          ▶ _setjmp         libc-2.28.so  112404794     74   0.6     46   4.1    206   2.0 0.22     4   2.2
          ▼ main            simple-retpol 112404892     44   0.4     23   2.0    126   1.2 0.18    12   6.6
            ▼ foo           simple-retpol 112404892     19  43.2     12  52.2     55  43.7 0.22     5  41.7
                bar         simple-retpol 112404896     12  63.2      3  25.0     34  61.8 0.09     1  20.0
            ▼ foo           simple-retpol 112404911     25  56.8     11  47.8     71  56.3 0.15     5  41.7
              ▶ bar         simple-retpol 112404924     10  40.0      3  27.3     27  38.0 0.11     1  20.0
          ▶ exit            libc-2.28.so  112404936   9029  77.7    878  77.8   7765  75.0 0.11   139  76.8

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-22-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/exported-sql-viewer.py | 69 +++++++++++++++++++-----
 1 file changed, 56 insertions(+), 13 deletions(-)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index f5b1b63995b0..94489cf2ce0e 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -781,11 +781,13 @@ class CallGraphModel(CallGraphModelBase):
 
 class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item):
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, time, insn_cnt, cyc_cnt, branch_count, parent_item):
 		super(CallTreeLevelTwoPlusItemBase, self).__init__(glb, params, row, parent_item)
 		self.comm_id = comm_id
 		self.thread_id = thread_id
 		self.calls_id = calls_id
+		self.insn_cnt = insn_cnt
+		self.cyc_cnt = cyc_cnt
 		self.branch_count = branch_count
 		self.time = time
 
@@ -795,8 +797,12 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 			comm_thread = " AND comm_id = " + str(self.comm_id) + " AND thread_id = " + str(self.thread_id)
 		else:
 			comm_thread = ""
+		if self.params.have_ipc:
+			ipc_str = ", insn_count, cyc_count"
+		else:
+			ipc_str = ""
 		query = QSqlQuery(self.glb.db)
-		QueryExec(query, "SELECT calls.id, name, short_name, call_time, return_time - call_time, branch_count"
+		QueryExec(query, "SELECT calls.id, name, short_name, call_time, return_time - call_time" + ipc_str + ", branch_count"
 					" FROM calls"
 					" INNER JOIN call_paths ON calls.call_path_id = call_paths.id"
 					" INNER JOIN symbols ON call_paths.symbol_id = symbols.id"
@@ -804,7 +810,15 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 					" WHERE calls.parent_id = " + str(self.calls_id) + comm_thread +
 					" ORDER BY call_time, calls.id")
 		while query.next():
-			child_item = CallTreeLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), int(query.value(5)), self)
+			if self.params.have_ipc:
+				insn_cnt = int(query.value(5))
+				cyc_cnt = int(query.value(6))
+				branch_count = int(query.value(7))
+			else:
+				insn_cnt = 0
+				cyc_cnt = 0
+				branch_count = int(query.value(5))
+			child_item = CallTreeLevelThreeItem(self.glb, self.params, self.child_count, self.comm_id, self.thread_id, query.value(0), query.value(1), query.value(2), query.value(3), int(query.value(4)), insn_cnt, cyc_cnt, branch_count, self)
 			self.child_items.append(child_item)
 			self.child_count += 1
 
@@ -812,10 +826,17 @@ class CallTreeLevelTwoPlusItemBase(CallGraphLevelItemBase):
 
 class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 
-	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, name, dso, count, time, branch_count, parent_item):
-		super(CallTreeLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, calls_id, time, branch_count, parent_item)
+	def __init__(self, glb, params, row, comm_id, thread_id, calls_id, name, dso, count, time, insn_cnt, cyc_cnt, branch_count, parent_item):
+		super(CallTreeLevelThreeItem, self).__init__(glb, params, row, comm_id, thread_id, calls_id, time, insn_cnt, cyc_cnt, branch_count, parent_item)
 		dso = dsoname(dso)
-		self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
+		if self.params.have_ipc:
+			insn_pcnt = PercentToOneDP(insn_cnt, parent_item.insn_cnt)
+			cyc_pcnt = PercentToOneDP(cyc_cnt, parent_item.cyc_cnt)
+			br_pcnt = PercentToOneDP(branch_count, parent_item.branch_count)
+			ipc = CalcIPC(cyc_cnt, insn_cnt)
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(insn_cnt), insn_pcnt, str(cyc_cnt), cyc_pcnt, ipc, str(branch_count), br_pcnt ]
+		else:
+			self.data = [ name, dso, str(count), str(time), PercentToOneDP(time, parent_item.time), str(branch_count), PercentToOneDP(branch_count, parent_item.branch_count) ]
 		self.dbid = calls_id
 
 # Call tree data model level two item
@@ -823,18 +844,28 @@ class CallTreeLevelThreeItem(CallTreeLevelTwoPlusItemBase):
 class CallTreeLevelTwoItem(CallTreeLevelTwoPlusItemBase):
 
 	def __init__(self, glb, params, row, comm_id, thread_id, pid, tid, parent_item):
-		super(CallTreeLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 0, 0, 0, parent_item)
-		self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
+		super(CallTreeLevelTwoItem, self).__init__(glb, params, row, comm_id, thread_id, 0, 0, 0, 0, 0, parent_item)
+		if self.params.have_ipc:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [str(pid) + ":" + str(tid), "", "", "", "", "", ""]
 		self.dbid = thread_id
 
 	def Select(self):
 		super(CallTreeLevelTwoItem, self).Select()
 		for child_item in self.child_items:
 			self.time += child_item.time
+			self.insn_cnt += child_item.insn_cnt
+			self.cyc_cnt += child_item.cyc_cnt
 			self.branch_count += child_item.branch_count
 		for child_item in self.child_items:
 			child_item.data[4] = PercentToOneDP(child_item.time, self.time)
-			child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			if self.params.have_ipc:
+				child_item.data[6] = PercentToOneDP(child_item.insn_cnt, self.insn_cnt)
+				child_item.data[8] = PercentToOneDP(child_item.cyc_cnt, self.cyc_cnt)
+				child_item.data[11] = PercentToOneDP(child_item.branch_count, self.branch_count)
+			else:
+				child_item.data[6] = PercentToOneDP(child_item.branch_count, self.branch_count)
 
 # Call tree data model level one item
 
@@ -842,7 +873,10 @@ class CallTreeLevelOneItem(CallGraphLevelItemBase):
 
 	def __init__(self, glb, params, row, comm_id, comm, parent_item):
 		super(CallTreeLevelOneItem, self).__init__(glb, params, row, parent_item)
-		self.data = [comm, "", "", "", "", "", ""]
+		if self.params.have_ipc:
+			self.data = [comm, "", "", "", "", "", "", "", "", "", "", ""]
+		else:
+			self.data = [comm, "", "", "", "", "", ""]
 		self.dbid = comm_id
 
 	def Select(self):
@@ -885,14 +919,23 @@ class CallTreeModel(CallGraphModelBase):
 		return CallTreeRootItem(self.glb, self.params)
 
 	def columnCount(self, parent=None):
-		return 7
+		if self.params.have_ipc:
+			return 12
+		else:
+			return 7
 
 	def columnHeader(self, column):
-		headers = ["Call Path", "Object", "Call Time", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
+		if self.params.have_ipc:
+			headers = ["Call Path", "Object", "Call Time", "Time (ns) ", "Time (%) ", "Insn Cnt", "Insn Cnt (%)", "Cyc Cnt", "Cyc Cnt (%)", "IPC", "Branch Count ", "Branch Count (%) "]
+		else:
+			headers = ["Call Path", "Object", "Call Time", "Time (ns) ", "Time (%) ", "Branch Count ", "Branch Count (%) "]
 		return headers[column]
 
 	def columnAlignment(self, column):
-		alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		if self.params.have_ipc:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
+		else:
+			alignment = [ Qt.AlignLeft, Qt.AlignLeft, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight, Qt.AlignRight ]
 		return alignment[column]
 
 	def DoFindSelect(self, query, match):

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [tip:perf/core] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated
  2019-05-20 11:37 ` [PATCH 22/22] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated Adrian Hunter
@ 2019-06-17 19:11   ` tip-bot for Adrian Hunter
  0 siblings, 0 replies; 52+ messages in thread
From: tip-bot for Adrian Hunter @ 2019-06-17 19:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, mingo, jolsa, tglx, linux-kernel, adrian.hunter, hpa

Commit-ID:  80b3fb64a55a7e4ba1ef8f9a7e87fbe1a26dc709
Gitweb:     https://git.kernel.org/tip/80b3fb64a55a7e4ba1ef8f9a7e87fbe1a26dc709
Author:     Adrian Hunter <adrian.hunter@intel.com>
AuthorDate: Mon, 20 May 2019 14:37:28 +0300
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 5 Jun 2019 09:47:58 -0300

perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated

The user probably wants to replace the find text, so select the find
text when the find bar is activated.

That is fairly standard behaviour for search text entry.

Entering text will replace the current text, but using edit keys
(arrows, home, end etc) cancels the selection and enables editing.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-23-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/scripts/python/exported-sql-viewer.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/scripts/python/exported-sql-viewer.py b/tools/perf/scripts/python/exported-sql-viewer.py
index 94489cf2ce0e..6e7934f2ac9a 100755
--- a/tools/perf/scripts/python/exported-sql-viewer.py
+++ b/tools/perf/scripts/python/exported-sql-viewer.py
@@ -400,6 +400,7 @@ class FindBar():
 
 	def Activate(self):
 		self.bar.show()
+		self.textbox.lineEdit().selectAll()
 		self.textbox.setFocus()
 
 	def Deactivate(self):

^ permalink raw reply related	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2019-06-17 19:12 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-20 11:37 [PATCH 00/22] perf intel-pt: Add support for instructions-per-cycle (IPC) Adrian Hunter
2019-05-20 11:37 ` [PATCH 01/22] perf intel-pt: Fix itrace defaults for perf script Adrian Hunter
2019-05-20 13:58   ` Sasha Levin
2019-05-20 14:45   ` Arnaldo Carvalho de Melo
2019-05-31 16:45     ` Arnaldo Carvalho de Melo
2019-06-04 11:32       ` Adrian Hunter
2019-06-04 13:20         ` Arnaldo Carvalho de Melo
2019-05-30  7:53   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 02/22] perf auxtrace: " Adrian Hunter
2019-05-30  7:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 03/22] perf intel-pt: Fix itrace defaults for perf script intel-pt documentation Adrian Hunter
2019-05-30  7:54   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 04/22] perf intel-pt: Factor out intel_pt_update_sample_time Adrian Hunter
2019-06-17 18:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 05/22] perf intel-pt: Accumulate cycle count from CYC packets Adrian Hunter
2019-06-17 18:59   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 06/22] perf tools: Add IPC information to perf_sample Adrian Hunter
2019-06-17 19:00   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 07/22] perf intel-pt: Add support for samples to contain IPC ratio Adrian Hunter
2019-06-17 19:01   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 08/22] perf script: Add output of " Adrian Hunter
2019-06-17 19:01   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 09/22] perf intel-pt: Record when decoding PSB+ packets Adrian Hunter
2019-06-17 19:02   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 10/22] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip Adrian Hunter
2019-06-17 19:03   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 11/22] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets Adrian Hunter
2019-06-17 19:04   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 12/22] perf intel-pt: Document IPC usage Adrian Hunter
2019-06-17 19:04   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 13/22] perf thread-stack: Accumulate IPC information Adrian Hunter
2019-06-17 19:05   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 14/22] perf db-export: Add brief documentation Adrian Hunter
2019-06-17 19:06   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 15/22] perf db-export: Export IPC information Adrian Hunter
2019-06-17 19:07   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 16/22] perf scripts python: export-to-sqlite.py: " Adrian Hunter
2019-06-17 19:07   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 17/22] perf scripts python: export-to-postgresql.py: " Adrian Hunter
2019-06-17 19:08   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 18/22] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Adrian Hunter
2019-05-31 16:44   ` Arnaldo Carvalho de Melo
2019-06-03  5:50     ` Adrian Hunter
2019-06-17 19:09   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 19/22] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams Adrian Hunter
2019-06-17 19:09   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 20/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph Adrian Hunter
2019-06-17 19:10   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 21/22] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree Adrian Hunter
2019-06-17 19:11   ` [tip:perf/core] " tip-bot for Adrian Hunter
2019-05-20 11:37 ` [PATCH 22/22] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated Adrian Hunter
2019-06-17 19:11   ` [tip:perf/core] " tip-bot for Adrian Hunter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.