linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Clark Williams <williams@redhat.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Adrian Hunter <adrian.hunter@intel.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 14/85] perf intel-pt: Accumulate cycle count from TSC/TMA/MTC packets
Date: Tue, 11 Jun 2019 15:58:00 -0300	[thread overview]
Message-ID: <20190611185911.11645-15-acme@kernel.org> (raw)
In-Reply-To: <20190611185911.11645-1-acme@kernel.org>

From: Adrian Hunter <adrian.hunter@intel.com>

When CYC packets are not available, it is still possible to count cycles
using TSC/TMA/MTC timestamps.

As the timestamp increments in TSC ticks, convert to CPU cycles using
the current core-to-bus ratio.

Do not accumulate cycles when control flow packet generation is not
enabled, nor when time has been "lost", typically due to mwait, which is
indicated by a TSC/TMA packet that is not part of PSB+.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 .../util/intel-pt-decoder/intel-pt-decoder.c  | 51 +++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
index 99773445872d..9eb778f9c911 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -163,6 +163,9 @@ struct intel_pt_decoder {
 	uint64_t last_masked_timestamp;
 	uint64_t tot_cyc_cnt;
 	uint64_t sample_tot_cyc_cnt;
+	uint64_t base_cyc_cnt;
+	uint64_t cyc_cnt_timestamp;
+	double tsc_to_cyc;
 	bool continuous_period;
 	bool overflow;
 	bool set_fup_tx_flags;
@@ -1423,6 +1426,42 @@ static int intel_pt_overflow(struct intel_pt_decoder *decoder)
 	return -EOVERFLOW;
 }
 
+static inline void intel_pt_mtc_cyc_cnt_pge(struct intel_pt_decoder *decoder)
+{
+	if (decoder->have_cyc)
+		return;
+
+	decoder->cyc_cnt_timestamp = decoder->timestamp;
+	decoder->base_cyc_cnt = decoder->tot_cyc_cnt;
+}
+
+static inline void intel_pt_mtc_cyc_cnt_cbr(struct intel_pt_decoder *decoder)
+{
+	decoder->tsc_to_cyc = decoder->cbr / decoder->max_non_turbo_ratio_fp;
+
+	if (decoder->pge)
+		intel_pt_mtc_cyc_cnt_pge(decoder);
+}
+
+static inline void intel_pt_mtc_cyc_cnt_upd(struct intel_pt_decoder *decoder)
+{
+	uint64_t tot_cyc_cnt, tsc_delta;
+
+	if (decoder->have_cyc)
+		return;
+
+	decoder->sample_cyc = true;
+
+	if (!decoder->pge || decoder->timestamp <= decoder->cyc_cnt_timestamp)
+		return;
+
+	tsc_delta = decoder->timestamp - decoder->cyc_cnt_timestamp;
+	tot_cyc_cnt = tsc_delta * decoder->tsc_to_cyc + decoder->base_cyc_cnt;
+
+	if (tot_cyc_cnt > decoder->tot_cyc_cnt)
+		decoder->tot_cyc_cnt = tot_cyc_cnt;
+}
+
 static void intel_pt_calc_tma(struct intel_pt_decoder *decoder)
 {
 	uint32_t ctc = decoder->packet.payload;
@@ -1432,6 +1471,11 @@ static void intel_pt_calc_tma(struct intel_pt_decoder *decoder)
 	if (!decoder->tsc_ctc_ratio_d)
 		return;
 
+	if (decoder->pge && !decoder->in_psb)
+		intel_pt_mtc_cyc_cnt_pge(decoder);
+	else
+		intel_pt_mtc_cyc_cnt_upd(decoder);
+
 	decoder->last_mtc = (ctc >> decoder->mtc_shift) & 0xff;
 	decoder->ctc_timestamp = decoder->tsc_timestamp - fc;
 	if (decoder->tsc_ctc_mult) {
@@ -1487,6 +1531,8 @@ static void intel_pt_calc_mtc_timestamp(struct intel_pt_decoder *decoder)
 	else
 		decoder->timestamp = timestamp;
 
+	intel_pt_mtc_cyc_cnt_upd(decoder);
+
 	decoder->timestamp_insn_cnt = 0;
 	decoder->last_mtc = mtc;
 
@@ -1511,6 +1557,8 @@ static void intel_pt_calc_cbr(struct intel_pt_decoder *decoder)
 
 	decoder->cbr = cbr;
 	decoder->cbr_cyc_to_tsc = decoder->max_non_turbo_ratio_fp / cbr;
+
+	intel_pt_mtc_cyc_cnt_cbr(decoder);
 }
 
 static void intel_pt_calc_cyc_timestamp(struct intel_pt_decoder *decoder)
@@ -1706,6 +1754,7 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder)
 				decoder->state.to_ip = decoder->ip;
 			}
 			decoder->state.type |= INTEL_PT_TRACE_BEGIN;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			return 0;
 
 		case INTEL_PT_TIP:
@@ -1776,6 +1825,7 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGE: {
 			decoder->pge = true;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			if (decoder->packet.count == 0) {
 				intel_pt_log_at("Skipping zero TIP.PGE",
 						decoder->pos);
@@ -2138,6 +2188,7 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder)
 
 		case INTEL_PT_TIP_PGE:
 			decoder->pge = true;
+			intel_pt_mtc_cyc_cnt_pge(decoder);
 			if (intel_pt_have_ip(decoder))
 				intel_pt_set_ip(decoder);
 			if (!decoder->ip)
-- 
2.20.1


  parent reply	other threads:[~2019-06-11 19:00 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-11 18:57 [GIT PULL] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 01/85] perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 02/85] perf data: Document memory topology header: HEADER_MEM_TOPOLOGY Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 03/85] perf data: Document clockid header: HEADER_CLOCKID Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 04/85] perf data: Document directory format header: HEADER_DIR_FORMAT Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 05/85] perf symbols: Remove unused variable 'err' Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 06/85] perf record: Allow mixing --user-regs with --call-graph=dwarf Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 07/85] perf intel-pt: Factor out intel_pt_update_sample_time Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 08/85] perf intel-pt: Accumulate cycle count from CYC packets Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 09/85] perf tools: Add IPC information to perf_sample Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 10/85] perf intel-pt: Add support for samples to contain IPC ratio Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 11/85] perf script: Add output of " Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 12/85] perf intel-pt: Record when decoding PSB+ packets Arnaldo Carvalho de Melo
2019-06-11 18:57 ` [PATCH 13/85] perf intel-pt: Re-factor TIP cases in intel_pt_walk_to_ip Arnaldo Carvalho de Melo
2019-06-11 18:58 ` Arnaldo Carvalho de Melo [this message]
2019-06-11 18:58 ` [PATCH 15/85] perf intel-pt: Document IPC usage Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 16/85] perf thread-stack: Accumulate IPC information Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 17/85] perf db-export: Add brief documentation Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 18/85] perf db-export: Export IPC information Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 19/85] perf scripts python: export-to-sqlite.py: " Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 20/85] perf scripts python: export-to-postgresql.py: " Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 21/85] perf scripts python: exported-sql-viewer.py: Add IPC information to the Branch reports Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 22/85] perf scripts python: exported-sql-viewer.py: Add CallGraphModelParams Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 23/85] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Graph Graph Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 24/85] perf scripts python: exported-sql-viewer.py: Add IPC information to Call Tree Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 25/85] perf scripts python: exported-sql-viewer.py: Select find text when find bar is activated Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 26/85] perf augmented_raw_syscalls: Tell which args are filenames and how many bytes to copy Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 27/85] perf augmented_raw_syscalls: Move the probe_read_str to a separate function Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 28/85] perf augmented_raw_syscalls: Change helper to consider just the augmented_filename part Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 29/85] perf augmented_raw_syscalls: Move reading filename to the loop Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 30/85] perf jvmti: Address gcc string overflow warning for strncpy() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 31/85] perf trace: Consume the augmented_raw_syscalls payload Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 32/85] perf trace: Associate more argument names with the filename beautifier Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 33/85] perf trace: Exit when failing to build eBPF program Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 34/85] perf config: Bail out when a handler returns failure for a key-value pair Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 35/85] perf record: Add support to collect callchains from kernel or user space only Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 36/85] perf evsel: Remove superfluous nthreads system_wide setup in alloc_fd() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 37/85] perf cs-etm: Configure contextID tracing in CPU-wide mode Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 38/85] perf cs-etm: Configure timestamp generation " Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 39/85] perf cs-etm: Configure SWITCH_EVENTS " Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 40/85] perf cs-etm: Add handling of itrace start events Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 41/85] perf cs-etm: Add handling of switch-CPU-wide events Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 42/85] perf cs-etm: Refactor error path in cs_etm_decoder__new() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 43/85] perf cs-etm: Move packet queue out of decoder structure Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 44/85] perf cs-etm: Fix indentation in function cs_etm__process_decoder_queue() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 45/85] perf cs-etm: Introduce the concept of trace ID queues Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 46/85] perf cs-etm: Get rid of unused cpu in struct cs_etm_queue Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 47/85] perf cs-etm: Move thread to traceid_queue Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 48/85] perf cs-etm: Move tid/pid " Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 49/85] perf cs-etm: Use traceID aware memory callback API Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 50/85] perf cs-etm: Add support for multiple traceID queues Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 51/85] perf cs-etm: Linking PE contextID with perf thread mechanic Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 52/85] perf cs-etm: Add notion of time to decoding code Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 53/85] perf cs-etm: Add support for CPU-wide trace scenarios Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 54/85] perf cpumap: Retrieve die id information Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 55/85] perf header: Add die information in CPU topology Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 56/85] perf stat: Support per-die aggregation Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 57/85] perf header: Rename "sibling cores" to "sibling sockets" Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 58/85] perf tools: Apply new CPU topology sysfs attributes Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 59/85] perf data: Fix perf.data documentation for HEADER_CPU_TOPOLOGY Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 60/85] perf cs-etm: Properly set the value of 'old' and 'head' in snapshot mode Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 61/85] perf cs-etm: Remove duplicate GENMASK() define, use linux/bits.h instead Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 62/85] perf config: Update default value for llvm.clang-bpf-cmd-template Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 63/85] perf auxtrace: Add perf time interval to itrace_synth_ops Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 64/85] perf script: Set perf time interval in itrace_synth_ops Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 65/85] perf report: " Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 66/85] perf intel-pt: Add lookahead callback Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 67/85] perf intel-pt: Factor out intel_pt_8b_tsc() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 68/85] perf intel-pt: Factor out intel_pt_reposition() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 69/85] perf intel-pt: Add reposition parameter to intel_pt_get_data() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 70/85] perf intel-pt: Add intel_pt_fast_forward() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 71/85] perf intel-pt: Factor out intel_pt_get_buffer() Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 72/85] perf intel-pt: Add support for lookahead Arnaldo Carvalho de Melo
2019-06-11 18:58 ` [PATCH 73/85] perf intel-pt: Add support for efficient time interval filtering Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 74/85] perf time-utils: Treat time ranges consistently Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 75/85] perf time-utils: Factor out set_percent_time() Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 76/85] perf time-utils: Prevent percentage time range overlap Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 77/85] perf time-utils: Fix --time documentation Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 78/85] perf time-utils: Simplify perf_time__parse_for_ranges() error paths slightly Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 79/85] perf time-utils: Make perf_time__parse_for_ranges() more logical Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 80/85] perf tests: Add a test for time-utils Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 81/85] perf time-utils: Add support for multiple explicit time intervals Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 82/85] perf test 6: Fix missing kvm module load for s390 Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 83/85] perf report: Fix OOM error in TUI mode on s390 Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 84/85] perf report: Support s390 diag event display on x86 Arnaldo Carvalho de Melo
2019-06-11 18:59 ` [PATCH 85/85] perf trace: Skip unknown syscalls when expanding strace like syscall groups Arnaldo Carvalho de Melo
2019-06-17 18:48 ` [GIT PULL] perf/core improvements and fixes Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190611185911.11645-15-acme@kernel.org \
    --to=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=adrian.hunter@intel.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).