linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/7] perf arm-spe: Enable timestamp
@ 2021-04-03  7:23 Leo Yan
  2021-04-03  7:23 ` [PATCH v2 1/7] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS Leo Yan
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

As we know, the timestamp is important for AUX trace; it's mainly used
to correlate between perf events and AUX trace, allows to generate
events with time ordered manner.  There have several good examples of
enabling timestamp for AUX trace (like Intel-pt, Intel-bts, etc).

Since the conversion between TSC and kernel timestamp has been supported
on Arm64, TSC is a naming convention from x86, but perf now has reused
it to support Arm arch timer counter.

This patch set is to enable timestamp for Arm SPE trace.  It reads out
TSC parameters from mmap page and stores into auxtrace info structure;
the TSC parameters are used for conversion between timer counter and
kernel time and which is applied for Arm SPE samples.

This patch set can be clearly applied on perf/core branch with:

  commit 6859bc0e78c6 ("perf stat: Improve readability of shadow stats")

Ths patch series has been tested on Hisilicon D06 platform.

After:

  # perf script -F comm,time,cpu,pid,dso,ip,sym

              perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec ([kernel.kallsyms])
              perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680367:  ffffbd12539b03ec __arch_clear_user ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440 kmem_cache_alloc ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
   false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70 perf_iterate_ctx.constprop.0 ([kernel.kallsyms])

Changes from v1:
* Rebased patch series on the latest perf/core branch;
* Fixed the patch for dumping TSC parameters to support both the
  older and new auxtrace info format.


Leo Yan (7):
  perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS
  perf arm-spe: Store TSC parameters in auxtrace info
  perf arm-spe: Dump TSC parameters
  perf arm-spe: Convert event kernel time to counter value
  perf arm-spe: Assign kernel time to synthesized event
  perf arm-spe: Bail out if the trace is later than perf event
  perf arm-spe: Don't wait for PERF_RECORD_EXIT event

 tools/perf/arch/arm64/util/arm-spe.c | 23 +++++++
 tools/perf/util/arm-spe.c            | 89 +++++++++++++++++++++++-----
 tools/perf/util/arm-spe.h            |  7 ++-
 3 files changed, 103 insertions(+), 16 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/7] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-03  7:23 ` [PATCH v2 2/7] perf arm-spe: Store TSC parameters in auxtrace info Leo Yan
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

The enum value 'ARM_SPE_PER_CPU_MMAPS' is never used so remove it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
index 98d3235781c3..105ce0ea0a01 100644
--- a/tools/perf/util/arm-spe.h
+++ b/tools/perf/util/arm-spe.h
@@ -11,7 +11,6 @@
 
 enum {
 	ARM_SPE_PMU_TYPE,
-	ARM_SPE_PER_CPU_MMAPS,
 	ARM_SPE_AUXTRACE_PRIV_MAX,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/7] perf arm-spe: Store TSC parameters in auxtrace info
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
  2021-04-03  7:23 ` [PATCH v2 1/7] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-03  7:23 ` [PATCH v2 3/7] perf arm-spe: Dump TSC parameters Leo Yan
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

The TSC parameters are used for conversion between arch timer counter
and kernel timestamp, this patch stores the parameters into the struct
perf_record_auxtrace_info, and it is saved in perf data file.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/arch/arm64/util/arm-spe.c | 23 +++++++++++++++++++++++
 tools/perf/util/arm-spe.h            |  6 ++++++
 2 files changed, 29 insertions(+)

diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index 414c8a5584b1..dd940cf16f49 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -15,7 +15,9 @@
 #include "../../../util/event.h"
 #include "../../../util/evsel.h"
 #include "../../../util/evlist.h"
+#include "../../../util/mmap.h"
 #include "../../../util/session.h"
+#include "../../../util/tsc.h"
 #include <internal/lib.h> // page_size
 #include "../../../util/pmu.h"
 #include "../../../util/debug.h"
@@ -47,6 +49,9 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
 	struct arm_spe_recording *sper =
 			container_of(itr, struct arm_spe_recording, itr);
 	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
+	struct perf_event_mmap_page *pc;
+	struct perf_tsc_conversion tc = { .time_mult = 0, };
+	int err;
 
 	if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
 		return -EINVAL;
@@ -54,8 +59,26 @@ static int arm_spe_info_fill(struct auxtrace_record *itr,
 	if (!session->evlist->core.nr_mmaps)
 		return -EINVAL;
 
+	pc = session->evlist->mmap[0].core.base;
+	if (pc) {
+		err = perf_read_tsc_conversion(pc, &tc);
+		if (err) {
+			if (err != -EOPNOTSUPP)
+				return err;
+		}
+
+		if (!tc.time_mult)
+			ui__warning("Arm SPE: arch timer not available\n");
+	}
+
 	auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
 	auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
+	auxtrace_info->priv[ARM_SPE_TIME_SHIFT] = tc.time_shift;
+	auxtrace_info->priv[ARM_SPE_TIME_MULT] = tc.time_mult;
+	auxtrace_info->priv[ARM_SPE_TIME_ZERO] = tc.time_zero;
+	auxtrace_info->priv[ARM_SPE_TIME_CYCLES] = tc.time_cycles;
+	auxtrace_info->priv[ARM_SPE_TIME_MASK] = tc.time_mask;
+	auxtrace_info->priv[ARM_SPE_CAP_USER_TIME_SHORT] = tc.cap_user_time_short;
 
 	return 0;
 }
diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
index 105ce0ea0a01..5bf3e838d226 100644
--- a/tools/perf/util/arm-spe.h
+++ b/tools/perf/util/arm-spe.h
@@ -11,6 +11,12 @@
 
 enum {
 	ARM_SPE_PMU_TYPE,
+	ARM_SPE_TIME_SHIFT,
+	ARM_SPE_TIME_MULT,
+	ARM_SPE_TIME_ZERO,
+	ARM_SPE_TIME_CYCLES,
+	ARM_SPE_TIME_MASK,
+	ARM_SPE_CAP_USER_TIME_SHORT,
 	ARM_SPE_AUXTRACE_PRIV_MAX,
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/7] perf arm-spe: Dump TSC parameters
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
  2021-04-03  7:23 ` [PATCH v2 1/7] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS Leo Yan
  2021-04-03  7:23 ` [PATCH v2 2/7] perf arm-spe: Store TSC parameters in auxtrace info Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-03  7:23 ` [PATCH v2 4/7] perf arm-spe: Convert event kernel time to counter value Leo Yan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

The TSC parameters are stored in auxtrace info, this patch dumps these
parameters for reporting the raw data.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 42 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 2539d4baec44..69ce3483d1af 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -26,6 +26,7 @@
 #include "symbol.h"
 #include "thread.h"
 #include "thread-stack.h"
+#include "tsc.h"
 #include "tool.h"
 #include "util/synthetic-events.h"
 
@@ -45,6 +46,8 @@ struct arm_spe {
 	struct machine			*machine;
 	u32				pmu_type;
 
+	struct perf_tsc_conversion	tc;
+
 	u8				timeless_decoding;
 	u8				data_queued;
 
@@ -803,14 +806,23 @@ static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
 
 static const char * const arm_spe_info_fmts[] = {
 	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
+	[ARM_SPE_TIME_SHIFT]		= "  Time Shift         %"PRIu64"\n",
+	[ARM_SPE_TIME_MULT]		= "  Time Muliplier     %"PRIu64"\n",
+	[ARM_SPE_TIME_ZERO]		= "  Time Zero          %"PRIu64"\n",
+	[ARM_SPE_TIME_CYCLES]		= "  Time Cycles        %"PRIu64"\n",
+	[ARM_SPE_TIME_MASK]		= "  Time Mask          %#"PRIx64"\n",
+	[ARM_SPE_CAP_USER_TIME_SHORT]	= "  Cap Time Short     %"PRId64"\n",
 };
 
-static void arm_spe_print_info(__u64 *arr)
+static void arm_spe_print_info(__u64 *arr, int start, int finish)
 {
+	int i;
+
 	if (!dump_trace)
 		return;
 
-	fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
+	for (i = start; i <= finish; i++)
+		fprintf(stdout, arm_spe_info_fmts[i], arr[i]);
 }
 
 struct arm_spe_synth {
@@ -1001,11 +1013,19 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
 	return 0;
 }
 
+static bool arm_spe_has(struct perf_record_auxtrace_info *auxtrace_info,
+			int pos)
+{
+	return auxtrace_info->header.size >=
+		(sizeof(struct perf_record_auxtrace_info) +
+		 (sizeof(u64) * (pos + 1)));
+}
+
 int arm_spe_process_auxtrace_info(union perf_event *event,
 				  struct perf_session *session)
 {
 	struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
-	size_t min_sz = sizeof(u64) * ARM_SPE_AUXTRACE_PRIV_MAX;
+	size_t min_sz = sizeof(u64) * ARM_SPE_TIME_SHIFT;
 	struct arm_spe *spe;
 	int err;
 
@@ -1025,6 +1045,20 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->machine = &session->machines.host; /* No kvm support */
 	spe->auxtrace_type = auxtrace_info->type;
 	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
+	arm_spe_print_info(&auxtrace_info->priv[0], ARM_SPE_PMU_TYPE,
+			   ARM_SPE_PMU_TYPE);
+
+	if (arm_spe_has(auxtrace_info, ARM_SPE_CAP_USER_TIME_SHORT)) {
+		spe->tc.time_shift = auxtrace_info->priv[ARM_SPE_TIME_SHIFT];
+		spe->tc.time_mult = auxtrace_info->priv[ARM_SPE_TIME_MULT];
+		spe->tc.time_zero = auxtrace_info->priv[ARM_SPE_TIME_ZERO];
+		spe->tc.time_cycles = auxtrace_info->priv[ARM_SPE_TIME_CYCLES];
+		spe->tc.time_mask = auxtrace_info->priv[ARM_SPE_TIME_MASK];
+		spe->tc.cap_user_time_short =
+			auxtrace_info->priv[ARM_SPE_CAP_USER_TIME_SHORT];
+		arm_spe_print_info(&auxtrace_info->priv[0], ARM_SPE_TIME_SHIFT,
+				   ARM_SPE_CAP_USER_TIME_SHORT);
+	}
 
 	spe->timeless_decoding = arm_spe__is_timeless_decoding(spe);
 	spe->auxtrace.process_event = arm_spe_process_event;
@@ -1035,8 +1069,6 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
 	session->auxtrace = &spe->auxtrace;
 
-	arm_spe_print_info(&auxtrace_info->priv[0]);
-
 	if (dump_trace)
 		return 0;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 4/7] perf arm-spe: Convert event kernel time to counter value
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
                   ` (2 preceding siblings ...)
  2021-04-03  7:23 ` [PATCH v2 3/7] perf arm-spe: Dump TSC parameters Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-03  7:23 ` [PATCH v2 5/7] perf arm-spe: Assign kernel time to synthesized event Leo Yan
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

When handle a perf event, Arm SPE decoder needs to decide if this perf
event is earlier or later than the samples from Arm SPE trace data; to
do comparision, it needs to use the same unit for the time.

This patch converts the event kernel time to arch timer's counter value,
thus it can be used to compare with counter value contained in Arm SPE
Timestamp packet.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 69ce3483d1af..4cf558b0218a 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -669,7 +669,7 @@ static int arm_spe_process_event(struct perf_session *session,
 	}
 
 	if (sample->time && (sample->time != (u64) -1))
-		timestamp = sample->time;
+		timestamp = perf_time_to_tsc(sample->time, &spe->tc);
 	else
 		timestamp = 0;
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 5/7] perf arm-spe: Assign kernel time to synthesized event
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
                   ` (3 preceding siblings ...)
  2021-04-03  7:23 ` [PATCH v2 4/7] perf arm-spe: Convert event kernel time to counter value Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-03  7:23 ` [PATCH v2 6/7] perf arm-spe: Bail out if the trace is later than perf event Leo Yan
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

In current code, it assigns the arch timer counter to the synthesized
samples Arm SPE trace, thus the samples don't contain the kernel time
but only contain the raw counter value.

To fix the issue, this patch converts the timer counter to kernel time
and assigns it to sample timestamp.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 4cf558b0218a..80f5659e7f7e 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -234,7 +234,7 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
 	struct arm_spe_record *record = &speq->decoder->record;
 
 	if (!spe->timeless_decoding)
-		sample->time = speq->timestamp;
+		sample->time = tsc_to_perf_time(record->timestamp, &spe->tc);
 
 	sample->ip = record->from_ip;
 	sample->cpumode = arm_spe_cpumode(spe, sample->ip);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 6/7] perf arm-spe: Bail out if the trace is later than perf event
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
                   ` (4 preceding siblings ...)
  2021-04-03  7:23 ` [PATCH v2 5/7] perf arm-spe: Assign kernel time to synthesized event Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-03  7:23 ` [PATCH v2 7/7] perf arm-spe: Don't wait for PERF_RECORD_EXIT event Leo Yan
  2021-04-06  9:38 ` [PATCH v2 0/7] perf arm-spe: Enable timestamp Al Grant
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

It's possible that record in Arm SPE trace is later than perf event and
vice versa.  This asks to correlate the perf events and Arm SPE
synthesized events to be processed in the manner of correct timing.

To achieve the time ordering, this patch reverses the flow, it firstly
calls arm_spe_sample() and then calls arm_spe_decode().  By comparing
the timestamp value and detect the perf event is coming earlier than Arm
SPE trace data, it bails out from the decoding loop, the last record is
pushed into auxtrace stack and is deferred to generate sample.  To track
the timestamp, everytime it updates timestamp for the latest record.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 80f5659e7f7e..99a394c366e0 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -434,12 +434,36 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 {
 	struct arm_spe *spe = speq->spe;
+	struct arm_spe_record *record;
 	int ret;
 
 	if (!spe->kernel_start)
 		spe->kernel_start = machine__kernel_start(spe->machine);
 
 	while (1) {
+		/*
+		 * The usual logic is firstly to decode the packets, and then
+		 * based the record to synthesize sample; but here the flow is
+		 * reversed: it calls arm_spe_sample() for synthesizing samples
+		 * prior to arm_spe_decode().
+		 *
+		 * Two reasons for this code logic:
+		 * 1. Firstly, when setup queue in arm_spe__setup_queue(), it
+		 * has decoded trace data and generated a record, but the record
+		 * is left to generate sample until run to here, so it's correct
+		 * to synthesize sample for the left record.
+		 * 2. After decoding trace data, it needs to compare the record
+		 * timestamp with the coming perf event, if the record timestamp
+		 * is later than the perf event, it needs bail out and pushs the
+		 * record into auxtrace heap, thus the record can be deferred to
+		 * synthesize sample until run to here at the next time; so this
+		 * can correlate samples between Arm SPE trace data and other
+		 * perf events with correct time ordering.
+		 */
+		ret = arm_spe_sample(speq);
+		if (ret)
+			return ret;
+
 		ret = arm_spe_decode(speq->decoder);
 		if (!ret) {
 			pr_debug("No data or all data has been processed.\n");
@@ -453,10 +477,17 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 		if (ret < 0)
 			continue;
 
-		ret = arm_spe_sample(speq);
-		if (ret)
-			return ret;
+		record = &speq->decoder->record;
 
+		/* Update timestamp for the last record */
+		if (record->timestamp > speq->timestamp)
+			speq->timestamp = record->timestamp;
+
+		/*
+		 * If the timestamp of the queue is later than timestamp of the
+		 * coming perf event, bail out so can allow the perf event to
+		 * be processed ahead.
+		 */
 		if (!spe->timeless_decoding && speq->timestamp >= *timestamp) {
 			*timestamp = speq->timestamp;
 			return 0;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 7/7] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
                   ` (5 preceding siblings ...)
  2021-04-03  7:23 ` [PATCH v2 6/7] perf arm-spe: Bail out if the trace is later than perf event Leo Yan
@ 2021-04-03  7:23 ` Leo Yan
  2021-04-06  9:38 ` [PATCH v2 0/7] perf arm-spe: Enable timestamp Al Grant
  7 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-03  7:23 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Al Grant, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: Leo Yan

When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last
perf event) for processing trace data, which is needless and even might
cause logic error, e.g. it might fail to correlate perf events with Arm
SPE events correctly.

So this patch removes the condition checking for PERF_RECORD_EXIT event.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 99a394c366e0..17dcad99912a 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -717,11 +717,7 @@ static int arm_spe_process_event(struct perf_session *session,
 					sample->time);
 		}
 	} else if (timestamp) {
-		if (event->header.type == PERF_RECORD_EXIT) {
-			err = arm_spe_process_queues(spe, timestamp);
-			if (err)
-				return err;
-		}
+		err = arm_spe_process_queues(spe, timestamp);
 	}
 
 	return err;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* RE: [PATCH v2 0/7] perf arm-spe: Enable timestamp
  2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
                   ` (6 preceding siblings ...)
  2021-04-03  7:23 ` [PATCH v2 7/7] perf arm-spe: Don't wait for PERF_RECORD_EXIT event Leo Yan
@ 2021-04-06  9:38 ` Al Grant
  2021-04-07 13:15   ` Leo Yan
  7 siblings, 1 reply; 12+ messages in thread
From: Al Grant @ 2021-04-06  9:38 UTC (permalink / raw)
  To: leo.yan, Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Adrian Hunter, linux-arm-kernel, linux-kernel
  Cc: leo.yan



> -----Original Message-----
> From: Leo Yan <leo.yan@linaro.org>
> Sent: 03 April 2021 08:24
> To: Arnaldo Carvalho de Melo <acme@kernel.org>; John Garry
> <john.garry@huawei.com>; Will Deacon <will@kernel.org>; Mathieu Poirier
> <mathieu.poirier@linaro.org>; James Clark <James.Clark@arm.com>; Al Grant
> <Al.Grant@arm.com>; Peter Zijlstra <peterz@infradead.org>; Ingo Molnar
> <mingo@redhat.com>; Mark Rutland <Mark.Rutland@arm.com>; Alexander
> Shishkin <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@redhat.com>;
> Namhyung Kim <namhyung@kernel.org>; Adrian Hunter
> <adrian.hunter@intel.com>; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org
> Cc: leo.yan@linaro.org
> Subject: [PATCH v2 0/7] perf arm-spe: Enable timestamp
> 
> As we know, the timestamp is important for AUX trace; it's mainly used to
> correlate between perf events and AUX trace, allows to generate events with
> time ordered manner.  There have several good examples of enabling timestamp
> for AUX trace (like Intel-pt, Intel-bts, etc).
> 
> Since the conversion between TSC and kernel timestamp has been supported on
> Arm64, TSC is a naming convention from x86, but perf now has reused it to
> support Arm arch timer counter.
> 
> This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
> parameters from mmap page and stores into auxtrace info structure;

Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
capture the TSC parameters from the mmap page? If a generic mechanism
exists it would be better to use it, otherwise we'll have to do this again for
future trace formats.

perf_read_tsc_conversion and perf_event__synth_time_conv are currently
in arch/x86/util/tsc.c, but nothing in them is x86-specific and they could be
moved somewhere more generic.

Al


> the TSC
> parameters are used for conversion between timer counter and kernel time and
> which is applied for Arm SPE samples.
> 
> This patch set can be clearly applied on perf/core branch with:
> 
>   commit 6859bc0e78c6 ("perf stat: Improve readability of shadow stats")
> 
> Ths patch series has been tested on Hisilicon D06 platform.
> 
> After:
> 
>   # perf script -F comm,time,cpu,pid,dso,ip,sym
> 
>               perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec
> ([kernel.kallsyms])
>               perf  2408 [032]   168.680297:  ffffbd1253690a3c perf_event_exec
> ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680317:  ffffbd1253683f50
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680319:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680367:  ffffbd12539b03ec
> __arch_clear_user ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680375:  ffffbd1253721440
> kmem_cache_alloc ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
>    false_sharing.e  2408 [032]   168.680376:  ffffbd1253683f70
> perf_iterate_ctx.constprop.0 ([kernel.kallsyms])
> 
> Changes from v1:
> * Rebased patch series on the latest perf/core branch;
> * Fixed the patch for dumping TSC parameters to support both the
>   older and new auxtrace info format.
> 
> 
> Leo Yan (7):
>   perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS
>   perf arm-spe: Store TSC parameters in auxtrace info
>   perf arm-spe: Dump TSC parameters
>   perf arm-spe: Convert event kernel time to counter value
>   perf arm-spe: Assign kernel time to synthesized event
>   perf arm-spe: Bail out if the trace is later than perf event
>   perf arm-spe: Don't wait for PERF_RECORD_EXIT event
> 
>  tools/perf/arch/arm64/util/arm-spe.c | 23 +++++++
>  tools/perf/util/arm-spe.c            | 89 +++++++++++++++++++++++-----
>  tools/perf/util/arm-spe.h            |  7 ++-
>  3 files changed, 103 insertions(+), 16 deletions(-)
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/7] perf arm-spe: Enable timestamp
  2021-04-06  9:38 ` [PATCH v2 0/7] perf arm-spe: Enable timestamp Al Grant
@ 2021-04-07 13:15   ` Leo Yan
  2021-04-07 13:28     ` Adrian Hunter
  0 siblings, 1 reply; 12+ messages in thread
From: Leo Yan @ 2021-04-07 13:15 UTC (permalink / raw)
  To: Al Grant
  Cc: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Adrian Hunter, linux-arm-kernel, linux-kernel

Hi Al,

On Tue, Apr 06, 2021 at 09:38:32AM +0000, Al Grant wrote:

[...]

> > This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
> > parameters from mmap page and stores into auxtrace info structure;
> 
> Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
> capture the TSC parameters from the mmap page? If a generic mechanism
> exists it would be better to use it, otherwise we'll have to do this again for
> future trace formats.

Good point!  Actually "perf record" tool has synthesized event
PERF_RECORD_TIME_CONV.  This patch series is studying the
implementation from Intel-PT, so the question is why the existed
implementations (like Intel-PT, Intel-BTS) don't directly use
PERF_RECORD_TIME_CONV for retriving TSC parameters.

I agree using PERF_RECORD_TIME_CONV for TSC parameter is better than
extending auxtrace info.  Will experiment for this.

> perf_read_tsc_conversion and perf_event__synth_time_conv are currently
> in arch/x86/util/tsc.c, but nothing in them is x86-specific and they could be
> moved somewhere more generic.

This is not true on the mainline kernel; these functions have been
moved into the file util/tsc.c.

Thanks for suggestions,
Leo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/7] perf arm-spe: Enable timestamp
  2021-04-07 13:15   ` Leo Yan
@ 2021-04-07 13:28     ` Adrian Hunter
  2021-04-07 13:40       ` Leo Yan
  0 siblings, 1 reply; 12+ messages in thread
From: Adrian Hunter @ 2021-04-07 13:28 UTC (permalink / raw)
  To: Leo Yan, Al Grant
  Cc: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-arm-kernel, linux-kernel

On 7/04/21 4:15 pm, Leo Yan wrote:
> Hi Al,
> 
> On Tue, Apr 06, 2021 at 09:38:32AM +0000, Al Grant wrote:
> 
> [...]
> 
>>> This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
>>> parameters from mmap page and stores into auxtrace info structure;
>>
>> Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
>> capture the TSC parameters from the mmap page? If a generic mechanism
>> exists it would be better to use it, otherwise we'll have to do this again for
>> future trace formats.
> 
> Good point!  Actually "perf record" tool has synthesized event
> PERF_RECORD_TIME_CONV.  This patch series is studying the
> implementation from Intel-PT, so the question is why the existed
> implementations (like Intel-PT, Intel-BTS) don't directly use
> PERF_RECORD_TIME_CONV for retriving TSC parameters.

PERF_RECORD_TIME_CONV was added later because the TSC information is
needed by jitdump.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/7] perf arm-spe: Enable timestamp
  2021-04-07 13:28     ` Adrian Hunter
@ 2021-04-07 13:40       ` Leo Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Leo Yan @ 2021-04-07 13:40 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Al Grant, Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, James Clark, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-arm-kernel, linux-kernel

On Wed, Apr 07, 2021 at 04:28:40PM +0300, Adrian Hunter wrote:
> On 7/04/21 4:15 pm, Leo Yan wrote:
> > Hi Al,
> > 
> > On Tue, Apr 06, 2021 at 09:38:32AM +0000, Al Grant wrote:
> > 
> > [...]
> > 
> >>> This patch set is to enable timestamp for Arm SPE trace.  It reads out TSC
> >>> parameters from mmap page and stores into auxtrace info structure;
> >>
> >> Why not synthesize a PERF_RECORD_TIME_CONV - isn't that specifically to
> >> capture the TSC parameters from the mmap page? If a generic mechanism
> >> exists it would be better to use it, otherwise we'll have to do this again for
> >> future trace formats.
> > 
> > Good point!  Actually "perf record" tool has synthesized event
> > PERF_RECORD_TIME_CONV.  This patch series is studying the
> > implementation from Intel-PT, so the question is why the existed
> > implementations (like Intel-PT, Intel-BTS) don't directly use
> > PERF_RECORD_TIME_CONV for retriving TSC parameters.
> 
> PERF_RECORD_TIME_CONV was added later because the TSC information is
> needed by jitdump.

Thanks for the info, Adrian.

If so, it's good for Arm SPE to use PERF_RECORD_TIME_CONV for TSC
parameters.  Will spin patch series for this.

Leo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-04-07 13:41 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-03  7:23 [PATCH v2 0/7] perf arm-spe: Enable timestamp Leo Yan
2021-04-03  7:23 ` [PATCH v2 1/7] perf arm-spe: Remove unused enum value ARM_SPE_PER_CPU_MMAPS Leo Yan
2021-04-03  7:23 ` [PATCH v2 2/7] perf arm-spe: Store TSC parameters in auxtrace info Leo Yan
2021-04-03  7:23 ` [PATCH v2 3/7] perf arm-spe: Dump TSC parameters Leo Yan
2021-04-03  7:23 ` [PATCH v2 4/7] perf arm-spe: Convert event kernel time to counter value Leo Yan
2021-04-03  7:23 ` [PATCH v2 5/7] perf arm-spe: Assign kernel time to synthesized event Leo Yan
2021-04-03  7:23 ` [PATCH v2 6/7] perf arm-spe: Bail out if the trace is later than perf event Leo Yan
2021-04-03  7:23 ` [PATCH v2 7/7] perf arm-spe: Don't wait for PERF_RECORD_EXIT event Leo Yan
2021-04-06  9:38 ` [PATCH v2 0/7] perf arm-spe: Enable timestamp Al Grant
2021-04-07 13:15   ` Leo Yan
2021-04-07 13:28     ` Adrian Hunter
2021-04-07 13:40       ` Leo Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).