linux-perf-users.vger.kernel.org archive mirror
* [PATCH v5 0/5] perf arm-spe: Enable timestamp
@ 2021-05-19  7:19 Leo Yan
  2021-05-19  7:19 ` [PATCH v5 1/5] perf arm-spe: Save clock parameters from TIME_CONV event Leo Yan
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Leo Yan @ 2021-05-19  7:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

This patch set enables timestamps for Arm SPE trace.  It reads the TSC
parameters from the TIME_CONV event and uses them to convert between the
timer counter and kernel time; the calculated timestamps are then
assigned to Arm SPE samples.
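
As a rough illustration of the counter-to-time direction, the sketch below
follows the conversion scheme documented for perf_event_mmap_page
(time_zero, time_mult, time_shift).  The names "struct clock_params" and
"counter_to_ns()" are hypothetical and exist only for this example; this is
not the perf tool's own code, and the short-counter handling (time_cycles,
time_mask) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the clock parameters carried by TIME_CONV. */
struct clock_params {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/*
 * Convert a raw timer counter value to kernel time in nanoseconds:
 *
 *   ns = time_zero + ((cyc * time_mult) >> time_shift)
 *
 * The counter is split into a quotient and a remainder to reduce the
 * risk of overflowing 64 bits in the multiplication.
 */
static uint64_t counter_to_ns(uint64_t cyc, const struct clock_params *p)
{
	uint64_t quot, rem;

	assert(p->time_shift < 64);

	quot = cyc >> p->time_shift;
	rem = cyc & (((uint64_t)1 << p->time_shift) - 1);

	return p->time_zero + quot * p->time_mult +
	       ((rem * p->time_mult) >> p->time_shift);
}
```

With time_shift = 1 and time_mult = 20, each counter tick is worth 10 ns, so
a counter value of 5 maps to time_zero + 50 ns.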

This patch set can be cleanly applied on the perf/core branch with:

  commit 046b243a6afb ("perf x86 kvm-stat: Support to analyze kvm MSR")

The patches have been tested on a HiSilicon D06 platform.

Changes from v4:
* Dropped the change "perf arm-spe: Remove unused enum value
  ARM_SPE_PER_CPU_MMAPS" for format compatibility (James).

Changes from v3:
* Kept backward compatibility for the TIME_CONV event (Adrian).

Changes from v2:
* Changed to use TIME_CONV event for extracting clock parameters (Al).

Changes from v1:
* Rebased patch series on the latest perf/core branch;
* Fixed the patch for dumping TSC parameters to support both the
  older and new auxtrace info format.


Leo Yan (5):
  perf arm-spe: Save clock parameters from TIME_CONV event
  perf arm-spe: Convert event kernel time to counter value
  perf arm-spe: Assign kernel time to synthesized event
  perf arm-spe: Bail out if the trace is later than perf event
  perf arm-spe: Don't wait for PERF_RECORD_EXIT event

 tools/perf/util/arm-spe.c | 73 +++++++++++++++++++++++++++++++++------
 1 file changed, 63 insertions(+), 10 deletions(-)

-- 
2.25.1



* [PATCH v5 1/5] perf arm-spe: Save clock parameters from TIME_CONV event
  2021-05-19  7:19 [PATCH v5 0/5] perf arm-spe: Enable timestamp Leo Yan
@ 2021-05-19  7:19 ` Leo Yan
  2021-05-19  7:19 ` [PATCH v5 2/5] perf arm-spe: Convert event kernel time to counter value Leo Yan
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2021-05-19  7:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

During the recording phase, the "perf record" tool synthesizes a
PERF_RECORD_TIME_CONV event for the hardware clock parameters and saves
the event into the data file.

Afterwards, when the data file is processed, the TIME_CONV event is
handled at a very early stage and stored in the session context.

This patch extracts these parameters from the session context and saves
them into the structure "spe->tc", of type perf_tsc_conversion, so that
the parameters are ready for conversion between clock counter and
timestamp.
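
The backward-compatible copy can be pictured with the minimal sketch below.
It assumes a record whose header carries its own size, so fields appended in
newer versions are read only when the recorded size covers them.  All type,
macro and function names here ("struct time_conv_record",
"record_contains()", "copy_clock_params()") are hypothetical stand-ins for
this example and do not reproduce the perf tool's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; "size" is the size of the whole record. */
struct record_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Hypothetical layout: older files stop after time_zero. */
struct time_conv_record {
	struct record_header header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	/* Fields below were appended later and may be absent. */
	uint64_t time_cycles;
	uint64_t time_mask;
	uint8_t cap_user_time_zero;
	uint8_t cap_user_time_short;
};

struct clock_params {
	uint64_t time_shift, time_mult, time_zero;
	uint64_t time_cycles, time_mask;
	uint8_t cap_user_time_zero, cap_user_time_short;
};

/* True if the recorded size extends past the start of "member". */
#define record_contains(rec, member) \
	((rec)->header.size > offsetof(struct time_conv_record, member))

static void copy_clock_params(struct clock_params *dst,
			      const struct time_conv_record *src)
{
	assert(dst && src);

	/* Extended fields default to zero when the record is an old one. */
	memset(dst, 0, sizeof(*dst));
	dst->time_shift = src->time_shift;
	dst->time_mult = src->time_mult;
	dst->time_zero = src->time_zero;

	if (record_contains(src, time_cycles)) {
		dst->time_cycles = src->time_cycles;
		dst->time_mask = src->time_mask;
		dst->cap_user_time_zero = src->cap_user_time_zero;
		dst->cap_user_time_short = src->cap_user_time_short;
	}
}
```

Checking the size rather than a version number means a reader never touches
bytes that an older writer did not produce.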

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 2539d4baec44..d2ae5a5c13ee 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -26,6 +26,7 @@
 #include "symbol.h"
 #include "thread.h"
 #include "thread-stack.h"
+#include "tsc.h"
 #include "tool.h"
 #include "util/synthetic-events.h"
 
@@ -45,6 +46,8 @@ struct arm_spe {
 	struct machine			*machine;
 	u32				pmu_type;
 
+	struct perf_tsc_conversion	tc;
+
 	u8				timeless_decoding;
 	u8				data_queued;
 
@@ -1006,6 +1009,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 {
 	struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
 	size_t min_sz = sizeof(u64) * ARM_SPE_AUXTRACE_PRIV_MAX;
+	struct perf_record_time_conv *tc = &session->time_conv;
 	struct arm_spe *spe;
 	int err;
 
@@ -1027,6 +1031,28 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
 	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
 
 	spe->timeless_decoding = arm_spe__is_timeless_decoding(spe);
+
+	/*
+	 * The synthesized event PERF_RECORD_TIME_CONV has been handled ahead
+	 * and the parameters for hardware clock are stored in the session
+	 * context.  Pass these parameters to the struct perf_tsc_conversion
+	 * in "spe->tc", which is used for later conversion between clock
+	 * counter and timestamp.
+	 *
+	 * For backward compatibility, copy the fields starting from
+	 * "time_cycles" only if they are contained in the event.
+	 */
+	spe->tc.time_shift = tc->time_shift;
+	spe->tc.time_mult = tc->time_mult;
+	spe->tc.time_zero = tc->time_zero;
+
+	if (event_contains(*tc, time_cycles)) {
+		spe->tc.time_cycles = tc->time_cycles;
+		spe->tc.time_mask = tc->time_mask;
+		spe->tc.cap_user_time_zero = tc->cap_user_time_zero;
+		spe->tc.cap_user_time_short = tc->cap_user_time_short;
+	}
+
 	spe->auxtrace.process_event = arm_spe_process_event;
 	spe->auxtrace.process_auxtrace_event = arm_spe_process_auxtrace_event;
 	spe->auxtrace.flush_events = arm_spe_flush;
-- 
2.25.1



* [PATCH v5 2/5] perf arm-spe: Convert event kernel time to counter value
  2021-05-19  7:19 [PATCH v5 0/5] perf arm-spe: Enable timestamp Leo Yan
  2021-05-19  7:19 ` [PATCH v5 1/5] perf arm-spe: Save clock parameters from TIME_CONV event Leo Yan
@ 2021-05-19  7:19 ` Leo Yan
  2021-05-19  7:19 ` [PATCH v5 3/5] perf arm-spe: Assign kernel time to synthesized event Leo Yan
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2021-05-19  7:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

When handling a perf event, the Arm SPE decoder needs to decide whether
the perf event is earlier or later than the samples from the Arm SPE
trace data; to make the comparison, both times must use the same unit.

This patch converts the event's kernel time to the arch timer's counter
value, so that it can be compared with the counter value contained in
the Arm SPE Timestamp packet.
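
The time-to-counter direction can be sketched as the inverse of the usual
perf_event_mmap_page conversion.  The names "struct clock_params" and
"ns_to_counter()" are hypothetical and used only for this example; it is a
simplified model under those assumptions, not the perf tool's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the clock parameters carried by TIME_CONV. */
struct clock_params {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/*
 * Convert kernel time in nanoseconds back to a raw counter value:
 *
 *   cyc = ((ns - time_zero) << time_shift) / time_mult
 *
 * The division is done on quotient and remainder separately so the
 * left shift is less likely to overflow 64 bits.
 */
static uint64_t ns_to_counter(uint64_t ns, const struct clock_params *p)
{
	uint64_t t, quot, rem;

	assert(p->time_mult != 0);

	t = ns - p->time_zero;
	quot = t / p->time_mult;
	rem = t % p->time_mult;

	return (quot << p->time_shift) +
	       (rem << p->time_shift) / p->time_mult;
}
```

Once both sides are expressed as counter values, comparing a perf event with
a trace record reduces to a plain integer comparison.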

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index d2ae5a5c13ee..ff8b52e6d475 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -669,7 +669,7 @@ static int arm_spe_process_event(struct perf_session *session,
 	}
 
 	if (sample->time && (sample->time != (u64) -1))
-		timestamp = sample->time;
+		timestamp = perf_time_to_tsc(sample->time, &spe->tc);
 	else
 		timestamp = 0;
 
-- 
2.25.1



* [PATCH v5 3/5] perf arm-spe: Assign kernel time to synthesized event
  2021-05-19  7:19 [PATCH v5 0/5] perf arm-spe: Enable timestamp Leo Yan
  2021-05-19  7:19 ` [PATCH v5 1/5] perf arm-spe: Save clock parameters from TIME_CONV event Leo Yan
  2021-05-19  7:19 ` [PATCH v5 2/5] perf arm-spe: Convert event kernel time to counter value Leo Yan
@ 2021-05-19  7:19 ` Leo Yan
  2021-05-19  7:19 ` [PATCH v5 4/5] perf arm-spe: Bail out if the trace is later than perf event Leo Yan
  2021-05-19  7:19 ` [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event Leo Yan
  4 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2021-05-19  7:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

The current code assigns the arch timer counter to the samples
synthesized from Arm SPE trace, so the samples contain only the raw
counter value rather than the kernel time.

To fix the issue, this patch converts the timer counter to kernel time
and assigns it to the sample timestamp.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index ff8b52e6d475..da379328442c 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -234,7 +234,7 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
 	struct arm_spe_record *record = &speq->decoder->record;
 
 	if (!spe->timeless_decoding)
-		sample->time = speq->timestamp;
+		sample->time = tsc_to_perf_time(record->timestamp, &spe->tc);
 
 	sample->ip = record->from_ip;
 	sample->cpumode = arm_spe_cpumode(spe, sample->ip);
-- 
2.25.1



* [PATCH v5 4/5] perf arm-spe: Bail out if the trace is later than perf event
  2021-05-19  7:19 [PATCH v5 0/5] perf arm-spe: Enable timestamp Leo Yan
                   ` (2 preceding siblings ...)
  2021-05-19  7:19 ` [PATCH v5 3/5] perf arm-spe: Assign kernel time to synthesized event Leo Yan
@ 2021-05-19  7:19 ` Leo Yan
  2021-05-19  7:19 ` [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event Leo Yan
  4 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2021-05-19  7:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

A record in the Arm SPE trace can be later than a perf event, and vice
versa.  The perf events and the synthesized Arm SPE events therefore
need to be correlated so that they are processed in correct time order.

To achieve the time ordering, this patch reverses the flow: it first
calls arm_spe_sample() and then calls arm_spe_decode().  It compares the
timestamps and, when it detects that the perf event comes earlier than
the Arm SPE trace data, bails out of the decoding loop; the last record
is pushed into the auxtrace heap and its sample generation is deferred.
To track the timestamp, the queue timestamp is updated every time for
the latest record.
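
The reversed flow described above can be modelled with the small sketch
below: emit the sample for the record left over from the previous pass,
decode the next record, and bail out as soon as the record is not earlier
than the pending perf event.  The queue type and "run_decoder()" are
hypothetical simplifications (records are reduced to bare timestamps) and do
not mirror the real arm-spe.c structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue: each trace record is just a counter timestamp. */
struct trace_queue {
	const uint64_t *records;	/* timestamps in trace order */
	size_t nr_records;
	size_t next;			/* index of the next record to decode */
	int have_pending;		/* decoded record without a sample yet */
	uint64_t timestamp;		/* latest record timestamp seen */
	size_t nr_samples;		/* samples synthesized so far */
};

/*
 * Returns 1 when it bails out with one record deferred, 0 when the
 * trace data is exhausted.  On bail-out, *event_ts is updated to the
 * queue timestamp so the caller can reorder the queue.
 */
static int run_decoder(struct trace_queue *q, uint64_t *event_ts)
{
	assert(q && event_ts);

	for (;;) {
		/* Step 1: synthesize the sample for the leftover record. */
		if (q->have_pending) {
			q->nr_samples++;
			q->have_pending = 0;
		}

		/* Step 2: decode the next record, if any. */
		if (q->next == q->nr_records)
			return 0;
		if (q->records[q->next] > q->timestamp)
			q->timestamp = q->records[q->next];
		q->next++;
		q->have_pending = 1;

		/* Step 3: let the earlier perf event be processed first. */
		if (q->timestamp >= *event_ts) {
			*event_ts = q->timestamp;
			return 1;
		}
	}
}
```

For records stamped 10, 20, 30 and 40 and a perf event at 25, the first call
emits two samples and defers the record stamped 30; a later call with a
larger event timestamp emits the remaining two.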

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index da379328442c..5c5b438584c4 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -434,12 +434,36 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 {
 	struct arm_spe *spe = speq->spe;
+	struct arm_spe_record *record;
 	int ret;
 
 	if (!spe->kernel_start)
 		spe->kernel_start = machine__kernel_start(spe->machine);
 
 	while (1) {
+		/*
+		 * The usual logic is to decode the packets first and then
+		 * synthesize a sample from the record; but here the flow is
+		 * reversed: it calls arm_spe_sample() for synthesizing samples
+		 * prior to arm_spe_decode().
+		 *
+		 * Two reasons for this code logic:
+		 * 1. Firstly, when setting up the queue, arm_spe__setup_queue()
+		 * has decoded trace data and generated a record, but the record
+		 * is left without a sample until execution reaches here, so it's
+		 * correct to synthesize a sample for the leftover record.
+		 * 2. After decoding trace data, it needs to compare the record
+		 * timestamp with the coming perf event; if the record timestamp
+		 * is later than the perf event, it needs to bail out and push the
+		 * record into the auxtrace heap, so that synthesizing its sample
+		 * is deferred until execution reaches here the next time; this
+		 * correlates samples between Arm SPE trace data and other
+		 * perf events with correct time ordering.
+		 */
+		ret = arm_spe_sample(speq);
+		if (ret)
+			return ret;
+
 		ret = arm_spe_decode(speq->decoder);
 		if (!ret) {
 			pr_debug("No data or all data has been processed.\n");
@@ -453,10 +477,17 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 		if (ret < 0)
 			continue;
 
-		ret = arm_spe_sample(speq);
-		if (ret)
-			return ret;
+		record = &speq->decoder->record;
 
+		/* Update timestamp for the last record */
+		if (record->timestamp > speq->timestamp)
+			speq->timestamp = record->timestamp;
+
+		/*
+		 * If the timestamp of the queue is later than the timestamp of
+		 * the coming perf event, bail out to allow the perf event to
+		 * be processed first.
+		 */
 		if (!spe->timeless_decoding && speq->timestamp >= *timestamp) {
 			*timestamp = speq->timestamp;
 			return 0;
-- 
2.25.1



* [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
  2021-05-19  7:19 [PATCH v5 0/5] perf arm-spe: Enable timestamp Leo Yan
                   ` (3 preceding siblings ...)
  2021-05-19  7:19 ` [PATCH v5 4/5] perf arm-spe: Bail out if the trace is later than perf event Leo Yan
@ 2021-05-19  7:19 ` Leo Yan
  2021-06-25 13:25   ` James Clark
  4 siblings, 1 reply; 10+ messages in thread
From: Leo Yan @ 2021-05-19  7:19 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, James Clark, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel
  Cc: Leo Yan

When decoding Arm SPE trace, the decoder waits for the PERF_RECORD_EXIT
event (the last perf event) before processing trace data.  This is
needless and might even cause a logic error, e.g. it might fail to
correlate perf events with Arm SPE events correctly.

So this patch removes the condition check for the PERF_RECORD_EXIT event.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 5c5b438584c4..58b7069c5a5f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -717,11 +717,7 @@ static int arm_spe_process_event(struct perf_session *session,
 					sample->time);
 		}
 	} else if (timestamp) {
-		if (event->header.type == PERF_RECORD_EXIT) {
-			err = arm_spe_process_queues(spe, timestamp);
-			if (err)
-				return err;
-		}
+		err = arm_spe_process_queues(spe, timestamp);
 	}
 
 	return err;
-- 
2.25.1



* Re: [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
  2021-05-19  7:19 ` [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event Leo Yan
@ 2021-06-25 13:25   ` James Clark
  2021-06-28 12:12     ` Leo Yan
  0 siblings, 1 reply; 10+ messages in thread
From: James Clark @ 2021-06-25 13:25 UTC (permalink / raw)
  To: Leo Yan, Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel



On 19/05/2021 08:19, Leo Yan wrote:
> When decoding Arm SPE trace, the decoder waits for the PERF_RECORD_EXIT
> event (the last perf event) before processing trace data.  This is
> needless and might even cause a logic error, e.g. it might fail to
> correlate perf events with Arm SPE events correctly.
> 
> So this patch removes the condition check for the PERF_RECORD_EXIT event.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> ---
>  tools/perf/util/arm-spe.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 5c5b438584c4..58b7069c5a5f 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -717,11 +717,7 @@ static int arm_spe_process_event(struct perf_session *session,
>  					sample->time);
>  		}
>  	} else if (timestamp) {
> -		if (event->header.type == PERF_RECORD_EXIT) {
> -			err = arm_spe_process_queues(spe, timestamp);
> -			if (err)
> -				return err;
> -		}
> +		err = arm_spe_process_queues(spe, timestamp);
>  	}
>  
>  	return err;
> 

For the whole set:
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>

I see a big improvement in decoding involving multiple processes because the timestamps are now
correlated with the comm and mmap events.

For example perf-exec samples are visible right before the exec is done, and on an
application that forks, samples are visible from all processes. For example:

   perf record -e arm_spe// -- bash -c "stress -c 1"
   perf script

   perf-exec  4502 [003] 259755.050409:          1    l1d-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
   perf-exec  4502 [003] 259755.050409:          1    tlb-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
   perf-exec  4502 [003] 259755.050409:          1        memory:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
   perf-exec  4502 [003] 259755.050411:          1    tlb-access:  ffff800010120fb8 __rcu_read_lock+0x0 ([kernel.kallsyms])
   bash  4502 [003] 259755.050411:          1   branch-miss:  ffff8000105b2a40 memcpy+0x80 ([kernel.kallsyms])
   bash  4502 [003] 259755.050411:          1    tlb-access:                 0 [unknown] ([unknown])
   ...
   stress  4502 [003] 259755.051468:          1    l1d-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
   stress  4502 [003] 259755.051468:          1    tlb-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
   stress  4502 [003] 259755.051468:          1        memory:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])

Previously samples were only attributed to 'stress', which was obviously wrong.

James



* Re: [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
  2021-06-25 13:25   ` James Clark
@ 2021-06-28 12:12     ` Leo Yan
  2021-07-01 17:03       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Leo Yan @ 2021-06-28 12:12 UTC (permalink / raw)
  To: James Clark
  Cc: Arnaldo Carvalho de Melo, John Garry, Will Deacon,
	Mathieu Poirier, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dave Martin,
	Al Grant, linux-arm-kernel, linux-perf-users, linux-kernel

On Fri, Jun 25, 2021 at 02:25:15PM +0100, James Clark wrote:
> 
> 
> On 19/05/2021 08:19, Leo Yan wrote:
> > When decoding Arm SPE trace, the decoder waits for the PERF_RECORD_EXIT
> > event (the last perf event) before processing trace data.  This is
> > needless and might even cause a logic error, e.g. it might fail to
> > correlate perf events with Arm SPE events correctly.
> > 
> > So this patch removes the condition check for the PERF_RECORD_EXIT event.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > ---
> >  tools/perf/util/arm-spe.c | 6 +-----
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> > 
> > diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> > index 5c5b438584c4..58b7069c5a5f 100644
> > --- a/tools/perf/util/arm-spe.c
> > +++ b/tools/perf/util/arm-spe.c
> > @@ -717,11 +717,7 @@ static int arm_spe_process_event(struct perf_session *session,
> >  					sample->time);
> >  		}
> >  	} else if (timestamp) {
> > -		if (event->header.type == PERF_RECORD_EXIT) {
> > -			err = arm_spe_process_queues(spe, timestamp);
> > -			if (err)
> > -				return err;
> > -		}
> > +		err = arm_spe_process_queues(spe, timestamp);
> >  	}
> >  
> >  	return err;
> > 
> 
> For the whole set:
> Reviewed-by: James Clark <james.clark@arm.com>
> Tested-by: James Clark <james.clark@arm.com>

> I see a big improvement in decoding involving multiple processes because the timestamps are now
> correlated with the comm and mmap events.
> 
> For example perf-exec samples are visible right before the exec is done, and on an
> application that forks, samples are visible from all processes. For example:
> 
>    perf record -e arm_spe// -- bash -c "stress -c 1"
>    perf script
> 
>    perf-exec  4502 [003] 259755.050409:          1    l1d-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
>    perf-exec  4502 [003] 259755.050409:          1    tlb-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
>    perf-exec  4502 [003] 259755.050409:          1        memory:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
>    perf-exec  4502 [003] 259755.050411:          1    tlb-access:  ffff800010120fb8 __rcu_read_lock+0x0 ([kernel.kallsyms])
>    bash  4502 [003] 259755.050411:          1   branch-miss:  ffff8000105b2a40 memcpy+0x80 ([kernel.kallsyms])
>    bash  4502 [003] 259755.050411:          1    tlb-access:                 0 [unknown] ([unknown])
>    ...
>    stress  4502 [003] 259755.051468:          1    l1d-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
>    stress  4502 [003] 259755.051468:          1    tlb-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
>    stress  4502 [003] 259755.051468:          1        memory:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
> 
> Previously samples were only attributed to 'stress', which was obviously wrong.

Thanks a lot for the review and testing, James!

Hi Arnaldo, I confirmed this patch set can be cleanly applied on
the latest acme/perf/core branch, so could you pick up this patch
set?

Thanks,
Leo


* Re: [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
  2021-06-28 12:12     ` Leo Yan
@ 2021-07-01 17:03       ` Arnaldo Carvalho de Melo
  2021-07-02  1:31         ` Leo Yan
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-07-01 17:03 UTC (permalink / raw)
  To: Leo Yan
  Cc: James Clark, John Garry, Will Deacon, Mathieu Poirier,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Dave Martin, Al Grant, linux-arm-kernel,
	linux-perf-users, linux-kernel

Em Mon, Jun 28, 2021 at 08:12:17PM +0800, Leo Yan escreveu:
> On Fri, Jun 25, 2021 at 02:25:15PM +0100, James Clark wrote:
> > For the whole set:
> > Reviewed-by: James Clark <james.clark@arm.com>
> > Tested-by: James Clark <james.clark@arm.com>
 
> > I see a big improvement in decoding involving multiple processes because the timestamps are now
> > correlated with the comm and mmap events.
> > 
> > For example perf-exec samples are visible right before the exec is done, and on an
> > application that forks, samples are visible from all processes. For example:

> >    perf record -e arm_spe// -- bash -c "stress -c 1"
> >    perf script

> >    perf-exec  4502 [003] 259755.050409:          1    l1d-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
> >    perf-exec  4502 [003] 259755.050409:          1    tlb-access:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
> >    perf-exec  4502 [003] 259755.050409:          1        memory:  ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
> >    perf-exec  4502 [003] 259755.050411:          1    tlb-access:  ffff800010120fb8 __rcu_read_lock+0x0 ([kernel.kallsyms])
> >    bash  4502 [003] 259755.050411:          1   branch-miss:  ffff8000105b2a40 memcpy+0x80 ([kernel.kallsyms])
> >    bash  4502 [003] 259755.050411:          1    tlb-access:                 0 [unknown] ([unknown])
> >    ...
> >    stress  4502 [003] 259755.051468:          1    l1d-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
> >    stress  4502 [003] 259755.051468:          1    tlb-access:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
> >    stress  4502 [003] 259755.051468:          1        memory:  ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])

> > Previously samples were only attributed to 'stress', which was obviously wrong.
> 
> Thanks a lot for the review and testing, James!
> 
> Hi Arnaldo, I confirmed this patch set can be cleanly applied on
> the latest acme/perf/core branch, so could you pick up this patch
> set?

Applied, thanks, please let me know if there is still something
outstanding,

- Arnaldo


* Re: [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
  2021-07-01 17:03       ` Arnaldo Carvalho de Melo
@ 2021-07-02  1:31         ` Leo Yan
  0 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2021-07-02  1:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: James Clark, John Garry, Will Deacon, Mathieu Poirier,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Dave Martin, Al Grant, linux-arm-kernel,
	linux-perf-users, linux-kernel

On Thu, Jul 01, 2021 at 02:03:16PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > Hi Arnaldo, I confirmed this patch set can be cleanly applied on
> > the latest acme/perf/core branch, so could you pick up this patch
> > set?
> 
> Applied, thanks, please let me know if there is still something
> outstanding,

Thanks, Arnaldo!  I confirmed you didn't miss anything.

