linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC
@ 2021-01-19 14:46 James Clark
  2021-01-19 14:46 ` [PATCH 2/8] perf arm-spe: Store memory address in packet James Clark
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

Enable the sample type PERF_SAMPLE_DATA_SRC for Arm SPE in the perf
data. When the tracing data is output, this tells tools that the memory
events contain a data source.
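
For illustration only (not part of the patch): a minimal sketch of how a
consumer might check for the new bit. It assumes the uapi header
<linux/perf_event.h> is available; the demo_* helpers are hypothetical
names, and the PERF_SAMPLE_MASK step from the real code is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/perf_event.h>

/* Hypothetical helper: ORs in the same bits the patch adds to sample_type. */
static uint64_t demo_spe_sample_type(uint64_t base)
{
	return base | PERF_SAMPLE_IP | PERF_SAMPLE_TID |
	       PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
}

/* Hypothetical helper: true when samples of this type carry data_src. */
static bool demo_has_data_src(uint64_t sample_type)
{
	return (sample_type & PERF_SAMPLE_DATA_SRC) != 0;
}
```

A tool can use a check like demo_has_data_src() to decide whether the
data_src field of a sample is meaningful before decoding it.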

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 8901a1656a41..b134516e890b 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -803,7 +803,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
 	attr.type = PERF_TYPE_HARDWARE;
 	attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
 	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
-		PERF_SAMPLE_PERIOD;
+			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
 	if (spe->timeless_decoding)
 		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
 	else
-- 
2.28.0



* [PATCH 2/8] perf arm-spe: Store memory address in packet
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
@ 2021-01-19 14:46 ` James Clark
  2021-01-19 14:46 ` [PATCH 3/8] perf arm-spe: Store operation type " James Clark
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

Store the virtual and physical memory addresses in the packet; they
will be used for memory samples.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 4 ++++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 90d575cee1b9..7aac3048b090 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -172,6 +172,10 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 				decoder->record.from_ip = ip;
 			else if (idx == SPE_ADDR_PKT_HDR_INDEX_BRANCH)
 				decoder->record.to_ip = ip;
+			else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT)
+				decoder->record.virt_addr = ip;
+			else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS)
+				decoder->record.phys_addr = ip;
 			break;
 		case ARM_SPE_COUNTER:
 			break;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index 24727b8ca7ff..7b845001afe7 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -30,6 +30,8 @@ struct arm_spe_record {
 	u64 from_ip;
 	u64 to_ip;
 	u64 timestamp;
+	u64 virt_addr;
+	u64 phys_addr;
 };
 
 struct arm_spe_insn;
-- 
2.28.0



* [PATCH 3/8] perf arm-spe: Store operation type in packet
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
  2021-01-19 14:46 ` [PATCH 2/8] perf arm-spe: Store memory address in packet James Clark
@ 2021-01-19 14:46 ` James Clark
  2021-01-19 14:46 ` [PATCH 4/8] perf arm-spe: Fill address info for samples James Clark
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

Store the operation type in the packet structure.
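
For illustration only (not part of the patch): a standalone sketch of
the load/store decision made in the diff below. The demo_* names and
the value of DEMO_OP_CLASS_LD_ST are placeholders, not the real decoder
definitions; only the "payload bit 0 set means store" rule is taken
from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for enum arm_spe_op_type; DEMO_OP_NONE is an added placeholder. */
enum demo_op_type {
	DEMO_OP_NONE = 0,
	DEMO_OP_LD   = 1 << 0,
	DEMO_OP_ST   = 1 << 1,
};

/* Placeholder for SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC (value assumed). */
#define DEMO_OP_CLASS_LD_ST	0x1

/* Bit 0 of the payload selects store (1) or load (0) for this class. */
static uint32_t demo_decode_op(unsigned int op_class, uint64_t payload)
{
	if (op_class != DEMO_OP_CLASS_LD_ST)
		return DEMO_OP_NONE;

	return (payload & 0x1) ? DEMO_OP_ST : DEMO_OP_LD;
}
```

Packets of any other class leave the operation type unset, which
matches the zero-initialized record in the decoder.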

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 6 ++++++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 7aac3048b090..32fe41835fa6 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -182,6 +182,12 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 		case ARM_SPE_CONTEXT:
 			break;
 		case ARM_SPE_OP_TYPE:
+			if (idx == SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC) {
+				if (payload & 0x1)
+					decoder->record.op = ARM_SPE_ST;
+				else
+					decoder->record.op = ARM_SPE_LD;
+			}
 			break;
 		case ARM_SPE_EVENTS:
 			if (payload & BIT(EV_L1D_REFILL))
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index 7b845001afe7..59bdb7309674 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -24,9 +24,15 @@ enum arm_spe_sample_type {
 	ARM_SPE_REMOTE_ACCESS	= 1 << 7,
 };
 
+enum arm_spe_op_type {
+	ARM_SPE_LD		= 1 << 0,
+	ARM_SPE_ST		= 1 << 1,
+};
+
 struct arm_spe_record {
 	enum arm_spe_sample_type type;
 	int err;
+	u32 op;
 	u64 from_ip;
 	u64 to_ip;
 	u64 timestamp;
-- 
2.28.0



* [PATCH 4/8] perf arm-spe: Fill address info for samples
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
  2021-01-19 14:46 ` [PATCH 2/8] perf arm-spe: Store memory address in packet James Clark
  2021-01-19 14:46 ` [PATCH 3/8] perf arm-spe: Store operation type " James Clark
@ 2021-01-19 14:46 ` James Clark
  2021-01-19 14:46 ` [PATCH 5/8] perf arm-spe: Synthesize memory event James Clark
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

To handle memory and branch samples properly, this patch splits sample
generation into two functions: arm_spe__synth_mem_sample() synthesizes
memory and TLB samples, and arm_spe__synth_branch_sample() synthesizes
branch samples.

The Arm SPE backend decoder already passes the virtual and physical
addresses through packets; arm_spe__synth_mem_sample() stores this
address info in the synthesized samples.
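
For illustration only (not part of the patch): a standalone sketch of
how the two sample kinds fill their address fields. The demo_* structs
are simplified stand-ins for struct arm_spe_record and struct
perf_sample and carry only the fields discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct arm_spe_record. */
struct demo_record {
	uint64_t to_ip;
	uint64_t virt_addr;
	uint64_t phys_addr;
};

/* Simplified stand-in for struct perf_sample. */
struct demo_sample {
	uint64_t addr;
	uint64_t phys_addr;
};

/* Memory and TLB samples report the data address (virtual and physical). */
static struct demo_sample demo_mem_sample(const struct demo_record *rec)
{
	struct demo_sample s = { 0 };

	s.addr = rec->virt_addr;
	s.phys_addr = rec->phys_addr;
	return s;
}

/* Branch samples report the branch target; phys_addr stays zero. */
static struct demo_sample demo_branch_sample(const struct demo_record *rec)
{
	struct demo_sample s = { 0 };

	s.addr = rec->to_ip;
	return s;
}
```

Keeping the two paths separate avoids reporting a branch target as a
data address, which is what the shared helper did before this change.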

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe.c | 52 +++++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index b134516e890b..578725344603 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -235,7 +235,6 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
 	sample->cpumode = arm_spe_cpumode(spe, sample->ip);
 	sample->pid = speq->pid;
 	sample->tid = speq->tid;
-	sample->addr = record->to_ip;
 	sample->period = 1;
 	sample->cpu = speq->cpu;
 
@@ -259,18 +258,37 @@ arm_spe_deliver_synth_event(struct arm_spe *spe,
 	return ret;
 }
 
-static int
-arm_spe_synth_spe_events_sample(struct arm_spe_queue *speq,
-				u64 spe_events_id)
+static int arm_spe__synth_mem_sample(struct arm_spe_queue *speq,
+				     u64 spe_events_id)
 {
 	struct arm_spe *spe = speq->spe;
+	struct arm_spe_record *record = &speq->decoder->record;
+	union perf_event *event = speq->event_buf;
+	struct perf_sample sample = { 0 };
+
+	arm_spe_prep_sample(spe, speq, event, &sample);
+
+	sample.id = spe_events_id;
+	sample.stream_id = spe_events_id;
+	sample.addr = record->virt_addr;
+	sample.phys_addr = record->phys_addr;
+
+	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
+}
+
+static int arm_spe__synth_branch_sample(struct arm_spe_queue *speq,
+					u64 spe_events_id)
+{
+	struct arm_spe *spe = speq->spe;
+	struct arm_spe_record *record = &speq->decoder->record;
 	union perf_event *event = speq->event_buf;
-	struct perf_sample sample = { .ip = 0, };
+	struct perf_sample sample = { 0 };
 
 	arm_spe_prep_sample(spe, speq, event, &sample);
 
 	sample.id = spe_events_id;
 	sample.stream_id = spe_events_id;
+	sample.addr = record->to_ip;
 
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
@@ -283,15 +301,13 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
 	if (spe->sample_flc) {
 		if (record->type & ARM_SPE_L1D_MISS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->l1d_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_L1D_ACCESS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->l1d_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id);
 			if (err)
 				return err;
 		}
@@ -299,15 +315,13 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
 	if (spe->sample_llc) {
 		if (record->type & ARM_SPE_LLC_MISS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->llc_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->llc_miss_id);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_LLC_ACCESS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->llc_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->llc_access_id);
 			if (err)
 				return err;
 		}
@@ -315,31 +329,27 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
 	if (spe->sample_tlb) {
 		if (record->type & ARM_SPE_TLB_MISS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->tlb_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->tlb_miss_id);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_TLB_ACCESS) {
-			err = arm_spe_synth_spe_events_sample(
-					speq, spe->tlb_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->tlb_access_id);
 			if (err)
 				return err;
 		}
 	}
 
 	if (spe->sample_branch && (record->type & ARM_SPE_BRANCH_MISS)) {
-		err = arm_spe_synth_spe_events_sample(speq,
-						      spe->branch_miss_id);
+		err = arm_spe__synth_branch_sample(speq, spe->branch_miss_id);
 		if (err)
 			return err;
 	}
 
 	if (spe->sample_remote_access &&
 	    (record->type & ARM_SPE_REMOTE_ACCESS)) {
-		err = arm_spe_synth_spe_events_sample(speq,
-						      spe->remote_access_id);
+		err = arm_spe__synth_mem_sample(speq, spe->remote_access_id);
 		if (err)
 			return err;
 	}
-- 
2.28.0



* [PATCH 5/8] perf arm-spe: Synthesize memory event
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
                   ` (2 preceding siblings ...)
  2021-01-19 14:46 ` [PATCH 4/8] perf arm-spe: Fill address info for samples James Clark
@ 2021-01-19 14:46 ` James Clark
  2021-01-19 14:46 ` [PATCH 6/8] perf arm-spe: Set sample's data source field James Clark
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

The memory event delivers two benefits:

- It gives a global view of memory accesses. Organizing events in a
  scattered way (e.g. a separate event for the L1 cache, the last level
  cache, etc.) can only display one event per memory type, whereas the
  memory event includes all memory accesses, so it can display data
  accesses across memory levels in the same view;

- Sample generation can introduce a big overhead, so Perf reporting can
  take a long time. The itrace option '--itrace=M' filters out the
  other events and outputs only memory events, which can significantly
  reduce the overhead caused by generating samples.

Enable the memory event for Arm SPE.
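
For illustration only (not part of the patch): a standalone sketch of
the mask test that decides whether a record counts as a memory event.
The DEMO_* flags stand in for enum arm_spe_sample_type; apart from
REMOTE_ACCESS (1 << 7, visible in the decoder header), the bit
positions are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for enum arm_spe_sample_type (bit positions mostly assumed). */
enum demo_sample_type {
	DEMO_L1D_ACCESS    = 1 << 0,
	DEMO_L1D_MISS      = 1 << 1,
	DEMO_LLC_ACCESS    = 1 << 2,
	DEMO_LLC_MISS      = 1 << 3,
	DEMO_TLB_ACCESS    = 1 << 4,
	DEMO_TLB_MISS      = 1 << 5,
	DEMO_BRANCH_MISS   = 1 << 6,
	DEMO_REMOTE_ACCESS = 1 << 7,
};

/* Same set of flags as SPE_MEM_TYPE in the patch: caches and remote access. */
#define DEMO_MEM_TYPE	(DEMO_L1D_ACCESS | DEMO_L1D_MISS | \
			 DEMO_LLC_ACCESS | DEMO_LLC_MISS | \
			 DEMO_REMOTE_ACCESS)

/* A record is a memory event if any cache or remote-access flag is set. */
static bool demo_is_memory_event(unsigned int type)
{
	return (type & DEMO_MEM_TYPE) != 0;
}
```

Note that the TLB and branch flags are not in the mask, so a record that
only carries those flags does not produce a 'memory' sample.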

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 578725344603..5550906486d8 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -53,6 +53,7 @@ struct arm_spe {
 	u8				sample_tlb;
 	u8				sample_branch;
 	u8				sample_remote_access;
+	u8				sample_memory;
 
 	u64				l1d_miss_id;
 	u64				l1d_access_id;
@@ -62,6 +63,7 @@ struct arm_spe {
 	u64				tlb_access_id;
 	u64				branch_miss_id;
 	u64				remote_access_id;
+	u64				memory_id;
 
 	u64				kernel_start;
 
@@ -293,6 +295,18 @@ static int arm_spe__synth_branch_sample(struct arm_spe_queue *speq,
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
 
+#define SPE_MEM_TYPE	(ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS | \
+			 ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS | \
+			 ARM_SPE_REMOTE_ACCESS)
+
+static bool arm_spe__is_memory_event(enum arm_spe_sample_type type)
+{
+	if (type & SPE_MEM_TYPE)
+		return true;
+
+	return false;
+}
+
 static int arm_spe_sample(struct arm_spe_queue *speq)
 {
 	const struct arm_spe_record *record = &speq->decoder->record;
@@ -354,6 +368,12 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 			return err;
 	}
 
+	if (spe->sample_memory && arm_spe__is_memory_event(record->type)) {
+		err = arm_spe__synth_mem_sample(speq, spe->memory_id);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
@@ -917,6 +937,16 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
 		id += 1;
 	}
 
+	if (spe->synth_opts.mem) {
+		spe->sample_memory = true;
+
+		err = arm_spe_synth_event(session, &attr, id);
+		if (err)
+			return err;
+		spe->memory_id = id;
+		arm_spe_set_event_name(evlist, id, "memory");
+	}
+
 	return 0;
 }
 
-- 
2.28.0



* [PATCH 6/8] perf arm-spe: Set sample's data source field
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
                   ` (3 preceding siblings ...)
  2021-01-19 14:46 ` [PATCH 5/8] perf arm-spe: Synthesize memory event James Clark
@ 2021-01-19 14:46 ` James Clark
  2021-01-19 14:46 ` [PATCH 7/8] perf arm-spe: Save context ID in record James Clark
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

The sample structure contains the field 'data_src', which describes the
attributes of a data operation, e.g. whether the operation is a load or
a store, the cache level, and whether it is a snoop or a remote access.
In the end, 'data_src' is parsed by the perf mem/c2c tools to display
human readable strings.

Fill the 'data_src' field in the synthesized samples based on the
different types.  Currently the perf tool can display statistics for
L1/L2/L3 caches but it doesn't support the 'last level cache'.  To fit
the current implementation, the 'data_src' field uses L3 cache for the
last level cache.
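
For illustration only (not part of the patch): a minimal sketch of how a
last level cache access can be encoded as L3 in 'data_src'. It assumes
the uapi header <linux/perf_event.h> is available; the demo_* helpers
are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/perf_event.h>

/* Hypothetical helper: encode a last level cache access as an L3 access. */
static uint64_t demo_llc_data_src(int is_store, int is_miss)
{
	union perf_mem_data_src d = { 0 };

	d.mem_op = is_store ? PERF_MEM_OP_STORE : PERF_MEM_OP_LOAD;
	d.mem_lvl = PERF_MEM_LVL_L3;
	d.mem_lvl |= is_miss ? PERF_MEM_LVL_MISS : PERF_MEM_LVL_HIT;

	return d.val;
}

/* Hypothetical helpers: read individual fields back from the packed value. */
static uint64_t demo_mem_op(uint64_t val)
{
	union perf_mem_data_src d = { .val = val };

	return d.mem_op;
}

static uint64_t demo_mem_lvl(uint64_t val)
{
	union perf_mem_data_src d = { .val = val };

	return d.mem_lvl;
}
```

The packed u64 is what ends up in sample.data_src; tools such as perf
mem unpack it the same way to print strings like "L1 miss".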

Before this commit, perf mem report looks like this:

    # Samples: 75K of event 'l1d-miss'
    # Total weight : 75951
    # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
    #
    # Overhead       Samples  Local Weight  Memory access             Symbol                  Shared Object     Data Symbol             Data Object       Snoop         TLB access
    # ........  ............  ............  ........................  ......................  ................  ......................  ................  ............  ...................
    #
        81.56%         61945  0             N/A                       [.] 0x00000000000009d8  serial_c          [.] 0000000000000000    [unknown]         N/A           N/A
        18.44%         14003  0             N/A                       [.] 0x0000000000000828  serial_c          [.] 0000000000000000    [unknown]         N/A           N/A

Now on a system with Arm SPE, addresses and access types are displayed:

    # Samples: 75K of event 'l1d-miss'
    # Total weight : 75951
    # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
    #
    # Overhead       Samples  Local Weight  Memory access             Symbol                  Shared Object     Data Symbol             Data Object  Snoop         TLB access
    # ........  ............  ............  ........................  ......................  ................  ......................  ...........  ............  ......................
    #
         0.43%           324  0             L1 miss                   [.] 0x00000000000009d8  serial_c          [.] 0x0000ffff80794e00  anon         N/A           Walker hit
         0.42%           322  0             L1 miss                   [.] 0x00000000000009d8  serial_c          [.] 0x0000ffff80794580  anon         N/A           Walker hit

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe.c | 69 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 60 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 5550906486d8..27a0b9dfe22d 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -261,7 +261,7 @@ arm_spe_deliver_synth_event(struct arm_spe *spe,
 }
 
 static int arm_spe__synth_mem_sample(struct arm_spe_queue *speq,
-				     u64 spe_events_id)
+				     u64 spe_events_id, u64 data_src)
 {
 	struct arm_spe *spe = speq->spe;
 	struct arm_spe_record *record = &speq->decoder->record;
@@ -274,6 +274,7 @@ static int arm_spe__synth_mem_sample(struct arm_spe_queue *speq,
 	sample.stream_id = spe_events_id;
 	sample.addr = record->virt_addr;
 	sample.phys_addr = record->phys_addr;
+	sample.data_src = data_src;
 
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
@@ -307,21 +308,66 @@ static bool arm_spe__is_memory_event(enum arm_spe_sample_type type)
 	return false;
 }
 
+static u64 arm_spe__synth_data_source(const struct arm_spe_record *record)
+{
+	union perf_mem_data_src	data_src = { 0 };
+
+	if (record->op == ARM_SPE_LD)
+		data_src.mem_op = PERF_MEM_OP_LOAD;
+	else
+		data_src.mem_op = PERF_MEM_OP_STORE;
+
+	if (record->type & (ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS)) {
+		data_src.mem_lvl = PERF_MEM_LVL_L3;
+
+		if (record->type & ARM_SPE_LLC_MISS)
+			data_src.mem_lvl |= PERF_MEM_LVL_MISS;
+		else
+			data_src.mem_lvl |= PERF_MEM_LVL_HIT;
+	} else if (record->type & (ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS)) {
+		data_src.mem_lvl = PERF_MEM_LVL_L1;
+
+		if (record->type & ARM_SPE_L1D_MISS)
+			data_src.mem_lvl |= PERF_MEM_LVL_MISS;
+		else
+			data_src.mem_lvl |= PERF_MEM_LVL_HIT;
+	}
+
+	if (record->type & ARM_SPE_REMOTE_ACCESS)
+		data_src.mem_lvl |= PERF_MEM_LVL_REM_CCE1;
+
+	if (record->type & (ARM_SPE_TLB_ACCESS | ARM_SPE_TLB_MISS)) {
+		data_src.mem_dtlb = PERF_MEM_TLB_WK;
+
+		if (record->type & ARM_SPE_TLB_MISS)
+			data_src.mem_dtlb |= PERF_MEM_TLB_MISS;
+		else
+			data_src.mem_dtlb |= PERF_MEM_TLB_HIT;
+	}
+
+	return data_src.val;
+}
+
 static int arm_spe_sample(struct arm_spe_queue *speq)
 {
 	const struct arm_spe_record *record = &speq->decoder->record;
 	struct arm_spe *spe = speq->spe;
+	u64 data_src;
 	int err;
 
+	data_src = arm_spe__synth_data_source(record);
+
 	if (spe->sample_flc) {
 		if (record->type & ARM_SPE_L1D_MISS) {
-			err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id,
+							data_src);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_L1D_ACCESS) {
-			err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id,
+							data_src);
 			if (err)
 				return err;
 		}
@@ -329,13 +375,15 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
 	if (spe->sample_llc) {
 		if (record->type & ARM_SPE_LLC_MISS) {
-			err = arm_spe__synth_mem_sample(speq, spe->llc_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->llc_miss_id,
+							data_src);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_LLC_ACCESS) {
-			err = arm_spe__synth_mem_sample(speq, spe->llc_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->llc_access_id,
+							data_src);
 			if (err)
 				return err;
 		}
@@ -343,13 +391,15 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
 	if (spe->sample_tlb) {
 		if (record->type & ARM_SPE_TLB_MISS) {
-			err = arm_spe__synth_mem_sample(speq, spe->tlb_miss_id);
+			err = arm_spe__synth_mem_sample(speq, spe->tlb_miss_id,
+							data_src);
 			if (err)
 				return err;
 		}
 
 		if (record->type & ARM_SPE_TLB_ACCESS) {
-			err = arm_spe__synth_mem_sample(speq, spe->tlb_access_id);
+			err = arm_spe__synth_mem_sample(speq, spe->tlb_access_id,
+							data_src);
 			if (err)
 				return err;
 		}
@@ -363,13 +413,14 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
 	if (spe->sample_remote_access &&
 	    (record->type & ARM_SPE_REMOTE_ACCESS)) {
-		err = arm_spe__synth_mem_sample(speq, spe->remote_access_id);
+		err = arm_spe__synth_mem_sample(speq, spe->remote_access_id,
+						data_src);
 		if (err)
 			return err;
 	}
 
 	if (spe->sample_memory && arm_spe__is_memory_event(record->type)) {
-		err = arm_spe__synth_mem_sample(speq, spe->memory_id);
+		err = arm_spe__synth_mem_sample(speq, spe->memory_id, data_src);
 		if (err)
 			return err;
 	}
-- 
2.28.0



* [PATCH 7/8] perf arm-spe: Save context ID in record
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
                   ` (4 preceding siblings ...)
  2021-01-19 14:46 ` [PATCH 6/8] perf arm-spe: Set sample's data source field James Clark
@ 2021-01-19 14:46 ` James Clark
  2021-01-19 14:46 ` [PATCH 8/8] perf arm-spe: Set thread TID James Clark
  2021-01-22 12:51 ` [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC Arnaldo Carvalho de Melo
  7 siblings, 0 replies; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

Save the context ID in the record; it will be used to set the TID for
samples.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 2 ++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 32fe41835fa6..1b58859d2314 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -151,6 +151,7 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 	u64 payload, ip;
 
 	memset(&decoder->record, 0x0, sizeof(decoder->record));
+	decoder->record.context_id = -1;
 
 	while (1) {
 		err = arm_spe_get_next_packet(decoder);
@@ -180,6 +181,7 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 		case ARM_SPE_COUNTER:
 			break;
 		case ARM_SPE_CONTEXT:
+			decoder->record.context_id = payload;
 			break;
 		case ARM_SPE_OP_TYPE:
 			if (idx == SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC) {
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index 59bdb7309674..46a8556a9e95 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -38,6 +38,7 @@ struct arm_spe_record {
 	u64 timestamp;
 	u64 virt_addr;
 	u64 phys_addr;
+	u64 context_id;
 };
 
 struct arm_spe_insn;
-- 
2.28.0



* [PATCH 8/8] perf arm-spe: Set thread TID
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
                   ` (5 preceding siblings ...)
  2021-01-19 14:46 ` [PATCH 7/8] perf arm-spe: Save context ID in record James Clark
@ 2021-01-19 14:46 ` James Clark
       [not found]   ` <20210131120156.GB230721@leoy-ThinkPad-X240s>
  2021-01-22 12:51 ` [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC Arnaldo Carvalho de Melo
  7 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2021-01-19 14:46 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Leo Yan, James Clark, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

From: Leo Yan <leo.yan@linaro.org>

Set thread TID for SPE samples. Now that the context ID is saved
in each record it can be used to set the TID for a sample.

The context ID is only present in SPE data if the kernel is
compiled with CONFIG_PID_IN_CONTEXTIDR and perf record is
run as root. Otherwise the PID of the first process is assigned
to each SPE sample.
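
For illustration only (not part of the patch): a standalone sketch of
the fallback described above. It assumes a record without a context
packet keeps the initial context_id of -1 set by the previous patch;
demo_pick_tid() and DEMO_NO_CONTEXT_ID are hypothetical names, and the
real decoder loop may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/* Assumed sentinel: context_id as initialized when no context packet is seen. */
#define DEMO_NO_CONTEXT_ID	((uint64_t)-1)

/*
 * Use the context ID as the TID when the trace carries one; otherwise
 * keep the TID that was already assigned to the queue.
 */
static pid_t demo_pick_tid(uint64_t context_id, pid_t fallback_tid)
{
	if (context_id == DEMO_NO_CONTEXT_ID)
		return fallback_tid;

	return (pid_t)context_id;
}
```

With CONFIG_PID_IN_CONTEXTIDR disabled no record carries a context ID,
so every sample keeps the fallback TID.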

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
---
 tools/perf/util/arm-spe.c | 75 ++++++++++++++++++++++++++-------------
 1 file changed, 50 insertions(+), 25 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 27a0b9dfe22d..9828fad7e516 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -223,6 +223,46 @@ static inline u8 arm_spe_cpumode(struct arm_spe *spe, u64 ip)
 		PERF_RECORD_MISC_USER;
 }
 
+static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
+				    struct auxtrace_queue *queue)
+{
+	struct arm_spe_queue *speq = queue->priv;
+	pid_t tid;
+
+	tid = machine__get_current_tid(spe->machine, speq->cpu);
+	if (tid != -1) {
+		speq->tid = tid;
+		thread__zput(speq->thread);
+	} else
+		speq->tid = queue->tid;
+
+	if ((!speq->thread) && (speq->tid != -1)) {
+		speq->thread = machine__find_thread(spe->machine, -1,
+						    speq->tid);
+	}
+
+	if (speq->thread) {
+		speq->pid = speq->thread->pid_;
+		if (queue->cpu == -1)
+			speq->cpu = speq->thread->cpu;
+	}
+}
+
+static int arm_spe_set_tid(struct arm_spe_queue *speq, pid_t tid)
+{
+	int err;
+	struct arm_spe *spe = speq->spe;
+	struct auxtrace_queue *queue;
+
+	err = machine__set_current_tid(spe->machine, speq->cpu, tid, tid);
+	if (err)
+		return err;
+
+	queue = &speq->spe->queues.queue_array[speq->queue_nr];
+	arm_spe_set_pid_tid_cpu(speq->spe, queue);
+	return 0;
+}
+
 static void arm_spe_prep_sample(struct arm_spe *spe,
 				struct arm_spe_queue *speq,
 				union perf_event *event,
@@ -431,6 +471,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 {
 	struct arm_spe *spe = speq->spe;
+	const struct arm_spe_record *record;
 	int ret;
 
 	if (!spe->kernel_start)
@@ -450,6 +491,11 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
 		if (ret < 0)
 			continue;
 
+		record = &speq->decoder->record;
+		ret = arm_spe_set_tid(speq, record->context_id);
+		if (ret)
+			return ret;
+
 		ret = arm_spe_sample(speq);
 		if (ret)
 			return ret;
@@ -500,6 +546,10 @@ static int arm_spe__setup_queue(struct arm_spe *spe,
 
 		record = &speq->decoder->record;
 
+		ret = arm_spe_set_tid(speq, record->context_id);
+		if (ret)
+			return ret;
+
 		speq->timestamp = record->timestamp;
 		ret = auxtrace_heap__add(&spe->heap, queue_nr, speq->timestamp);
 		if (ret)
@@ -552,31 +602,6 @@ static bool arm_spe__is_timeless_decoding(struct arm_spe *spe)
 	return timeless_decoding;
 }
 
-static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
-				    struct auxtrace_queue *queue)
-{
-	struct arm_spe_queue *speq = queue->priv;
-	pid_t tid;
-
-	tid = machine__get_current_tid(spe->machine, speq->cpu);
-	if (tid != -1) {
-		speq->tid = tid;
-		thread__zput(speq->thread);
-	} else
-		speq->tid = queue->tid;
-
-	if ((!speq->thread) && (speq->tid != -1)) {
-		speq->thread = machine__find_thread(spe->machine, -1,
-						    speq->tid);
-	}
-
-	if (speq->thread) {
-		speq->pid = speq->thread->pid_;
-		if (queue->cpu == -1)
-			speq->cpu = speq->thread->cpu;
-	}
-}
-
 static int arm_spe_process_queues(struct arm_spe *spe, u64 timestamp)
 {
 	unsigned int queue_nr;
-- 
2.28.0
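
The fallback described in the commit message above (use the context ID as
the TID when the trace carries one, otherwise keep the TID the queue was
opened for) can be sketched roughly as below. This is an illustration
only, not code from the patch: pick_sample_tid() and the
CONTEXT_ID_ABSENT sentinel are hypothetical names, and the patch itself
passes record->context_id straight to arm_spe_set_tid().

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical sentinel: assume the decoder leaves context_id at
 * all-ones when the trace carried no context packet. */
#define CONTEXT_ID_ABSENT UINT64_MAX

/* Pick the TID for a sample: prefer the traced context ID, otherwise
 * fall back to the TID the queue was opened for. */
static pid_t pick_sample_tid(uint64_t context_id, pid_t queue_tid)
{
	if (context_id != CONTEXT_ID_ABSENT)
		return (pid_t)context_id;
	return queue_tid;
}
```

With this shape, a trace recorded without CONFIG_PID_IN_CONTEXTIDR would
keep reporting the first process's TID for every sample, which matches
the limitation stated in the commit message.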


* Re: [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC
  2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
                   ` (6 preceding siblings ...)
  2021-01-19 14:46 ` [PATCH 8/8] perf arm-spe: Set thread TID James Clark
@ 2021-01-22 12:51 ` Arnaldo Carvalho de Melo
  2021-01-22 14:30   ` Leo Yan
  2021-02-11 13:41   ` James Clark
  7 siblings, 2 replies; 16+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-01-22 12:51 UTC (permalink / raw)
  To: James Clark
  Cc: linux-kernel, linux-perf-users, Leo Yan, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, John Garry, Will Deacon, Mathieu Poirier, Al Grant,
	Andre Przywara, Wei Li, Tan Xiaojun, Adrian Hunter

Em Tue, Jan 19, 2021 at 04:46:51PM +0200, James Clark escreveu:
> From: Leo Yan <leo.yan@linaro.org>
> 
> This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
> the perf data, when output the tracing data, it tells tools that it
> contains data source in the memory event.
> 
> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> Signed-off-by: James Clark <james.clark@arm.com>

I see two Signed-off-by, ok, any Reviewed-by?

- Arnaldo

> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Al Grant <al.grant@arm.com>
> Cc: Andre Przywara <andre.przywara@arm.com>
> Cc: Wei Li <liwei391@huawei.com>
> Cc: Tan Xiaojun <tanxiaojun@huawei.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  tools/perf/util/arm-spe.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 8901a1656a41..b134516e890b 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -803,7 +803,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
>  	attr.type = PERF_TYPE_HARDWARE;
>  	attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
>  	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
> -		PERF_SAMPLE_PERIOD;
> +			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
>  	if (spe->timeless_decoding)
>  		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
>  	else
> -- 
> 2.28.0
> 

-- 

- Arnaldo
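
For readers following the quoted hunk, the way the synthesized event's
sample_type is put together can be sketched as below. The SK_* constants
are local copies of the corresponding PERF_SAMPLE_* bit values so the
sketch builds without kernel headers, synth_sample_type() is a
hypothetical helper name, and the else branch is not visible in the
quoted hunk, so setting the TIME bit there is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the perf_event.h sample-type bits used here. */
#define SK_SAMPLE_IP       (1ULL << 0)
#define SK_SAMPLE_TID      (1ULL << 1)
#define SK_SAMPLE_TIME     (1ULL << 2)
#define SK_SAMPLE_PERIOD   (1ULL << 8)
#define SK_SAMPLE_DATA_SRC (1ULL << 15)

/* Build the sample_type for a synthesized SPE event from the bits
 * inherited from the recording event. */
static uint64_t synth_sample_type(uint64_t inherited, bool timeless)
{
	uint64_t type = inherited;

	/* Patch 1/8 adds the DATA_SRC bit to this set. */
	type |= SK_SAMPLE_IP | SK_SAMPLE_TID |
		SK_SAMPLE_PERIOD | SK_SAMPLE_DATA_SRC;

	if (timeless)
		type &= ~SK_SAMPLE_TIME;
	else
		type |= SK_SAMPLE_TIME; /* assumed, not shown in the hunk */

	return type;
}
```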

* Re: [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC
  2021-01-22 12:51 ` [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC Arnaldo Carvalho de Melo
@ 2021-01-22 14:30   ` Leo Yan
  2021-02-11 13:41   ` James Clark
  1 sibling, 0 replies; 16+ messages in thread
From: Leo Yan @ 2021-01-22 14:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: James Clark, linux-kernel, linux-perf-users, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, John Garry, Will Deacon, Mathieu Poirier, Al Grant,
	Andre Przywara, Wei Li, Tan Xiaojun, Adrian Hunter

Hi Arnaldo,

On Fri, Jan 22, 2021 at 09:51:57AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jan 19, 2021 at 04:46:51PM +0200, James Clark escreveu:
> > From: Leo Yan <leo.yan@linaro.org>
> > 
> > This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
> > the perf data, when output the tracing data, it tells tools that it
> > contains data source in the memory event.
> > 
> > Signed-off-by: Leo Yan <leo.yan@linaro.org>
> > Signed-off-by: James Clark <james.clark@arm.com>
> 
> I see two Signed-off-by, ok, any Reviewed-by?

I was not confident about some of the changes in the patch series, and
since James can reach the hardware people more easily, he kindly took
over and refined the patches (thanks a lot!).

I hope Al can take a look at the patches; I will also review and test
this series myself.

P.S. @James, I think some patches have been refactored, so it would be
good to add a "Co-developed-by:" tag or to change the author name to
yours.  I will comment on this when I review the patches.

Thanks,
Leo

* Re: [PATCH 8/8] perf arm-spe: Set thread TID
       [not found]   ` <20210131120156.GB230721@leoy-ThinkPad-X240s>
@ 2021-02-01 17:40     ` James Clark
  2021-02-04 10:27       ` Leo Yan
  0 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2021-02-01 17:40 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter



On 31/01/2021 14:01, Leo Yan wrote:
> Option 1: by merging patches 07/08 and 08/08, we can first support PID
> tracing for the root namespace, and later extend it to support PID
> tracing in containers (and in VMs).
> 
> Option 2: we can use a software method to establish the PID for the SPE
> trace, based on the kernel's PERF_RECORD_SWITCH /
> PERF_RECORD_SWITCH_CPU_WIDE events and a check of the context switch ip.
> 
> To be honest, I am a bit concerned that option 1 might introduce a
> regression when we later add PID support for containers (and VMs).
> If you have a plan for option 1, I think it's good to record the current
> limitation and the plan for the next step in the commit log, so we can
> merge this patch now and extend it for containers later.
> 
> Otherwise, we need to consider how to implement the PID tracing with
> option 2.  If that is the case, we should first merge only patches
> 01 ~ 06 for data source enabling.  What do you think about this?

In my opinion we should do option 1 and use what is there at the moment. That
gets users 90% of the functionality right now.

I plan to look at option 2 at some point, and it can always be added on top of
option 1 or replace what is there. But I don't know when I would get to it or
how long it will take.

James

> 
>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>> Signed-off-by: James Clark <james.clark@arm.com>
> 
> Besides the technical question, you could add your "Co-developed-by"
> tags to patches 06, 07 and 08/08, which you have taken time to refine.
> 
> Thank you for your kind efforts.
> 
> [1] https://lore.kernel.org/patchwork/patch/1353286/
> 
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Al Grant <al.grant@arm.com>
>> Cc: Andre Przywara <andre.przywara@arm.com>
>> Cc: Wei Li <liwei391@huawei.com>
>> Cc: Tan Xiaojun <tanxiaojun@huawei.com>
>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  tools/perf/util/arm-spe.c | 75 ++++++++++++++++++++++++++-------------
>>  1 file changed, 50 insertions(+), 25 deletions(-)
>>
>> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
>> index 27a0b9dfe22d..9828fad7e516 100644
>> --- a/tools/perf/util/arm-spe.c
>> +++ b/tools/perf/util/arm-spe.c
>> @@ -223,6 +223,46 @@ static inline u8 arm_spe_cpumode(struct arm_spe *spe, u64 ip)
>>  		PERF_RECORD_MISC_USER;
>>  }
>>  
>> +static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
>> +				    struct auxtrace_queue *queue)
>> +{
>> +	struct arm_spe_queue *speq = queue->priv;
>> +	pid_t tid;
>> +
>> +	tid = machine__get_current_tid(spe->machine, speq->cpu);
>> +	if (tid != -1) {
>> +		speq->tid = tid;
>> +		thread__zput(speq->thread);
>> +	} else
>> +		speq->tid = queue->tid;
>> +
>> +	if ((!speq->thread) && (speq->tid != -1)) {
>> +		speq->thread = machine__find_thread(spe->machine, -1,
>> +						    speq->tid);
>> +	}
>> +
>> +	if (speq->thread) {
>> +		speq->pid = speq->thread->pid_;
>> +		if (queue->cpu == -1)
>> +			speq->cpu = speq->thread->cpu;
>> +	}
>> +}
>> +
>> +static int arm_spe_set_tid(struct arm_spe_queue *speq, pid_t tid)
>> +{
>> +	int err;
>> +	struct arm_spe *spe = speq->spe;
>> +	struct auxtrace_queue *queue;
>> +
>> +	err = machine__set_current_tid(spe->machine, speq->cpu, tid, tid);
>> +	if (err)
>> +		return err;
>> +
>> +	queue = &speq->spe->queues.queue_array[speq->queue_nr];
>> +	arm_spe_set_pid_tid_cpu(speq->spe, queue);
>> +	return 0;
>> +}
>> +
>>  static void arm_spe_prep_sample(struct arm_spe *spe,
>>  				struct arm_spe_queue *speq,
>>  				union perf_event *event,
>> @@ -431,6 +471,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
>>  static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
>>  {
>>  	struct arm_spe *spe = speq->spe;
>> +	const struct arm_spe_record *record;
>>  	int ret;
>>  
>>  	if (!spe->kernel_start)
>> @@ -450,6 +491,11 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
>>  		if (ret < 0)
>>  			continue;
>>  
>> +		record = &speq->decoder->record;
>> +		ret = arm_spe_set_tid(speq, record->context_id);
>> +		if (ret)
>> +			return ret;
>> +
>>  		ret = arm_spe_sample(speq);
>>  		if (ret)
>>  			return ret;
>> @@ -500,6 +546,10 @@ static int arm_spe__setup_queue(struct arm_spe *spe,
>>  
>>  		record = &speq->decoder->record;
>>  
>> +		ret = arm_spe_set_tid(speq, record->context_id);
>> +		if (ret)
>> +			return ret;
>> +
>>  		speq->timestamp = record->timestamp;
>>  		ret = auxtrace_heap__add(&spe->heap, queue_nr, speq->timestamp);
>>  		if (ret)
>> @@ -552,31 +602,6 @@ static bool arm_spe__is_timeless_decoding(struct arm_spe *spe)
>>  	return timeless_decoding;
>>  }
>>  
>> -static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
>> -				    struct auxtrace_queue *queue)
>> -{
>> -	struct arm_spe_queue *speq = queue->priv;
>> -	pid_t tid;
>> -
>> -	tid = machine__get_current_tid(spe->machine, speq->cpu);
>> -	if (tid != -1) {
>> -		speq->tid = tid;
>> -		thread__zput(speq->thread);
>> -	} else
>> -		speq->tid = queue->tid;
>> -
>> -	if ((!speq->thread) && (speq->tid != -1)) {
>> -		speq->thread = machine__find_thread(spe->machine, -1,
>> -						    speq->tid);
>> -	}
>> -
>> -	if (speq->thread) {
>> -		speq->pid = speq->thread->pid_;
>> -		if (queue->cpu == -1)
>> -			speq->cpu = speq->thread->cpu;
>> -	}
>> -}
>> -
>>  static int arm_spe_process_queues(struct arm_spe *spe, u64 timestamp)
>>  {
>>  	unsigned int queue_nr;
>> -- 
>> 2.28.0
>>

* Re: [PATCH 8/8] perf arm-spe: Set thread TID
  2021-02-01 17:40     ` James Clark
@ 2021-02-04 10:27       ` Leo Yan
  2021-02-09 15:36         ` James Clark
  0 siblings, 1 reply; 16+ messages in thread
From: Leo Yan @ 2021-02-04 10:27 UTC (permalink / raw)
  To: James Clark
  Cc: linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote:
> 
> On 31/01/2021 14:01, Leo Yan wrote:
> > Option 1: by merging patches 07/08 and 08/08, we can first support PID
> > tracing for the root namespace, and later extend it to support PID
> > tracing in containers (and in VMs).
> > 
> > Option 2: we can use a software method to establish the PID for the SPE
> > trace, based on the kernel's PERF_RECORD_SWITCH /
> > PERF_RECORD_SWITCH_CPU_WIDE events and a check of the context switch ip.
> > 
> > To be honest, I am a bit concerned that option 1 might introduce a
> > regression when we later add PID support for containers (and VMs).
> > If you have a plan for option 1, I think it's good to record the current
> > limitation and the plan for the next step in the commit log, so we can
> > merge this patch now and extend it for containers later.
> > 
> > Otherwise, we need to consider how to implement the PID tracing with
> > option 2.  If that is the case, we should first merge only patches
> > 01 ~ 06 for data source enabling.  What do you think about this?
> 
> In my opinion we should do option 1 and use what is there at the moment. That
> gets users 90% of the functionality right now.
> 
> I plan to look at option 2 at some point, and it can always be added on top of
> option 1 or replace what is there. But I don't know when I would get to it or
> how long it will take.

Firstly, sorry for the late reply.

I had an offline discussion with James and took time to look into the
Intel PT implementation of PID tracing.

AFAICT, for tracing in the root namespace, option 1 (using CONTEXTIDR)
is the most reliable way for Arm SPE.  At the beginning I thought option 2
was the better choice for Arm SPE, but after going through Intel PT's
code, I think Arm SPE cannot achieve the same result as Intel PT because
of its "statistical" nature.

Let me explain why Arm SPE has a problem with option 2.  If we enable
option 2 by using the perf context switch events together with the
switch_ip approach, it uses the logic below:

  Step1: when a PERF_RECORD_SWITCH or PERF_RECORD_SWITCH_CPU_WIDE event
  arrives, the functions below are invoked.  This tells the "machine"
  context that the process has been switched to a new one; at this step
  it simply caches the new PID/TID in the "machine" context, and the
  samples do not yet use the new value.

    intel_pt_context_switch()
      `> machine__set_current_tid()

  Step2: when the target address of a branch instruction equals the
  address of the symbol "__switch_to", the CPU has really switched
  context to the next process in the low-level code.  The decoder then
  retrieves the cached TID/PID from the "machine" context and sets the
  correct PID in "ptq->pid" (see intel_pt_sample_set_pid_tid_cpu());
  "ptq->tid" is then used for synthesizing samples.
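
The two steps above can be sketched as a small state machine. This is a
simplified model for illustration, not the Intel PT decoder's code:
struct cpu_ctx, on_switch_event() and on_branch() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Per-CPU state for the sketch. */
struct cpu_ctx {
	pid_t pending_tid; /* cached by step 1, -1 if nothing is pending */
	pid_t sample_tid;  /* TID used when synthesizing samples */
};

/* Step 1: a context-switch event only caches the incoming TID. */
static void on_switch_event(struct cpu_ctx *c, pid_t next_tid)
{
	c->pending_tid = next_tid;
}

/* Step 2: a decoded branch that lands on switch_ip commits the cached
 * TID. Returns true if the sample TID was updated. */
static bool on_branch(struct cpu_ctx *c, uint64_t target, uint64_t switch_ip)
{
	if (target != switch_ip || c->pending_tid == -1)
		return false;

	c->sample_tid = c->pending_tid;
	c->pending_tid = -1;
	return true;
}
```

The point of the split is that samples keep the old TID until the trace
itself shows the switch, so the scheme depends on the decoder actually
seeing the branch to switch_ip.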

Arm SPE has a problem with step 2: because the trace is statistical,
it does not record every branch instruction, so it cannot guarantee
that every branch to the symbol "__switch_to" is captured.  If we rely
only on the PERF_RECORD_SWITCH / PERF_RECORD_SWITCH_CPU_WIDE events,
the PID tracing result will be coarse.
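
A toy model of why a sampled trace can miss the switch: assume, purely
for illustration, that the decoder sees only every N-th operation at a
fixed interval (real SPE sampling is more involved than this).
sampled_trace_sees_switch() is a hypothetical helper, not part of perf.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if a decoder that sees only every `interval`-th operation
 * (indices interval - 1, 2 * interval - 1, ...) observes a branch whose
 * target is switch_ip. */
static bool sampled_trace_sees_switch(const uint64_t *targets, size_t n,
				      size_t interval, uint64_t switch_ip)
{
	if (interval == 0)
		return false;

	for (size_t i = interval - 1; i < n; i += interval) {
		if (targets[i] == switch_ip)
			return true;
	}
	return false;
}
```

With interval 1 (a complete trace) the branch is always seen; with a
larger interval it is seen only if it happens to fall on a sampled
operation, so the cached TID may never be committed at the right point.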

For this reason, it seems pragmatic to me to use CONTEXTIDR for PID
tracing at the current stage; at least it allows tracing in the root
namespace to work accurately.  This leaves open the issue of tracing
PIDs in non-root namespaces, for which we need to figure out a solution
later.

Hi Mark.R, Al, do you have any comments on this?

Thanks,
Leo

> >> Signed-off-by: Leo Yan <leo.yan@linaro.org>
> >> Signed-off-by: James Clark <james.clark@arm.com>
> > 
> > Besides the technical question, you could add your "Co-developed-by"
> > tags to patches 06, 07 and 08/08, which you have taken time to refine.
> > 
> > Thank you for your kind efforts.
> > 
> > [1] https://lore.kernel.org/patchwork/patch/1353286/
> > 
> >> Cc: Peter Zijlstra <peterz@infradead.org>
> >> Cc: Ingo Molnar <mingo@redhat.com>
> >> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> >> Cc: Mark Rutland <mark.rutland@arm.com>
> >> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> >> Cc: Jiri Olsa <jolsa@redhat.com>
> >> Cc: Namhyung Kim <namhyung@kernel.org>
> >> Cc: John Garry <john.garry@huawei.com>
> >> Cc: Will Deacon <will@kernel.org>
> >> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> >> Cc: Al Grant <al.grant@arm.com>
> >> Cc: Andre Przywara <andre.przywara@arm.com>
> >> Cc: Wei Li <liwei391@huawei.com>
> >> Cc: Tan Xiaojun <tanxiaojun@huawei.com>
> >> Cc: Adrian Hunter <adrian.hunter@intel.com>
> >> ---
> >>  tools/perf/util/arm-spe.c | 75 ++++++++++++++++++++++++++-------------
> >>  1 file changed, 50 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> >> index 27a0b9dfe22d..9828fad7e516 100644
> >> --- a/tools/perf/util/arm-spe.c
> >> +++ b/tools/perf/util/arm-spe.c
> >> @@ -223,6 +223,46 @@ static inline u8 arm_spe_cpumode(struct arm_spe *spe, u64 ip)
> >>  		PERF_RECORD_MISC_USER;
> >>  }
> >>  
> >> +static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
> >> +				    struct auxtrace_queue *queue)
> >> +{
> >> +	struct arm_spe_queue *speq = queue->priv;
> >> +	pid_t tid;
> >> +
> >> +	tid = machine__get_current_tid(spe->machine, speq->cpu);
> >> +	if (tid != -1) {
> >> +		speq->tid = tid;
> >> +		thread__zput(speq->thread);
> >> +	} else
> >> +		speq->tid = queue->tid;
> >> +
> >> +	if ((!speq->thread) && (speq->tid != -1)) {
> >> +		speq->thread = machine__find_thread(spe->machine, -1,
> >> +						    speq->tid);
> >> +	}
> >> +
> >> +	if (speq->thread) {
> >> +		speq->pid = speq->thread->pid_;
> >> +		if (queue->cpu == -1)
> >> +			speq->cpu = speq->thread->cpu;
> >> +	}
> >> +}
> >> +
> >> +static int arm_spe_set_tid(struct arm_spe_queue *speq, pid_t tid)
> >> +{
> >> +	int err;
> >> +	struct arm_spe *spe = speq->spe;
> >> +	struct auxtrace_queue *queue;
> >> +
> >> +	err = machine__set_current_tid(spe->machine, speq->cpu, tid, tid);
> >> +	if (err)
> >> +		return err;
> >> +
> >> +	queue = &speq->spe->queues.queue_array[speq->queue_nr];
> >> +	arm_spe_set_pid_tid_cpu(speq->spe, queue);
> >> +	return 0;
> >> +}
> >> +
> >>  static void arm_spe_prep_sample(struct arm_spe *spe,
> >>  				struct arm_spe_queue *speq,
> >>  				union perf_event *event,
> >> @@ -431,6 +471,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
> >>  static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
> >>  {
> >>  	struct arm_spe *spe = speq->spe;
> >> +	const struct arm_spe_record *record;
> >>  	int ret;
> >>  
> >>  	if (!spe->kernel_start)
> >> @@ -450,6 +491,11 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
> >>  		if (ret < 0)
> >>  			continue;
> >>  
> >> +		record = &speq->decoder->record;
> >> +		ret = arm_spe_set_tid(speq, record->context_id);
> >> +		if (ret)
> >> +			return ret;
> >> +
> >>  		ret = arm_spe_sample(speq);
> >>  		if (ret)
> >>  			return ret;
> >> @@ -500,6 +546,10 @@ static int arm_spe__setup_queue(struct arm_spe *spe,
> >>  
> >>  		record = &speq->decoder->record;
> >>  
> >> +		ret = arm_spe_set_tid(speq, record->context_id);
> >> +		if (ret)
> >> +			return ret;
> >> +
> >>  		speq->timestamp = record->timestamp;
> >>  		ret = auxtrace_heap__add(&spe->heap, queue_nr, speq->timestamp);
> >>  		if (ret)
> >> @@ -552,31 +602,6 @@ static bool arm_spe__is_timeless_decoding(struct arm_spe *spe)
> >>  	return timeless_decoding;
> >>  }
> >>  
> >> -static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
> >> -				    struct auxtrace_queue *queue)
> >> -{
> >> -	struct arm_spe_queue *speq = queue->priv;
> >> -	pid_t tid;
> >> -
> >> -	tid = machine__get_current_tid(spe->machine, speq->cpu);
> >> -	if (tid != -1) {
> >> -		speq->tid = tid;
> >> -		thread__zput(speq->thread);
> >> -	} else
> >> -		speq->tid = queue->tid;
> >> -
> >> -	if ((!speq->thread) && (speq->tid != -1)) {
> >> -		speq->thread = machine__find_thread(spe->machine, -1,
> >> -						    speq->tid);
> >> -	}
> >> -
> >> -	if (speq->thread) {
> >> -		speq->pid = speq->thread->pid_;
> >> -		if (queue->cpu == -1)
> >> -			speq->cpu = speq->thread->cpu;
> >> -	}
> >> -}
> >> -
> >>  static int arm_spe_process_queues(struct arm_spe *spe, u64 timestamp)
> >>  {
> >>  	unsigned int queue_nr;
> >> -- 
> >> 2.28.0
> >>

* Re: [PATCH 8/8] perf arm-spe: Set thread TID
  2021-02-04 10:27       ` Leo Yan
@ 2021-02-09 15:36         ` James Clark
  2021-02-10 10:16           ` James Clark
  0 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2021-02-09 15:36 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter



On 04/02/2021 12:27, Leo Yan wrote:
> On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote:
>>
>> On 31/01/2021 14:01, Leo Yan wrote:
>>> Option 1: by merging patches 07/08 and 08/08, we can first support PID
>>> tracing for the root namespace, and later extend it to support PID
>>> tracing in containers (and in VMs).
>>>
> Arm SPE has a problem with step 2: because the trace is statistical,
> it does not record every branch instruction, so it cannot guarantee
> that every branch to the symbol "__switch_to" is captured.  If we rely
> only on the PERF_RECORD_SWITCH / PERF_RECORD_SWITCH_CPU_WIDE events,
> the PID tracing result will be coarse.
> 
> For this reason, it seems pragmatic to me to use CONTEXTIDR for PID
> tracing at the current stage; at least it allows tracing in the root
> namespace to work accurately.  This leaves open the issue of tracing
> PIDs in non-root namespaces, for which we need to figure out a solution
> later.
> 
> Hi Mark.R, Al, do you have any comments on this?

Hi Leo,

I spoke with Al and his suggestion is to clear the PID value if the event
was opened outside of the root namespace.

I think that's not a bad idea as it gets us PIDs in most cases but also
doesn't show any incorrect data. Do you know if it's possible to determine
that from a perf.data file? Unfortunately it doesn't seem to be possible
to disable CONTEXTIDR tracing when opening the event as it's compile time
only and can't be disabled dynamically.
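
Al's suggestion could look roughly like the sketch below at decode time.
How to learn whether the event was opened in the root PID namespace is
exactly the open question above, so opened_in_root_pidns is an assumed
input and tid_from_context_id() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Return the TID to report for a record, or -1 ("unknown") when the
 * context ID cannot be trusted because the event was opened outside
 * the root PID namespace. */
static pid_t tid_from_context_id(uint64_t context_id,
				 bool opened_in_root_pidns)
{
	if (!opened_in_root_pidns)
		return -1; /* better no TID than a possibly wrong one */

	return (pid_t)context_id;
}
```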

James

> 
> Thanks,
> Leo
> 
>>>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>>>> Signed-off-by: James Clark <james.clark@arm.com>
>>>
>>> Besides the technical question, you could add your "Co-developed-by"
>>> tags to patches 06, 07 and 08/08, which you have taken time to refine.
>>>
>>> Thank you for your kind efforts.
>>>
>>> [1] https://lore.kernel.org/patchwork/patch/1353286/
>>>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>> Cc: John Garry <john.garry@huawei.com>
>>>> Cc: Will Deacon <will@kernel.org>
>>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>> Cc: Al Grant <al.grant@arm.com>
>>>> Cc: Andre Przywara <andre.przywara@arm.com>
>>>> Cc: Wei Li <liwei391@huawei.com>
>>>> Cc: Tan Xiaojun <tanxiaojun@huawei.com>
>>>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>>>> ---
>>>>  tools/perf/util/arm-spe.c | 75 ++++++++++++++++++++++++++-------------
>>>>  1 file changed, 50 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
>>>> index 27a0b9dfe22d..9828fad7e516 100644
>>>> --- a/tools/perf/util/arm-spe.c
>>>> +++ b/tools/perf/util/arm-spe.c
>>>> @@ -223,6 +223,46 @@ static inline u8 arm_spe_cpumode(struct arm_spe *spe, u64 ip)
>>>>  		PERF_RECORD_MISC_USER;
>>>>  }
>>>>  
>>>> +static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
>>>> +				    struct auxtrace_queue *queue)
>>>> +{
>>>> +	struct arm_spe_queue *speq = queue->priv;
>>>> +	pid_t tid;
>>>> +
>>>> +	tid = machine__get_current_tid(spe->machine, speq->cpu);
>>>> +	if (tid != -1) {
>>>> +		speq->tid = tid;
>>>> +		thread__zput(speq->thread);
>>>> +	} else
>>>> +		speq->tid = queue->tid;
>>>> +
>>>> +	if ((!speq->thread) && (speq->tid != -1)) {
>>>> +		speq->thread = machine__find_thread(spe->machine, -1,
>>>> +						    speq->tid);
>>>> +	}
>>>> +
>>>> +	if (speq->thread) {
>>>> +		speq->pid = speq->thread->pid_;
>>>> +		if (queue->cpu == -1)
>>>> +			speq->cpu = speq->thread->cpu;
>>>> +	}
>>>> +}
>>>> +
>>>> +static int arm_spe_set_tid(struct arm_spe_queue *speq, pid_t tid)
>>>> +{
>>>> +	int err;
>>>> +	struct arm_spe *spe = speq->spe;
>>>> +	struct auxtrace_queue *queue;
>>>> +
>>>> +	err = machine__set_current_tid(spe->machine, speq->cpu, tid, tid);
>>>> +	if (err)
>>>> +		return err;
>>>> +
>>>> +	queue = &speq->spe->queues.queue_array[speq->queue_nr];
>>>> +	arm_spe_set_pid_tid_cpu(speq->spe, queue);
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  static void arm_spe_prep_sample(struct arm_spe *spe,
>>>>  				struct arm_spe_queue *speq,
>>>>  				union perf_event *event,
>>>> @@ -431,6 +471,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
>>>>  static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
>>>>  {
>>>>  	struct arm_spe *spe = speq->spe;
>>>> +	const struct arm_spe_record *record;
>>>>  	int ret;
>>>>  
>>>>  	if (!spe->kernel_start)
>>>> @@ -450,6 +491,11 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
>>>>  		if (ret < 0)
>>>>  			continue;
>>>>  
>>>> +		record = &speq->decoder->record;
>>>> +		ret = arm_spe_set_tid(speq, record->context_id);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +
>>>>  		ret = arm_spe_sample(speq);
>>>>  		if (ret)
>>>>  			return ret;
>>>> @@ -500,6 +546,10 @@ static int arm_spe__setup_queue(struct arm_spe *spe,
>>>>  
>>>>  		record = &speq->decoder->record;
>>>>  
>>>> +		ret = arm_spe_set_tid(speq, record->context_id);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +
>>>>  		speq->timestamp = record->timestamp;
>>>>  		ret = auxtrace_heap__add(&spe->heap, queue_nr, speq->timestamp);
>>>>  		if (ret)
>>>> @@ -552,31 +602,6 @@ static bool arm_spe__is_timeless_decoding(struct arm_spe *spe)
>>>>  	return timeless_decoding;
>>>>  }
>>>>  
>>>> -static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
>>>> -				    struct auxtrace_queue *queue)
>>>> -{
>>>> -	struct arm_spe_queue *speq = queue->priv;
>>>> -	pid_t tid;
>>>> -
>>>> -	tid = machine__get_current_tid(spe->machine, speq->cpu);
>>>> -	if (tid != -1) {
>>>> -		speq->tid = tid;
>>>> -		thread__zput(speq->thread);
>>>> -	} else
>>>> -		speq->tid = queue->tid;
>>>> -
>>>> -	if ((!speq->thread) && (speq->tid != -1)) {
>>>> -		speq->thread = machine__find_thread(spe->machine, -1,
>>>> -						    speq->tid);
>>>> -	}
>>>> -
>>>> -	if (speq->thread) {
>>>> -		speq->pid = speq->thread->pid_;
>>>> -		if (queue->cpu == -1)
>>>> -			speq->cpu = speq->thread->cpu;
>>>> -	}
>>>> -}
>>>> -
>>>>  static int arm_spe_process_queues(struct arm_spe *spe, u64 timestamp)
>>>>  {
>>>>  	unsigned int queue_nr;
>>>> -- 
>>>> 2.28.0
>>>>

* Re: [PATCH 8/8] perf arm-spe: Set thread TID
  2021-02-09 15:36         ` James Clark
@ 2021-02-10 10:16           ` James Clark
  2021-02-10 12:03             ` Leo Yan
  0 siblings, 1 reply; 16+ messages in thread
From: James Clark @ 2021-02-10 10:16 UTC (permalink / raw)
  To: Leo Yan
  Cc: linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter



On 09/02/2021 17:36, James Clark wrote:
> 
> 
> On 04/02/2021 12:27, Leo Yan wrote:
>> On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote:
>>>
>>> On 31/01/2021 14:01, Leo Yan wrote:
>>>> Option 1: by merging patches 07/08 and 08/08, we can first support PID
>>>> tracing for the root namespace, and later extend it to support PID
>>>> tracing in containers (and in VMs).
>>>>
>> Arm SPE has a problem with step 2: because the trace is statistical,
>> it does not record every branch instruction, so it cannot guarantee
>> that every branch to the symbol "__switch_to" is captured.  If we rely
>> only on the PERF_RECORD_SWITCH / PERF_RECORD_SWITCH_CPU_WIDE events,
>> the PID tracing result will be coarse.
>>
>> For this reason, it seems pragmatic to me to use CONTEXTIDR for PID
>> tracing at the current stage; at least it allows tracing in the root
>> namespace to work accurately.  This leaves open the issue of tracing
>> PIDs in non-root namespaces, for which we need to figure out a solution
>> later.
>>
>> Hi Mark.R, Al, do you have any comments on this?
> 
> Hi Leo,
> 
> I spoke with Al and his suggestion is to clear the PID value if the event
> was opened outside of the root namespace.
> 
> I think that's not a bad idea as it gets us PIDs in most cases but also
> doesn't show any incorrect data. Do you know if it's possible to determine
> that from a perf.data file? Unfortunately it doesn't seem to be possible
> to disable CONTEXTIDR tracing when opening the event as it's compile time
> only and can't be disabled dynamically.
> 
> James
> 

I've had a think about it and I think we should do one of two things:

#1) Remove the PID setting from the data source patchset. This will keep the
    existing behaviour of using the PID of the first traced process only even
    if there are forks. Later we can implement #2 or attempt to make it work
    even in non root namespaces.

    I'm not sure how this will impact your c2c patchset if you are relying on
    the PID data Leo?

#2) Make a change in the SPE driver to add an option for disabling CONTEXTIDR.
    We will disable this from userspace if the event is opened in a non root
    namespace. So we will only show PID data if we know it's valid, otherwise
    the existing behaviour of only using the first PID will remain.

Hopefully those solutions will help to minimise changes in behaviour between
kernel releases that could be confusing.
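
As a rough illustration of the userspace check that option #2 would need
(not part of the posted patches), the sketch below tests whether the calling
process is in the root PID namespace. It assumes the initial PID namespace is
exposed under /proc/self/ns/pid with the fixed inode number 0xEFFFFFFC
(PROC_PID_INIT_INO in the kernel's proc_ns.h); the helper names are
hypothetical.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Assumption: the initial PID namespace uses the fixed nsfs inode number
 * PROC_PID_INIT_INO (0xEFFFFFFC in the kernel's proc_ns.h).
 */
#define INIT_PID_NS_INO 0xEFFFFFFCUL

/* Hypothetical helper: does this inode number denote the root PID namespace? */
static bool ino_is_root_pid_ns(unsigned long ino)
{
	return ino == INIT_PID_NS_INO;
}

/*
 * Hypothetical helper: returns 1 if the caller runs in the root PID
 * namespace, 0 if it does not, and -1 if this cannot be determined.
 */
static int in_root_pid_ns(void)
{
	struct stat st;

	/* stat() follows the nsfs link, so st_ino identifies the namespace. */
	if (stat("/proc/self/ns/pid", &st) < 0)
		return -1;
	return ino_is_root_pid_ns((unsigned long)st.st_ino) ? 1 : 0;
}
```

A tool could treat both 0 and -1 as "do not trust CONTEXTIDR" and fall back
to the existing first-PID behaviour.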


>>
>> Thanks,
>> Leo
>>
>>>>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>>>>> Signed-off-by: James Clark <james.clark@arm.com>
>>>>
>>>> Besides the technical question, you could add your "Co-developed-by"
>>>> tags for patches 06, 07, 08/08, which you have taken time to refine.
>>>>
>>>> Thank you for your kind efforts.
>>>>
>>>> [1] https://lore.kernel.org/patchwork/patch/1353286/
>>>>
>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>>>>> Cc: Jiri Olsa <jolsa@redhat.com>
>>>>> Cc: Namhyung Kim <namhyung@kernel.org>
>>>>> Cc: John Garry <john.garry@huawei.com>
>>>>> Cc: Will Deacon <will@kernel.org>
>>>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>>> Cc: Al Grant <al.grant@arm.com>
>>>>> Cc: Andre Przywara <andre.przywara@arm.com>
>>>>> Cc: Wei Li <liwei391@huawei.com>
>>>>> Cc: Tan Xiaojun <tanxiaojun@huawei.com>
>>>>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>>>>> ---
>>>>>  tools/perf/util/arm-spe.c | 75 ++++++++++++++++++++++++++-------------
>>>>>  1 file changed, 50 insertions(+), 25 deletions(-)
>>>>>
>>>>> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
>>>>> index 27a0b9dfe22d..9828fad7e516 100644
>>>>> --- a/tools/perf/util/arm-spe.c
>>>>> +++ b/tools/perf/util/arm-spe.c
>>>>> @@ -223,6 +223,46 @@ static inline u8 arm_spe_cpumode(struct arm_spe *spe, u64 ip)
>>>>>  		PERF_RECORD_MISC_USER;
>>>>>  }
>>>>>  
>>>>> +static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
>>>>> +				    struct auxtrace_queue *queue)
>>>>> +{
>>>>> +	struct arm_spe_queue *speq = queue->priv;
>>>>> +	pid_t tid;
>>>>> +
>>>>> +	tid = machine__get_current_tid(spe->machine, speq->cpu);
>>>>> +	if (tid != -1) {
>>>>> +		speq->tid = tid;
>>>>> +		thread__zput(speq->thread);
>>>>> +	} else
>>>>> +		speq->tid = queue->tid;
>>>>> +
>>>>> +	if ((!speq->thread) && (speq->tid != -1)) {
>>>>> +		speq->thread = machine__find_thread(spe->machine, -1,
>>>>> +						    speq->tid);
>>>>> +	}
>>>>> +
>>>>> +	if (speq->thread) {
>>>>> +		speq->pid = speq->thread->pid_;
>>>>> +		if (queue->cpu == -1)
>>>>> +			speq->cpu = speq->thread->cpu;
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +static int arm_spe_set_tid(struct arm_spe_queue *speq, pid_t tid)
>>>>> +{
>>>>> +	int err;
>>>>> +	struct arm_spe *spe = speq->spe;
>>>>> +	struct auxtrace_queue *queue;
>>>>> +
>>>>> +	err = machine__set_current_tid(spe->machine, speq->cpu, tid, tid);
>>>>> +	if (err)
>>>>> +		return err;
>>>>> +
>>>>> +	queue = &speq->spe->queues.queue_array[speq->queue_nr];
>>>>> +	arm_spe_set_pid_tid_cpu(speq->spe, queue);
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  static void arm_spe_prep_sample(struct arm_spe *spe,
>>>>>  				struct arm_spe_queue *speq,
>>>>>  				union perf_event *event,
>>>>> @@ -431,6 +471,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
>>>>>  static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
>>>>>  {
>>>>>  	struct arm_spe *spe = speq->spe;
>>>>> +	const struct arm_spe_record *record;
>>>>>  	int ret;
>>>>>  
>>>>>  	if (!spe->kernel_start)
>>>>> @@ -450,6 +491,11 @@ static int arm_spe_run_decoder(struct arm_spe_queue *speq, u64 *timestamp)
>>>>>  		if (ret < 0)
>>>>>  			continue;
>>>>>  
>>>>> +		record = &speq->decoder->record;
>>>>> +		ret = arm_spe_set_tid(speq, record->context_id);
>>>>> +		if (ret)
>>>>> +			return ret;
>>>>> +
>>>>>  		ret = arm_spe_sample(speq);
>>>>>  		if (ret)
>>>>>  			return ret;
>>>>> @@ -500,6 +546,10 @@ static int arm_spe__setup_queue(struct arm_spe *spe,
>>>>>  
>>>>>  		record = &speq->decoder->record;
>>>>>  
>>>>> +		ret = arm_spe_set_tid(speq, record->context_id);
>>>>> +		if (ret)
>>>>> +			return ret;
>>>>> +
>>>>>  		speq->timestamp = record->timestamp;
>>>>>  		ret = auxtrace_heap__add(&spe->heap, queue_nr, speq->timestamp);
>>>>>  		if (ret)
>>>>> @@ -552,31 +602,6 @@ static bool arm_spe__is_timeless_decoding(struct arm_spe *spe)
>>>>>  	return timeless_decoding;
>>>>>  }
>>>>>  
>>>>> -static void arm_spe_set_pid_tid_cpu(struct arm_spe *spe,
>>>>> -				    struct auxtrace_queue *queue)
>>>>> -{
>>>>> -	struct arm_spe_queue *speq = queue->priv;
>>>>> -	pid_t tid;
>>>>> -
>>>>> -	tid = machine__get_current_tid(spe->machine, speq->cpu);
>>>>> -	if (tid != -1) {
>>>>> -		speq->tid = tid;
>>>>> -		thread__zput(speq->thread);
>>>>> -	} else
>>>>> -		speq->tid = queue->tid;
>>>>> -
>>>>> -	if ((!speq->thread) && (speq->tid != -1)) {
>>>>> -		speq->thread = machine__find_thread(spe->machine, -1,
>>>>> -						    speq->tid);
>>>>> -	}
>>>>> -
>>>>> -	if (speq->thread) {
>>>>> -		speq->pid = speq->thread->pid_;
>>>>> -		if (queue->cpu == -1)
>>>>> -			speq->cpu = speq->thread->cpu;
>>>>> -	}
>>>>> -}
>>>>> -
>>>>>  static int arm_spe_process_queues(struct arm_spe *spe, u64 timestamp)
>>>>>  {
>>>>>  	unsigned int queue_nr;
>>>>> -- 
>>>>> 2.28.0
>>>>>
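
The tid selection in the quoted patch can be modelled in isolation as
follows. This is only a sketch under stated assumptions: the struct and
function names are hypothetical stand-ins for perf's machine and queue
state, and the fixed-size per-CPU table is an illustration, not how perf
stores it.

```c
#include <sys/types.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for perf's "current tid per CPU" bookkeeping. */
struct machine_model {
	pid_t current_tid[MODEL_NR_CPUS];
};

static void machine_model_init(struct machine_model *m)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		m->current_tid[i] = -1;	/* -1: no tid recorded yet */
}

/* Record the tid taken from a record's context ID for the given CPU. */
static int machine_model_set_tid(struct machine_model *m, int cpu, pid_t tid)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return -1;
	m->current_tid[cpu] = tid;
	return 0;
}

/* Pick the queue's tid: the CPU's current tid if known, else the fallback. */
static pid_t queue_model_tid(const struct machine_model *m, int cpu,
			     pid_t queue_tid)
{
	if (cpu >= 0 && cpu < MODEL_NR_CPUS && m->current_tid[cpu] != -1)
		return m->current_tid[cpu];
	return queue_tid;
}
```

Updating the table once per decoded record, before the sample is
synthesized, is what lets each sample carry the tid from its own context
packet rather than the tid of the first traced process.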


* Re: [PATCH 8/8] perf arm-spe: Set thread TID
  2021-02-10 10:16           ` James Clark
@ 2021-02-10 12:03             ` Leo Yan
  0 siblings, 0 replies; 16+ messages in thread
From: Leo Yan @ 2021-02-10 12:03 UTC (permalink / raw)
  To: James Clark
  Cc: linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, John Garry, Will Deacon,
	Mathieu Poirier, Al Grant, Andre Przywara, Wei Li, Tan Xiaojun,
	Adrian Hunter

Hi James,

On Wed, Feb 10, 2021 at 12:16:58PM +0200, James Clark wrote:
> 
> 
> On 09/02/2021 17:36, James Clark wrote:
> > 
> > 
> > On 04/02/2021 12:27, Leo Yan wrote:
> >> On Mon, Feb 01, 2021 at 07:40:45PM +0200, James Clark wrote:
> >>>
> >>> On 31/01/2021 14:01, Leo Yan wrote:
> >>>> Option 1: by merging patches 07/08 and 08/08, we can first support PID
> >>>> tracing for the root namespace, and later extend it to support PID
> >>>> tracing in containers (and in VMs).
> >>>>
> >> Arm SPE has a problem with step 2: because the trace uses a statistical
> >> approach, it doesn't trace every branch instruction, so it cannot
> >> guarantee to capture all branches to the symbol "__switch_to".
> >> If we only use the events PERF_RECORD_SWITCH /
> >> PERF_RECORD_SWITCH_CPU_WIDE, then PID tracing will only give a coarse
> >> result.
> >>
> >> For this reason, it seems pragmatic to me to use CONTEXTIDR for
> >> PID tracing at the current stage; at least it allows root domain
> >> tracing to work accurately.  But this leaves open the issue of tracing
> >> PIDs in a non-root namespace; we need to figure out a solution later.
> >>
> >> Hi Mark.R, Al, do you have any comments for this?
> > 
> > Hi Leo,
> > 
> > I spoke with Al and his suggestion is to clear the PID value if the event
> > was opened outside of the root namespace.
> > 
> > I think that's not a bad idea as it gets us PIDs in most cases but also
> > doesn't show any incorrect data. Do you know if it's possible to determine
> > that from a perf.data file? Unfortunately it doesn't seem to be possible
> > to disable CONTEXTIDR tracing when opening the event as it's compile time
> > only and can't be disabled dynamically.
> > 
> > James
> > 
> 
> I've had a think about it and I think we should do one of two things:

Thanks a lot for digging!

> #1) Remove the PID setting from the data source patchset. This will keep the
>     existing behaviour of using the PID of the first traced process only even
>     if there are forks. Later we can implement #2 or attempt to make it work
>     even in non root namespaces.

I agree.  Let's simplify the data source patch set; could you resend it
so that the perf maintainer can more easily follow up on (and merge) the
patch series?  Thanks!

>     I'm not sure how this will impact your c2c patchset if you are relying on
>     the PID data Leo?

Yes, based on my experiment, if we want to extend "perf c2c" to show
multi-threading info, then it depends on PID tracing.

> #2) Make a change in the SPE driver to add an option for disabling CONTEXTIDR.
>     We will disable this from userspace if the event is opened in a non root
>     namespace. So we will only show PID data if we know it's valid, otherwise
>     the existing behaviour of only using the first PID will remain.

Yeah, I have just a minor difference in mind.

Yes, we can have the kernel export an extra PMU format, e.g. a new PMU format
"contextid", so that the kernel provides a knob for userspace (this is similar
to perf cs-etm :)).

I am just wondering if we can disable CONTEXTIDR tracing on the kernel side,
e.g. when the kernel detects that it's running in a non-root namespace, it
should not set the bit SYS_PMSCR_EL1_CX_SHIFT; and if the userspace tool has
specified the PMU format "contextid" from a non-root namespace, the kernel
should report a failure for lack of permission.

It seems to me that this way, at least, we can have a sane solution for the
root namespace.
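
A minimal userspace model of that kernel-side check might look like the
sketch below. Everything here is an assumption for illustration: the struct,
the function name, and the choice of -EACCES as the "no permission" error
are hypothetical, and only the CX bit position (bit 3 of PMSCR_EL1) is taken
from the architecture.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position of PMSCR_EL1.CX (SYS_PMSCR_EL1_CX_SHIFT in the kernel). */
#define PMSCR_EL1_CX_SHIFT 3

/* Hypothetical stand-in for the per-event state the driver would consult. */
struct spe_event_model {
	bool want_contextid;	/* user set the proposed "contextid" format */
	bool in_root_pid_ns;	/* event was opened from the root PID namespace */
};

/*
 * Set the CX bit in *reg only when context ID tracing is both requested
 * and allowed; fail with -EACCES when it is requested from a non-root
 * PID namespace.
 */
static int spe_model_pmscr_cx(const struct spe_event_model *ev, uint64_t *reg)
{
	if (ev->want_contextid && !ev->in_root_pid_ns)
		return -EACCES;

	if (ev->want_contextid)
		*reg |= UINT64_C(1) << PMSCR_EL1_CX_SHIFT;
	return 0;
}
```

Rejecting the event outright, instead of silently clearing the bit, lets the
tool know that no PID data will be present and fall back accordingly.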

Thanks,
Leo


* Re: [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC
  2021-01-22 12:51 ` [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC Arnaldo Carvalho de Melo
  2021-01-22 14:30   ` Leo Yan
@ 2021-02-11 13:41   ` James Clark
  1 sibling, 0 replies; 16+ messages in thread
From: James Clark @ 2021-02-11 13:41 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Leo Yan, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, John Garry, Will Deacon, Mathieu Poirier, Al Grant,
	Andre Przywara, Wei Li, Tan Xiaojun, Adrian Hunter



On 22/01/2021 14:51, Arnaldo Carvalho de Melo wrote:
> Em Tue, Jan 19, 2021 at 04:46:51PM +0200, James Clark escreveu:
>> From: Leo Yan <leo.yan@linaro.org>
>>
>> This patch is to enable sample type PERF_SAMPLE_DATA_SRC for Arm SPE in
>> the perf data, when output the tracing data, it tells tools that it
>> contains data source in the memory event.
>>
>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>> Signed-off-by: James Clark <james.clark@arm.com>
> 
> I see two Signed-off-by, ok, any Reviewed-by?
> 
> - Arnaldo

Hi Arnaldo,

I have submitted v2 and added my Reviewed-by and Tested-by.

I didn't change any of the authors as Leo suggested, because I only
modified the last two patches, which we dropped anyway so as not to show
any misleading PID data when run from a container.


Thanks
James

> 
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Cc: Jiri Olsa <jolsa@redhat.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Al Grant <al.grant@arm.com>
>> Cc: Andre Przywara <andre.przywara@arm.com>
>> Cc: Wei Li <liwei391@huawei.com>
>> Cc: Tan Xiaojun <tanxiaojun@huawei.com>
>> Cc: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  tools/perf/util/arm-spe.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
>> index 8901a1656a41..b134516e890b 100644
>> --- a/tools/perf/util/arm-spe.c
>> +++ b/tools/perf/util/arm-spe.c
>> @@ -803,7 +803,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
>>  	attr.type = PERF_TYPE_HARDWARE;
>>  	attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
>>  	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
>> -		PERF_SAMPLE_PERIOD;
>> +			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
>>  	if (spe->timeless_decoding)
>>  		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
>>  	else
>> -- 
>> 2.28.0
>>
> 
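
For reference, the effect of the quoted hunk on the sample type bitmask can
be sketched as a standalone function. The helper name and its parameters are
hypothetical; the mask is passed in because PERF_SAMPLE_MASK is internal to
the perf tool, and the non-timeless branch (setting PERF_SAMPLE_TIME) is an
assumption, since the hunk is cut off at the "else".

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/perf_event.h>

/* Hypothetical helper mirroring the sample_type setup in the quoted hunk. */
static uint64_t spe_synth_sample_type(uint64_t evsel_sample_type,
				      uint64_t sample_mask,
				      bool timeless_decoding)
{
	uint64_t type = evsel_sample_type & sample_mask;

	/* DATA_SRC is the bit added by this patch. */
	type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
		PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;

	if (timeless_decoding)
		type &= ~(uint64_t)PERF_SAMPLE_TIME;
	else
		type |= PERF_SAMPLE_TIME;	/* assumed: not shown in the hunk */
	return type;
}
```

With PERF_SAMPLE_DATA_SRC set in the synthesized attribute, tools that parse
the samples know to expect a data_src field in each memory event.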


end of thread, other threads:[~2021-02-11 13:57 UTC | newest]

Thread overview: 16+ messages
2021-01-19 14:46 [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC James Clark
2021-01-19 14:46 ` [PATCH 2/8] perf arm-spe: Store memory address in packet James Clark
2021-01-19 14:46 ` [PATCH 3/8] perf arm-spe: Store operation type " James Clark
2021-01-19 14:46 ` [PATCH 4/8] perf arm-spe: Fill address info for samples James Clark
2021-01-19 14:46 ` [PATCH 5/8] perf arm-spe: Synthesize memory event James Clark
2021-01-19 14:46 ` [PATCH 6/8] perf arm-spe: Set sample's data source field James Clark
2021-01-19 14:46 ` [PATCH 7/8] perf arm-spe: Save context ID in record James Clark
2021-01-19 14:46 ` [PATCH 8/8] perf arm-spe: Set thread TID James Clark
     [not found]   ` <20210131120156.GB230721@leoy-ThinkPad-X240s>
2021-02-01 17:40     ` James Clark
2021-02-04 10:27       ` Leo Yan
2021-02-09 15:36         ` James Clark
2021-02-10 10:16           ` James Clark
2021-02-10 12:03             ` Leo Yan
2021-01-22 12:51 ` [PATCH 1/8] perf arm-spe: Enable sample type PERF_SAMPLE_DATA_SRC Arnaldo Carvalho de Melo
2021-01-22 14:30   ` Leo Yan
2021-02-11 13:41   ` James Clark
