[RFC] perf: Handle multiple formatted AUX records

Message ID 20210122151829.2890484-2-suzuki.poulose@arm.com
State New, archived
Series
  • perf: Handle multiple formatted AUX records

Commit Message

Suzuki K Poulose Jan. 22, 2021, 3:18 p.m. UTC
CoreSight PMU supports aux-buffer for the ETM tracing. The trace
generated by the ETM (associated with individual CPUs, like Intel PT)
is captured by a separate IP (CoreSight TMC-ETR/ETF until now).

The TMC-ETR applies formatting of the raw ETM trace data, as it
can collect traces from multiple ETMs, with the TraceID to indicate
the source of a given trace packet.

The Arm Trace Buffer Extension (TRBE) is a new "sink" IP, attached to
individual CPUs, and thus does not apply additional formatting the
way TMC-ETR does.

Additionally, a system could have both TRBE *and* TMC-ETR for
trace collection. For example, TMC-ETR could be used as a single
trace buffer to collect data from multiple ETMs to correlate
the traces from different CPUs. It is possible to have a
perf session where some events end up collecting the trace
in TMC-ETR while others collect it in TRBE. Thus we need a way
to identify the type of the trace for each AUX record.

This patch adds a new flag to indicate the trace format
of a given record. It also includes the changes that
demonstrate how this can be used in the CoreSight PMU
to solve the problem.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 2 ++
 include/linux/coresight.h                        | 1 +
 include/uapi/linux/perf_event.h                  | 1 +
 3 files changed, 4 insertions(+)

Comments

Peter Zijlstra Jan. 25, 2021, 10:25 a.m. UTC | #1
On Fri, Jan 22, 2021 at 03:18:29PM +0000, Suzuki K Poulose wrote:
> CoreSight PMU supports aux-buffer for the ETM tracing. The trace
> generated by the ETM (associated with individual CPUs, like Intel PT)
> is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
> 
> The TMC-ETR applies formatting of the raw ETM trace data, as it
> can collect traces from multiple ETMs, with the TraceID to indicate
> the source of a given trace packet.
> 
> Arm Trace Buffer Extension is new "sink" IP, attached to individual
> CPUs and thus do not provide additional formatting, like TMC-ETR.
> 
> Additionally, a system could have both TRBE *and* TMC-ETR for
> the trace collection. e.g, TMC-ETR could be used as a single
> trace buffer to collect data from multiple ETMs to correlate
> the traces from different CPUs. It is possible to have a
> perf session where some events end up collecting the trace
> in TMC-ETR while the others in TRBE. Thus we need a way
> to identify the type of the trace for each AUX record.
> 
> This patch adds a new flag to indicate the trace format
> for the given record. Also, includes the changes that
> demonstrates how this can be used in the CoreSight PMU
> to solve the problem.
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---

> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index b15e3447cd9f..ea7dcc7b30f0 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -1109,6 +1109,7 @@ enum perf_callchain_context {
>  #define PERF_AUX_FLAG_OVERWRITE		0x02	/* snapshot from overwrite mode */
>  #define PERF_AUX_FLAG_PARTIAL		0x04	/* record contains gaps */
>  #define PERF_AUX_FLAG_COLLISION		0x08	/* sample collided with another */
> +#define PERF_AUX_FLAG_ALT_FMT		0x10	/* this record is in alternate trace format */

Since we have a whole u64, do we want to reserve a whole nibble (or
maybe even a byte) for a format type? Because with a single bit like
this, we'll kick ourselves when we end up with the need for a 3rd format
type.

Suzuki K Poulose Jan. 25, 2021, 10:45 a.m. UTC | #2
Hi Peter

On 1/25/21 10:25 AM, Peter Zijlstra wrote:
> On Fri, Jan 22, 2021 at 03:18:29PM +0000, Suzuki K Poulose wrote:
>> CoreSight PMU supports aux-buffer for the ETM tracing. The trace
>> generated by the ETM (associated with individual CPUs, like Intel PT)
>> is captured by a separate IP (CoreSight TMC-ETR/ETF until now).
>>
>> The TMC-ETR applies formatting of the raw ETM trace data, as it
>> can collect traces from multiple ETMs, with the TraceID to indicate
>> the source of a given trace packet.
>>
>> Arm Trace Buffer Extension is new "sink" IP, attached to individual
>> CPUs and thus do not provide additional formatting, like TMC-ETR.
>>
>> Additionally, a system could have both TRBE *and* TMC-ETR for
>> the trace collection. e.g, TMC-ETR could be used as a single
>> trace buffer to collect data from multiple ETMs to correlate
>> the traces from different CPUs. It is possible to have a
>> perf session where some events end up collecting the trace
>> in TMC-ETR while the others in TRBE. Thus we need a way
>> to identify the type of the trace for each AUX record.
>>
>> This patch adds a new flag to indicate the trace format
>> for the given record. Also, includes the changes that
>> demonstrates how this can be used in the CoreSight PMU
>> to solve the problem.
>>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
> 
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index b15e3447cd9f..ea7dcc7b30f0 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -1109,6 +1109,7 @@ enum perf_callchain_context {
>>   #define PERF_AUX_FLAG_OVERWRITE		0x02	/* snapshot from overwrite mode */
>>   #define PERF_AUX_FLAG_PARTIAL		0x04	/* record contains gaps */
>>   #define PERF_AUX_FLAG_COLLISION		0x08	/* sample collided with another */
>> +#define PERF_AUX_FLAG_ALT_FMT		0x10	/* this record is in alternate trace format */
> 
> Since we have a whole u64, do we want to reserve a whole nibble (or
> maybe even a byte) for a format type? Because with a single bit like
> this, we'll kick ourselves when we end up with the need for a 3rd format
> type.
> 

Sure, makes sense. We could do:

#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK	0xff00

Additionally, the values could be allocated by individual PMUs and
interpreted by the corresponding counterpart. That way we don't
have to worry about centralized allocation of the "TYPE" fields.

e.g.:

#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT	0x0000
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW		0x0100

#define PERF_AUX_FLAG_RANDOM_PMU_FORMAT_FMT1		0x0000
#define PERF_AUX_FLAG_RANDOM_PMU_FORMAT_FMT2		0x0100


What do you think?

Cheers
Suzuki

Peter Zijlstra Jan. 25, 2021, 11:01 a.m. UTC | #3
On Mon, Jan 25, 2021 at 10:45:06AM +0000, Suzuki K Poulose wrote:
> On 1/25/21 10:25 AM, Peter Zijlstra wrote:

> > Since we have a whole u64, do we want to reserve a whole nibble (or
> > maybe even a byte) for a format type? Because with a single bit like
> > this, we'll kick ourselves when we end up with the need for a 3rd format
> > type.
> > 
> 
> Sure, makes sense. We could do:
> 
> #define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK	0xff00
> 
> Additionally, the values could be allocated by individual PMUs and
> interpreted by the corresponding counterpart. That way we don't
> have to worry about centralized allocation of the "TYPE" fields.
> 
> e,g:
> 
> #define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT	0x0000
> #define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW		0x0100
> 
> #define PERF_AUX_FLAG_RANDOM_PMU_FORMAT_FMT1		0x0000
> #define PERF_AUX_FLAG_RANDOM_PMU_FORMAT_FMT2		0x0100
> 
> 
> What do you think ?

Sounds good to me.
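
To make the proposal agreed above concrete, here is a minimal sketch of how a consumer might decode a PMU-specific format type from the AUX record flags. The mask and the two CoreSight values are copied from the proposal in this thread and are not part of the posted patch. The helper names aux_flags_format_type() and cs_aux_record_is_raw() are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Values as proposed in this thread; not part of the posted patch. */
#define PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK		0xff00
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT	0x0000
#define PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW		0x0100

/* Existing flag, as shown in the diff context of perf_event.h. */
#define PERF_AUX_FLAG_PARTIAL				0x04

/*
 * Hypothetical helper: isolate the PMU-specific format type byte,
 * ignoring the generic flag bits in the low byte.
 */
static inline uint64_t aux_flags_format_type(uint64_t flags)
{
	return flags & PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK;
}

/*
 * Hypothetical helper: non-zero if a CoreSight AUX record carries
 * raw (unformatted) trace. The type value is only meaningful to the
 * PMU that produced the record, so the caller must already know the
 * record came from the CoreSight PMU.
 */
static inline int cs_aux_record_is_raw(uint64_t flags)
{
	return aux_flags_format_type(flags) ==
	       PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW;
}
```

Because the type occupies its own byte, generic flags such as PERF_AUX_FLAG_PARTIAL can be set on the same record without disturbing the format value, and each PMU is free to assign its own meaning to the 256 possible type values.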

Patch
diff mbox series

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index e776a07b0852..81602bd8da59 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -429,6 +429,8 @@  static void etm_event_stop(struct perf_event *event, int mode)
 
 		size = sink_ops(sink)->update_buffer(sink, handle,
 					      event_data->snk_config);
+		if (!sink->formatted_trace)
+			perf_aux_output_flag(handle, PERF_AUX_FLAG_ALT_FMT);
 		perf_aux_output_end(handle, size);
 	}
 
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index e019182521a1..45c173c391a4 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -241,6 +241,7 @@  struct coresight_device {
 	int nr_links;
 	bool has_conns_grp;
 	bool ect_enabled; /* true only if associated ect device is enabled */
+	bool formatted_trace; /* Trace is CoreSight formatted ? */
 };
 
 /*
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b15e3447cd9f..ea7dcc7b30f0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1109,6 +1109,7 @@  enum perf_callchain_context {
 #define PERF_AUX_FLAG_OVERWRITE		0x02	/* snapshot from overwrite mode */
 #define PERF_AUX_FLAG_PARTIAL		0x04	/* record contains gaps */
 #define PERF_AUX_FLAG_COLLISION		0x08	/* sample collided with another */
+#define PERF_AUX_FLAG_ALT_FMT		0x10	/* this record is in alternate trace format */
 
 #define PERF_FLAG_FD_NO_GROUP		(1UL << 0)
 #define PERF_FLAG_FD_OUTPUT		(1UL << 1)
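
For readers wondering how the single-bit flag in the patch as posted would be consumed, the following is a rough sketch of a per-record check on the decoding side. It assumes the decoder has already parsed the fixed fields of a PERF_RECORD_AUX record (aux_offset, aux_size, flags). The struct aux_record type, the cs_trace_format enum, and cs_trace_format_of() are hypothetical names and are not taken from the perf tool sources.

```c
#include <stdint.h>

/* Value from this RFC patch; shown here for illustration only. */
#define PERF_AUX_FLAG_ALT_FMT	0x10

/* Hypothetical mirror of the fixed fields of a PERF_RECORD_AUX record. */
struct aux_record {
	uint64_t aux_offset;	/* offset of the data in the AUX buffer */
	uint64_t aux_size;	/* size of the data */
	uint64_t flags;		/* PERF_AUX_FLAG_* bits */
};

/* Hypothetical classification of the trace carried by a record. */
enum cs_trace_format {
	CS_TRACE_FORMATTED,	/* CoreSight formatted, e.g. from TMC-ETR */
	CS_TRACE_RAW,		/* unformatted, e.g. from TRBE */
};

/*
 * Pick the trace format for one record. The driver change above sets
 * PERF_AUX_FLAG_ALT_FMT only when the sink does not produce formatted
 * trace, so a clear bit means CoreSight formatted data.
 */
static enum cs_trace_format cs_trace_format_of(const struct aux_record *rec)
{
	return (rec->flags & PERF_AUX_FLAG_ALT_FMT) ?
		CS_TRACE_RAW : CS_TRACE_FORMATTED;
}
```

Checking the flag per record, rather than once per session, is what allows a single perf session to mix events whose trace ends up in TMC-ETR with events whose trace ends up in TRBE.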