All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/2] perf/x86: Add ability to sample TSC
@ 2015-02-20 12:44 Adrian Hunter
  2015-02-20 12:44 ` [PATCH V2 1/2] perf: Sample additional clock value Adrian Hunter
  2015-02-20 12:44 ` [PATCH V2 2/2] perf/x86: Provide TSC for PERF_SAMPLE_CLOCK_PT Adrian Hunter
  0 siblings, 2 replies; 5+ messages in thread
From: Adrian Hunter @ 2015-02-20 12:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, John Stultz, Thomas Gleixner, Pawel Moll,
	Steven Rostedt, Andi Kleen, Mathieu Poirier

Hi

TSC is needed to synchronize Intel Processor Trace (Intel PT)
with perf event samples. Refer to patch 1 for more details.

There is a description of Intel PT in the Intel Architecture
manuals:

http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html 

With the advent of switching perf_clock to CLOCK_MONOTONIC,
it will not be possible to convert perf_clock directly to/from
TSC. So add the ability to sample TSC instead.

Changes in V2:
    perf: Sample additional clock value
        Rename "Architecture specific clock" to
        "Processor trace clock" 
        Rename PERF_SAMPLE_CLOCK_ARCH -> PERF_SAMPLE_CLOCK_PT etc
        Expand commit message

    perf/x86: Provide TSC for PERF_SAMPLE_CLOCK_PT
        Rename PERF_SAMPLE_CLOCK_ARCH -> PERF_SAMPLE_CLOCK_PT etc
        Expand commit message


Adrian Hunter (2):
      perf: Sample additional clock value
      perf/x86: Provide TSC for PERF_SAMPLE_CLOCK_PT

 arch/x86/include/asm/perf_event.h |  6 ++++++
 arch/x86/kernel/cpu/perf_event.c  | 10 ++++++++++
 include/linux/perf_event.h        |  3 ++-
 include/uapi/linux/perf_event.h   | 19 +++++++++++++++++--
 kernel/events/core.c              | 30 ++++++++++++++++++++++++++++++
 kernel/events/internal.h          |  4 ++++
 6 files changed, 69 insertions(+), 3 deletions(-)


Regards
Adrian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH V2 1/2] perf: Sample additional clock value
  2015-02-20 12:44 [PATCH V2 0/2] perf/x86: Add ability to sample TSC Adrian Hunter
@ 2015-02-20 12:44 ` Adrian Hunter
  2015-02-20 17:06   ` David Ahern
  2015-02-20 12:44 ` [PATCH V2 2/2] perf/x86: Provide TSC for PERF_SAMPLE_CLOCK_PT Adrian Hunter
  1 sibling, 1 reply; 5+ messages in thread
From: Adrian Hunter @ 2015-02-20 12:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, John Stultz, Thomas Gleixner, Pawel Moll,
	Steven Rostedt, Andi Kleen, Mathieu Poirier

This is needed to allow perf event samples to be
synchronized with data from other sources, and
in particular, sources like Intel Processor Trace
(Intel PT) where the hardware produces a trace
with hardware defined timestamps (i.e. TSC).

For example, to decode an Intel PT trace, the decoder
must walk the object code. To determine what object
code is running, the decoder must track events like
sched_switch and MMAP and match them against the trace
data using the timestamps.

Note that it is not the accuracy of the time sources
that is at issue but instead the ability to correctly
order events.

On modern machines, perf_clock is currently directly
related to TSC, however that is to change when
perf_clock becomes CLOCK_MONOTONIC.

Consequently add PERF_SAMPLE_CLOCK to sample some
other clock. The patch allows for 16 possible clock
selections with the only initial possibility a
processor trace clock which will be TSC on x86.

Although there are only 16 possible clock selections,
it is envisioned that POSIX clock ids would be a
single selection, with the actual clock id provided
in another perf_event_attr member.

Based-on-patch-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 include/linux/perf_event.h      |  3 ++-
 include/uapi/linux/perf_event.h | 19 +++++++++++++++++--
 kernel/events/core.c            | 30 ++++++++++++++++++++++++++++++
 kernel/events/internal.h        |  4 ++++
 4 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index efe2d2d..9385140 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -655,7 +655,7 @@ extern void perf_pmu_migrate_context(struct pmu *pmu,
 				int src_cpu, int dst_cpu);
 extern u64 perf_event_read_value(struct perf_event *event,
 				 u64 *enabled, u64 *running);
-
+u64 perf_sample_clock_pt(void);
 
 struct perf_sample_data {
 	/*
@@ -687,6 +687,7 @@ struct perf_sample_data {
 		u32	cpu;
 		u32	reserved;
 	}				cpu_entry;
+	u64				clock;
 	struct perf_callchain_entry	*callchain;
 
 	/*
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index be9ff06..2fccfc0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -139,8 +139,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
 	PERF_SAMPLE_REGS_INTR			= 1U << 18,
+	PERF_SAMPLE_CLOCK			= 1U << 19,
 
-	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 20,		/* non-ABI */
 };
 
 /*
@@ -228,6 +229,16 @@ enum {
 };
 
 /*
+ * Values to determine clock to sample.
+ */
+enum perf_sample_clock_type {
+	/* Processor trace clock (TSC on x86) */
+	PERF_SAMPLE_CLOCK_PT		= 0,
+
+	PERF_SAMPLE_CLOCK_MAX		/* non-ABI */
+};
+
+/*
  * The format of the data returned by read() on a perf event fd,
  * as specified by attr.read_format:
  *
@@ -328,7 +339,9 @@ struct perf_event_attr {
 				exclude_callchain_user   : 1, /* exclude user callchains */
 				mmap2          :  1, /* include mmap with inode data     */
 				comm_exec      :  1, /* flag comm events that are due to an exec */
-				__reserved_1   : 39;
+				/* clock: see enum perf_sample_clock_type */
+				clock          :  4, /* which clock */
+				__reserved_1   : 35;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -601,6 +614,7 @@ enum perf_event_type {
 	 * 	{ u64			id;       } && PERF_SAMPLE_ID
 	 * 	{ u64			stream_id;} && PERF_SAMPLE_STREAM_ID
 	 * 	{ u32			cpu, res; } && PERF_SAMPLE_CPU
+	 *	{ u64			clock;    } && PERF_SAMPLE_CLOCK
 	 *	{ u64			id;	  } && PERF_SAMPLE_IDENTIFIER
 	 * } && perf_event_attr::sample_id_all
 	 *
@@ -746,6 +760,7 @@ enum perf_event_type {
 	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
 	 *	{ u64			abi; # enum perf_sample_regs_abi
 	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
+	 *	{ u64			clock;    } && PERF_SAMPLE_CLOCK
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 799f034..dc39915 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1323,6 +1323,9 @@ static void perf_event__id_header_size(struct perf_event *event)
 	if (sample_type & PERF_SAMPLE_CPU)
 		size += sizeof(data->cpu_entry);
 
+	if (sample_type & PERF_SAMPLE_CLOCK)
+		size += sizeof(data->clock);
+
 	event->id_header_size = size;
 }
 
@@ -4915,6 +4918,11 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
 	}
 }
 
+u64 __weak perf_sample_clock_pt(void)
+{
+	return 0;
+}
+
 static void __perf_event_header__init_id(struct perf_event_header *header,
 					 struct perf_sample_data *data,
 					 struct perf_event *event)
@@ -4943,6 +4951,16 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
 		data->cpu_entry.cpu	 = raw_smp_processor_id();
 		data->cpu_entry.reserved = 0;
 	}
+
+	if (sample_type & PERF_SAMPLE_CLOCK) {
+		switch (event->attr.clock) {
+		case PERF_SAMPLE_CLOCK_PT:
+			data->clock = perf_sample_clock_pt();
+			break;
+		default:
+			data->clock = 0;
+		}
+	}
 }
 
 void perf_event_header__init_id(struct perf_event_header *header,
@@ -4973,6 +4991,9 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,
 	if (sample_type & PERF_SAMPLE_CPU)
 		perf_output_put(handle, data->cpu_entry);
 
+	if (sample_type & PERF_SAMPLE_CLOCK)
+		perf_output_put(handle, data->clock);
+
 	if (sample_type & PERF_SAMPLE_IDENTIFIER)
 		perf_output_put(handle, data->id);
 }
@@ -5218,6 +5239,9 @@ void perf_output_sample(struct perf_output_handle *handle,
 		}
 	}
 
+	if (sample_type & PERF_SAMPLE_CLOCK)
+		perf_output_put(handle, data->clock);
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
@@ -7632,6 +7656,12 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 
 	if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
 		ret = perf_reg_validate(attr->sample_regs_intr);
+
+	if ((attr->sample_type & PERF_SAMPLE_CLOCK) &&
+	    (attr->clock >= PERF_SAMPLE_CLOCK_MAX ||
+	     (!HAVE_PERF_SAMPLE_CLOCK_PT &&
+	      attr->clock == PERF_SAMPLE_CLOCK_PT)))
+		return -EINVAL;
 out:
 	return ret;
 
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 9f6ce9b..418142f 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -228,4 +228,8 @@ static inline bool arch_perf_have_user_stack_dump(void)
 #define perf_user_stack_pointer(regs) 0
 #endif /* CONFIG_HAVE_PERF_USER_STACK_DUMP */
 
+#ifndef HAVE_PERF_SAMPLE_CLOCK_PT
+#define HAVE_PERF_SAMPLE_CLOCK_PT 0
+#endif
+
 #endif /* _KERNEL_EVENTS_INTERNAL_H */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH V2 2/2] perf/x86: Provide TSC for PERF_SAMPLE_CLOCK_PT
  2015-02-20 12:44 [PATCH V2 0/2] perf/x86: Add ability to sample TSC Adrian Hunter
  2015-02-20 12:44 ` [PATCH V2 1/2] perf: Sample additional clock value Adrian Hunter
@ 2015-02-20 12:44 ` Adrian Hunter
  1 sibling, 0 replies; 5+ messages in thread
From: Adrian Hunter @ 2015-02-20 12:44 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, David Ahern,
	Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Paul Mackerras,
	Stephane Eranian, John Stultz, Thomas Gleixner, Pawel Moll,
	Steven Rostedt, Andi Kleen, Mathieu Poirier

Provide TSC for PERF_SAMPLE_CLOCK_PT. This is needed
to match perf events against hardware traces like
Intel Processor Trace (Intel PT) which is the
purpose for which PERF_SAMPLE_CLOCK_PT was invented.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/include/asm/perf_event.h |  6 ++++++
 arch/x86/kernel/cpu/perf_event.c  | 10 ++++++++++
 2 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index dc0f6ed..a022f53 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -261,6 +261,12 @@ struct perf_guest_switch_msr {
 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
 extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
 extern void perf_check_microcode(void);
+
+#ifdef CONFIG_X86_TSC
+#define HAVE_PERF_SAMPLE_CLOCK_PT 1
+u64 perf_sample_clock_pt(void);
+#endif
+
 #else
 static inline struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
 {
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 8271d6b..dc10084 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2256,3 +2256,13 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 	cap->events_mask_len	= x86_pmu.events_mask_len;
 }
 EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
+
+#ifdef CONFIG_X86_TSC
+u64 perf_sample_clock_pt(void)
+{
+	u64 tsc;
+
+	rdtscll(tsc);
+	return tsc;
+}
+#endif
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH V2 1/2] perf: Sample additional clock value
  2015-02-20 12:44 ` [PATCH V2 1/2] perf: Sample additional clock value Adrian Hunter
@ 2015-02-20 17:06   ` David Ahern
  2015-02-23  7:29     ` Adrian Hunter
  0 siblings, 1 reply; 5+ messages in thread
From: David Ahern @ 2015-02-20 17:06 UTC (permalink / raw)
  To: Adrian Hunter, Peter Zijlstra, Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, linux-kernel, Frederic Weisbecker,
	Jiri Olsa, Namhyung Kim, Paul Mackerras, Stephane Eranian,
	John Stultz, Thomas Gleixner, Pawel Moll, Steven Rostedt,
	Andi Kleen, Mathieu Poirier

On 2/20/15 5:44 AM, Adrian Hunter wrote:
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index efe2d2d..9385140 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -655,7 +655,7 @@ extern void perf_pmu_migrate_context(struct pmu *pmu,
>   				int src_cpu, int dst_cpu);
>   extern u64 perf_event_read_value(struct perf_event *event,
>   				 u64 *enabled, u64 *running);
> -
> +u64 perf_sample_clock_pt(void);

Core functions should not be arch specific. PT == x86.


> @@ -4915,6 +4918,11 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
>   	}
>   }
>
> +u64 __weak perf_sample_clock_pt(void)
> +{
> +	return 0;
> +}
> +
>   static void __perf_event_header__init_id(struct perf_event_header *header,
>   					 struct perf_sample_data *data,
>   					 struct perf_event *event)
> @@ -4943,6 +4951,16 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
>   		data->cpu_entry.cpu	 = raw_smp_processor_id();
>   		data->cpu_entry.reserved = 0;
>   	}
> +
> +	if (sample_type & PERF_SAMPLE_CLOCK) {
> +		switch (event->attr.clock) {
> +		case PERF_SAMPLE_CLOCK_PT:
> +			data->clock = perf_sample_clock_pt();
> +			break;
> +		default:
> +			data->clock = 0;
> +		}
> +	}
>   }
>

Ditto here. This should be transparent for arch specific clocks.

David


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH V2 1/2] perf: Sample additional clock value
  2015-02-20 17:06   ` David Ahern
@ 2015-02-23  7:29     ` Adrian Hunter
  0 siblings, 0 replies; 5+ messages in thread
From: Adrian Hunter @ 2015-02-23  7:29 UTC (permalink / raw)
  To: David Ahern
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	linux-kernel, Frederic Weisbecker, Jiri Olsa, Namhyung Kim,
	Paul Mackerras, Stephane Eranian, John Stultz, Thomas Gleixner,
	Pawel Moll, Steven Rostedt, Andi Kleen, Mathieu Poirier

On 20/02/15 19:06, David Ahern wrote:
> On 2/20/15 5:44 AM, Adrian Hunter wrote:
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index efe2d2d..9385140 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -655,7 +655,7 @@ extern void perf_pmu_migrate_context(struct pmu *pmu,
>>                   int src_cpu, int dst_cpu);
>>   extern u64 perf_event_read_value(struct perf_event *event,
>>                    u64 *enabled, u64 *running);
>> -
>> +u64 perf_sample_clock_pt(void);
> 
> Core functions should not be arch specific. PT == x86.

Actually it was one of the ARM guys that asked for it to be called
"Processor Trace". It was "arch" in V1.

	http://marc.info/?l=linux-kernel&m=142436891806015

But it has been superseded completely by patches from Peter, so it
is not going further anyway.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-02-23  7:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-20 12:44 [PATCH V2 0/2] perf/x86: Add ability to sample TSC Adrian Hunter
2015-02-20 12:44 ` [PATCH V2 1/2] perf: Sample additional clock value Adrian Hunter
2015-02-20 17:06   ` David Ahern
2015-02-23  7:29     ` Adrian Hunter
2015-02-20 12:44 ` [PATCH V2 2/2] perf/x86: Provide TSC for PERF_SAMPLE_CLOCK_PT Adrian Hunter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.