* [PATCH v7 0/6] perf: add ability to sample interrupted machine state
@ 2014-09-24 11:48 Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 1/6] perf: add ability to sample machine state on interrupt Stephane Eranian
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

This short patch series adds the ability to sample the interrupted
machine state for each hardware sample. This is useful to analyze
the state after certain events, for instance for function value
profiling after a call instruction.

The patch extends the interface with a new PERF_SAMPLE_REGS_INTR
sample_type flag. The registers to sample can be named in the
sample_regs_intr bitmask for each event. The name and bit
position of each register are architecture dependent and
provided, just like for PERF_SAMPLE_REGS_USER, by asm/perf_regs.h.

The support is similar to PERF_SAMPLE_REGS_USER.
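
For illustration only, a minimal sketch of how a tool might request
the new sample format. The struct is a hypothetical stand-in for the
two perf_event_attr fields involved, not the real uapi definition; the
flag value mirrors the one added in patch 1/6, and the sketch_* names
are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value taken from patch 1/6; defined locally to keep the sketch standalone. */
#define SKETCH_PERF_SAMPLE_REGS_INTR (1U << 18)

/* Hypothetical stand-in for the relevant perf_event_attr fields. */
struct sketch_attr {
	uint64_t sample_type;
	uint64_t sample_regs_intr;
};

/*
 * Request interrupted-state registers for an event: set the
 * sample_type flag and name the registers in the per-event bitmask.
 */
static void sketch_request_intr_regs(struct sketch_attr *attr, uint64_t mask)
{
	assert(attr != NULL);
	attr->sample_type |= SKETCH_PERF_SAMPLE_REGS_INTR;
	attr->sample_regs_intr = mask;
}
```

The bit positions in the mask would come from asm/perf_regs.h for the
target architecture; any mask value used with this sketch is arbitrary.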

On Intel x86, the series includes support for capturing the
PEBS state as well. When precise sampling is used, the interrupted
state is collected from the PEBS records, at least partially.
The PEBS machine state is a subset of the machine state.

The series provides access to this new feature in perf record
with the -I option. It is possible to display the sampled
register values using perf report -D.

This patch series is the foundation for a future series adding
function value profiling.

In V2, we address the issues raised during reviews:
 - add sample parsing test
 - shorten perf record option to --intr-regs
 - added man page for perf record -I/--intr-regs option
 - refactor register printf code between user and intr regs
 - rebase to v3.16-rc3

In V3, we rebased to 3.16.0+ and made the modifications suggested
by PeterZ. We also integrated his patch to improve the layout
of perf_sample_data.

In V4, we rebase to 3.17-rc3 and we fix the ABI change issue 
reported by Namhyung Kim.

In V5, the patch is rebased to 3.17-rc4 and the bugs reported
on LKML about the PEBS machine state copying code have been fixed
(a few registers were missing).

In V6, we added the missing copy of the eflags from the PEBS
machine state (reported by Andi). We fixed some formatting
issues. We rebased to 3.17-rc6.

In V7, we fix the eflags compilation error and add
the #ifndef CONFIG_X86_32 guard to enable 32-bit compilation.
Still relative to 3.17-rc6.

Peter Zijlstra (1):
  perf: improve perf_sample_data struct layout

Stephane Eranian (5):
  perf: add ability to sample machine state on interrupt
  perf/x86: add support for sampling PEBS machine state registers
  perf tools: add core support for sampling intr machine state regs
  perf/tests: add interrupted state sample parsing test
  perf record: add new -I option to sample interrupted machine state

 arch/x86/kernel/cpu/perf_event_intel_ds.c |   23 ++++++++++++
 include/linux/perf_event.h                |   37 ++++++++++---------
 include/uapi/linux/perf_event.h           |   15 +++++++-
 kernel/events/core.c                      |   51 ++++++++++++++++++++++++--
 tools/perf/Documentation/perf-record.txt  |    6 ++++
 tools/perf/builtin-record.c               |    2 ++
 tools/perf/perf.h                         |    1 +
 tools/perf/tests/sample-parsing.c         |   55 +++++++++++++++++++++--------
 tools/perf/util/event.h                   |    1 +
 tools/perf/util/evsel.c                   |   46 +++++++++++++++++++++++-
 tools/perf/util/header.c                  |    1 +
 tools/perf/util/session.c                 |   44 ++++++++++++++++++++---
 12 files changed, 240 insertions(+), 42 deletions(-)

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v7 1/6] perf: add ability to sample machine state on interrupt
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
@ 2014-09-24 11:48 ` Stephane Eranian
  2014-11-16 12:35   ` [tip:perf/core] perf: Add " tip-bot for Stephane Eranian
  2014-11-21 21:26   ` [PATCH v7 1/6] perf: add " Arnaldo Carvalho de Melo
  2014-09-24 11:48 ` [PATCH v7 2/6] perf/x86: add support for sampling PEBS machine state registers Stephane Eranian
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

Enable capture of interrupted machine state for each
sample.

Registers to sample are passed per event in the
sample_regs_intr bitmask.

To sample interrupt machine state, the
PERF_SAMPLE_REGS_INTR flag must be passed in
sample_type.

The list of available registers is arch
dependent and provided by asm/perf_regs.h.

Registers are laid out as u64 in the bit
order of sample_regs_intr.
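
As a rough sketch of that layout (an abi word, followed by one u64 per
set bit of the mask in ascending bit order), a consumer could size and
index the register dump as below. The sketch_* helpers are hypothetical
names for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in mask (portable popcount). */
static unsigned int sketch_hweight64(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Bytes used by the REGS_INTR part of a sample: abi word, then regs if any. */
static size_t sketch_regs_intr_size(uint64_t abi, uint64_t mask)
{
	size_t size = sizeof(uint64_t);

	if (abi)
		size += sketch_hweight64(mask) * sizeof(uint64_t);
	return size;
}

/*
 * Look up register 'bit' in the dumped regs[] array. Values are stored
 * in ascending bit order, so the index is the number of set mask bits
 * below 'bit'. Returns 0 on success, -1 if the register was not sampled.
 */
static int sketch_reg_lookup(const uint64_t *regs, uint64_t mask,
			     unsigned int bit, uint64_t *val)
{
	assert(regs != NULL && val != NULL);
	if (bit >= 64 || !(mask & (1ULL << bit)))
		return -1;
	*val = regs[sketch_hweight64(mask & ((1ULL << bit) - 1))];
	return 0;
}
```

With a mask of 0x29 (bits 0, 3 and 5), for instance, the dump would
hold three values and register bit 3 would sit at index 1.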

This patch also adds a new ABI version
PERF_ATTR_SIZE_VER4 because we extend
the perf_event_attr struct with a new u64
field.
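
The size arithmetic behind the new version can be sketched as follows.
The constants mirror the values in the uapi header touched by this
patch; the helper name is hypothetical and only illustrates how a
size-based check could tell whether an attr carries the new field.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes mirror include/uapi/linux/perf_event.h as of this patch. */
#define SKETCH_ATTR_SIZE_VER3 96u
#define SKETCH_ATTR_SIZE_VER4 (SKETCH_ATTR_SIZE_VER3 + (uint32_t)sizeof(uint64_t))

/* VER4 is VER3 plus the one new u64 field, sample_regs_intr. */
static_assert(SKETCH_ATTR_SIZE_VER4 == 104, "VER4 adds one u64 to VER3");

/* True if an attr of 'size' bytes is large enough to hold sample_regs_intr. */
static int sketch_attr_has_regs_intr(uint32_t size)
{
	return size >= SKETCH_ATTR_SIZE_VER4;
}
```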

Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
---
 include/linux/perf_event.h      |    7 ++++--
 include/uapi/linux/perf_event.h |   15 ++++++++++++-
 kernel/events/core.c            |   46 +++++++++++++++++++++++++++++++++++++--
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 893a0d0..68d46d5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -79,7 +79,7 @@ struct perf_branch_stack {
 	struct perf_branch_entry	entries[0];
 };
 
-struct perf_regs_user {
+struct perf_regs {
 	__u64		abi;
 	struct pt_regs	*regs;
 };
@@ -600,7 +600,8 @@ struct perf_sample_data {
 	struct perf_callchain_entry	*callchain;
 	struct perf_raw_record		*raw;
 	struct perf_branch_stack	*br_stack;
-	struct perf_regs_user		regs_user;
+	struct perf_regs		regs_user;
+	struct perf_regs		regs_intr;
 	u64				stack_user_size;
 	u64				weight;
 	/*
@@ -630,6 +631,8 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->weight = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
+	data->regs_intr.abi = PERF_SAMPLE_REGS_ABI_NONE;
+	data->regs_intr.regs = NULL;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 9269de2..48d4a01 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -137,8 +137,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_DATA_SRC			= 1U << 15,
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
+	PERF_SAMPLE_REGS_INTR			= 1U << 18,
 
-	PERF_SAMPLE_MAX = 1U << 18,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
 };
 
 /*
@@ -238,6 +239,7 @@ enum perf_event_read_format {
 #define PERF_ATTR_SIZE_VER2	80	/* add: branch_sample_type */
 #define PERF_ATTR_SIZE_VER3	96	/* add: sample_regs_user */
 					/* add: sample_stack_user */
+#define PERF_ATTR_SIZE_VER4	104	/* add: sample_regs_intr */
 
 /*
  * Hardware event_id to monitor via a performance monitoring event:
@@ -334,6 +336,15 @@ struct perf_event_attr {
 
 	/* Align to u64. */
 	__u32	__reserved_2;
+	/*
+	 * Defines set of regs to dump for each sample
+	 * state captured on:
+	 *  - precise = 0: PMU interrupt
+	 *  - precise > 0: sampled instruction
+	 *
+	 * See asm/perf_regs.h for details.
+	 */
+	__u64	sample_regs_intr;
 };
 
 #define perf_flags(attr)	(*(&(attr)->read_format + 1))
@@ -686,6 +697,8 @@ enum perf_event_type {
 	 *	{ u64			weight;   } && PERF_SAMPLE_WEIGHT
 	 *	{ u64			data_src; } && PERF_SAMPLE_DATA_SRC
 	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
+	 *	{ u64			abi; # enum perf_sample_regs_abi
+	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index eaa636e..7941343 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4430,7 +4430,7 @@ perf_output_sample_regs(struct perf_output_handle *handle,
 	}
 }
 
-static void perf_sample_regs_user(struct perf_regs_user *regs_user,
+static void perf_sample_regs_user(struct perf_regs *regs_user,
 				  struct pt_regs *regs)
 {
 	if (!user_mode(regs)) {
@@ -4446,6 +4446,14 @@ static void perf_sample_regs_user(struct perf_regs_user *regs_user,
 	}
 }
 
+static void perf_sample_regs_intr(struct perf_regs *regs_intr,
+				  struct pt_regs *regs)
+{
+	regs_intr->regs = regs;
+	regs_intr->abi  = perf_reg_abi(current);
+}
+
+
 /*
  * Get remaining task size from user stack pointer.
  *
@@ -4827,6 +4835,23 @@ void perf_output_sample(struct perf_output_handle *handle,
 	if (sample_type & PERF_SAMPLE_TRANSACTION)
 		perf_output_put(handle, data->txn);
 
+	if (sample_type & PERF_SAMPLE_REGS_INTR) {
+		u64 abi = data->regs_intr.abi;
+		/*
+		 * If there are no regs to dump, notice it through
+		 * first u64 being zero (PERF_SAMPLE_REGS_ABI_NONE).
+		 */
+		perf_output_put(handle, abi);
+
+		if (abi) {
+			u64 mask = event->attr.sample_regs_intr;
+
+			perf_output_sample_regs(handle,
+						data->regs_intr.regs,
+						mask);
+		}
+	}
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
@@ -4913,7 +4938,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 		 * in case new sample type is added, because we could eat
 		 * up the rest of the sample size.
 		 */
-		struct perf_regs_user *uregs = &data->regs_user;
+		struct perf_regs *uregs = &data->regs_user;
 		u16 stack_size = event->attr.sample_stack_user;
 		u16 size = sizeof(u64);
 
@@ -4934,6 +4959,21 @@ void perf_prepare_sample(struct perf_event_header *header,
 		data->stack_user_size = stack_size;
 		header->size += size;
 	}
+
+	if (sample_type & PERF_SAMPLE_REGS_INTR) {
+		/* regs dump ABI info */
+		int size = sizeof(u64);
+
+		perf_sample_regs_intr(&data->regs_intr, regs);
+
+		if (data->regs_intr.regs) {
+			u64 mask = event->attr.sample_regs_intr;
+
+			size += hweight64(mask) * sizeof(u64);
+		}
+
+		header->size += size;
+	}
 }
 
 static void perf_event_output(struct perf_event *event,
@@ -7134,6 +7174,8 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			ret = -EINVAL;
 	}
 
+	if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
+		ret = perf_reg_validate(attr->sample_regs_intr);
 out:
 	return ret;
 
-- 
1.7.9.5



* [PATCH v7 2/6] perf/x86: add support for sampling PEBS machine state registers
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 1/6] perf: add ability to sample machine state on interrupt Stephane Eranian
@ 2014-09-24 11:48 ` Stephane Eranian
  2014-11-16 12:35   ` [tip:perf/core] perf/x86: Add " tip-bot for Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 3/6] perf tools: add core support for sampling intr machine state regs Stephane Eranian
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

PEBS can capture machine state regs at retirement of the sampled
instructions. When precise sampling is enabled on an event, PEBS
is used, so substitute the interrupted state with the PEBS state.
Note that not all registers are captured by PEBS. Those missing
are replaced by the interrupt state counterparts.
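
The substitution described above can be pictured with the small sketch
below. The two structs are hypothetical, trimmed-down stand-ins for
pt_regs and the PEBS record, and 'cs' is only assumed here as an
example of a register that the PEBS record does not carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, abbreviated register set (stand-in for pt_regs). */
struct sketch_regs {
	uint64_t ax, bx, sp, flags;
	uint64_t cs;	/* assumed absent from the PEBS record */
};

/* Hypothetical, abbreviated PEBS record. */
struct sketch_pebs {
	uint64_t ax, bx, sp, flags;
};

/*
 * Build the sampled state: start from the interrupt-time registers,
 * then overwrite every register that the PEBS record captured.
 */
static struct sketch_regs sketch_merge(const struct sketch_regs *iregs,
				       const struct sketch_pebs *pebs)
{
	struct sketch_regs regs;

	assert(iregs != NULL && pebs != NULL);
	regs = *iregs;		/* fallback for registers PEBS lacks */
	regs.ax = pebs->ax;
	regs.bx = pebs->bx;
	regs.sp = pebs->sp;
	regs.flags = pebs->flags;
	return regs;
}
```

Copying the interrupt state first means a register missing from the
PEBS record never ends up uninitialized in the sample.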

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index b1553d0..ce439f7 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -886,6 +886,29 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
 	regs.bp = pebs->bp;
 	regs.sp = pebs->sp;
 
+	if (sample_type & PERF_SAMPLE_REGS_INTR) {
+		regs.ax = pebs->ax;
+		regs.bx = pebs->bx;
+		regs.cx = pebs->cx;
+		regs.dx = pebs->dx;
+		regs.si = pebs->si;
+		regs.di = pebs->di;
+		regs.bp = pebs->bp;
+		regs.sp = pebs->sp;
+
+		regs.flags = pebs->flags;
+#ifndef CONFIG_X86_32
+		regs.r8 = pebs->r8;
+		regs.r9 = pebs->r9;
+		regs.r10 = pebs->r10;
+		regs.r11 = pebs->r11;
+		regs.r12 = pebs->r12;
+		regs.r13 = pebs->r13;
+		regs.r14 = pebs->r14;
+		regs.r15 = pebs->r15;
+#endif
+	}
+
 	if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
 		regs.ip = pebs->real_ip;
 		regs.flags |= PERF_EFLAGS_EXACT;
-- 
1.7.9.5



* [PATCH v7 3/6] perf tools: add core support for sampling intr machine state regs
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 1/6] perf: add ability to sample machine state on interrupt Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 2/6] perf/x86: add support for sampling PEBS machine state registers Stephane Eranian
@ 2014-09-24 11:48 ` Stephane Eranian
  2014-11-16 12:36   ` [tip:perf/core] perf tools: Add " tip-bot for Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 4/6] perf/tests: add interrupted state sample parsing test Stephane Eranian
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

Add the infrastructure to set up, collect and report the interrupt
machine state regs which can be captured by the kernel.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/perf.h         |    1 +
 tools/perf/util/event.h   |    1 +
 tools/perf/util/evsel.c   |   46 ++++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/header.c  |    1 +
 tools/perf/util/session.c |   44 ++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 87 insertions(+), 6 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 510c65f..309d956 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -54,6 +54,7 @@ struct record_opts {
 	bool	     sample_weight;
 	bool	     sample_time;
 	bool	     period;
+	bool	     sample_intr_regs;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int user_freq;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 7eb7107..d6e79f3 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -162,6 +162,7 @@ struct perf_sample {
 	struct ip_callchain *callchain;
 	struct branch_stack *branch_stack;
 	struct regs_dump  user_regs;
+	struct regs_dump  intr_regs;
 	struct stack_dump user_stack;
 	struct sample_read read;
 };
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b38de58..5c2e784 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -628,6 +628,11 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 	if (opts->call_graph_enabled && !evsel->no_aux_samples)
 		perf_evsel__config_callgraph(evsel, opts);
 
+	if (opts->sample_intr_regs) {
+		attr->sample_regs_intr = PERF_REGS_MASK;
+		perf_evsel__set_sample_bit(evsel, REGS_INTR);
+	}
+
 	if (target__has_cpu(&opts->target))
 		perf_evsel__set_sample_bit(evsel, CPU);
 
@@ -1005,6 +1010,7 @@ static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
 	ret += PRINT_ATTR_X64(branch_sample_type);
 	ret += PRINT_ATTR_X64(sample_regs_user);
 	ret += PRINT_ATTR_U32(sample_stack_user);
+	ret += PRINT_ATTR_X64(sample_regs_intr);
 
 	ret += fprintf(fp, "%.60s\n", graph_dotted_line);
 
@@ -1504,6 +1510,23 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		array++;
 	}
 
+	data->intr_regs.abi = PERF_SAMPLE_REGS_ABI_NONE;
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		OVERFLOW_CHECK_u64(array);
+		data->intr_regs.abi = *array;
+		array++;
+
+		if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
+			u64 mask = evsel->attr.sample_regs_intr;
+
+			sz = hweight_long(mask) * sizeof(u64);
+			OVERFLOW_CHECK(array, sz, max_size);
+			data->intr_regs.mask = mask;
+			data->intr_regs.regs = (u64 *)array;
+			array = (void *)array + sz;
+		}
+	}
+
 	return 0;
 }
 
@@ -1599,6 +1622,16 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_TRANSACTION)
 		result += sizeof(u64);
 
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		if (sample->intr_regs.abi) {
+			result += sizeof(u64);
+			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			result += sz;
+		} else {
+			result += sizeof(u64);
+		}
+	}
+
 	return result;
 }
 
@@ -1777,6 +1810,17 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 		array++;
 	}
 
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		if (sample->intr_regs.abi) {
+			*array++ = sample->intr_regs.abi;
+			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			memcpy(array, sample->intr_regs.regs, sz);
+			array = (void *)array + sz;
+		} else {
+			*array++ = 0;
+		}
+	}
+
 	return 0;
 }
 
@@ -1906,7 +1950,7 @@ static int sample_type__fprintf(FILE *fp, bool *first, u64 value)
 		bit_name(READ), bit_name(CALLCHAIN), bit_name(ID), bit_name(CPU),
 		bit_name(PERIOD), bit_name(STREAM_ID), bit_name(RAW),
 		bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER),
-		bit_name(IDENTIFIER),
+		bit_name(IDENTIFIER), bit_name(REGS_INTR),
 		{ .name = NULL, }
 	};
 #undef bit_name
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 158c787..62514a7 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2458,6 +2458,7 @@ static const int attr_file_abi_sizes[] = {
 	[1] = PERF_ATTR_SIZE_VER1,
 	[2] = PERF_ATTR_SIZE_VER2,
 	[3] = PERF_ATTR_SIZE_VER3,
+	[4] = PERF_ATTR_SIZE_VER4,
 	0,
 };
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6d2d50d..dc7a8d1 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -581,15 +581,46 @@ static void regs_dump__printf(u64 mask, u64 *regs)
 	}
 }
 
+static const char *regs_abi[] = {
+	[PERF_SAMPLE_REGS_ABI_NONE] = "none",
+	[PERF_SAMPLE_REGS_ABI_32] = "32-bit",
+	[PERF_SAMPLE_REGS_ABI_64] = "64-bit",
+};
+
+static inline const char *regs_dump_abi(struct regs_dump *d)
+{
+	if (d->abi > PERF_SAMPLE_REGS_ABI_64)
+		return "unknown";
+
+	return regs_abi[d->abi];
+}
+
+static void regs__printf(const char *type, struct regs_dump *regs)
+{
+	u64 mask = regs->mask;
+
+	printf("... %s regs: mask 0x%" PRIx64 " ABI %s\n",
+	       type,
+	       mask,
+	       regs_dump_abi(regs));
+
+	regs_dump__printf(mask, regs->regs);
+}
+
 static void regs_user__printf(struct perf_sample *sample)
 {
 	struct regs_dump *user_regs = &sample->user_regs;
 
-	if (user_regs->regs) {
-		u64 mask = user_regs->mask;
-		printf("... user regs: mask 0x%" PRIx64 "\n", mask);
-		regs_dump__printf(mask, user_regs->regs);
-	}
+	if (user_regs->regs)
+		regs__printf("user", user_regs);
+}
+
+static void regs_intr__printf(struct perf_sample *sample)
+{
+	struct regs_dump *intr_regs = &sample->intr_regs;
+
+	if (intr_regs->regs)
+		regs__printf("intr", intr_regs);
 }
 
 static void stack_user__printf(struct stack_dump *dump)
@@ -688,6 +719,9 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
 	if (sample_type & PERF_SAMPLE_REGS_USER)
 		regs_user__printf(sample);
 
+	if (sample_type & PERF_SAMPLE_REGS_INTR)
+		regs_intr__printf(sample);
+
 	if (sample_type & PERF_SAMPLE_STACK_USER)
 		stack_user__printf(&sample->user_stack);
 
-- 
1.7.9.5



* [PATCH v7 4/6] perf/tests: add interrupted state sample parsing test
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
                   ` (2 preceding siblings ...)
  2014-09-24 11:48 ` [PATCH v7 3/6] perf tools: add core support for sampling intr machine state regs Stephane Eranian
@ 2014-09-24 11:48 ` Stephane Eranian
  2014-11-16 12:36   ` [tip:perf/core] perf/tests: Add " tip-bot for Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 5/6] perf record: add new -I option to sample interrupted machine state Stephane Eranian
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

This patch updates the sample parsing test with support
for the sampling of machine interrupted state.

The patch modifies the do_test() code to share the sample
regs bitmask between user and intr regs.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/tests/sample-parsing.c |   55 +++++++++++++++++++++++++++----------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index ca292f9..4908c64 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -126,16 +126,28 @@ static bool samples_same(const struct perf_sample *s1,
 	if (type & PERF_SAMPLE_TRANSACTION)
 		COMP(transaction);
 
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		size_t sz = hweight_long(s1->intr_regs.mask) * sizeof(u64);
+
+		COMP(intr_regs.mask);
+		COMP(intr_regs.abi);
+		if (s1->intr_regs.abi &&
+		    (!s1->intr_regs.regs || !s2->intr_regs.regs ||
+		     memcmp(s1->intr_regs.regs, s2->intr_regs.regs, sz))) {
+			pr_debug("Samples differ at 'intr_regs'\n");
+			return false;
+		}
+	}
+
 	return true;
 }
 
-static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
+static int do_test(u64 sample_type, u64 sample_regs, u64 read_format)
 {
 	struct perf_evsel evsel = {
 		.needs_swap = false,
 		.attr = {
 			.sample_type = sample_type,
-			.sample_regs_user = sample_regs_user,
 			.read_format = read_format,
 		},
 	};
@@ -154,7 +166,7 @@ static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
 		/* 1 branch_entry */
 		.data = {1, 211, 212, 213},
 	};
-	u64 user_regs[64];
+	u64 regs[64];
 	const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL};
 	const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL};
 	struct perf_sample sample = {
@@ -176,8 +188,8 @@ static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
 		.branch_stack	= &branch_stack.branch_stack,
 		.user_regs	= {
 			.abi	= PERF_SAMPLE_REGS_ABI_64,
-			.mask	= sample_regs_user,
-			.regs	= user_regs,
+			.mask	= sample_regs,
+			.regs	= regs,
 		},
 		.user_stack	= {
 			.size	= sizeof(data),
@@ -187,14 +199,25 @@ static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
 			.time_enabled = 0x030a59d664fca7deULL,
 			.time_running = 0x011b6ae553eb98edULL,
 		},
+		.intr_regs	= {
+			.abi	= PERF_SAMPLE_REGS_ABI_64,
+			.mask	= sample_regs,
+			.regs	= regs,
+		},
 	};
 	struct sample_read_value values[] = {{1, 5}, {9, 3}, {2, 7}, {6, 4},};
 	struct perf_sample sample_out;
 	size_t i, sz, bufsz;
 	int err, ret = -1;
 
-	for (i = 0; i < sizeof(user_regs); i++)
-		*(i + (u8 *)user_regs) = i & 0xfe;
+	if (sample_type & PERF_SAMPLE_REGS_USER)
+		evsel.attr.sample_regs_user = sample_regs;
+
+	if (sample_type & PERF_SAMPLE_REGS_INTR)
+		evsel.attr.sample_regs_intr = sample_regs;
+
+	for (i = 0; i < sizeof(regs); i++)
+		*(i + (u8 *)regs) = i & 0xfe;
 
 	if (read_format & PERF_FORMAT_GROUP) {
 		sample.read.group.nr     = 4;
@@ -271,7 +294,7 @@ int test__sample_parsing(void)
 {
 	const u64 rf[] = {4, 5, 6, 7, 12, 13, 14, 15};
 	u64 sample_type;
-	u64 sample_regs_user;
+	u64 sample_regs;
 	size_t i;
 	int err;
 
@@ -280,7 +303,7 @@ int test__sample_parsing(void)
 	 * were added.  Please actually update the test rather than just change
 	 * the condition below.
 	 */
-	if (PERF_SAMPLE_MAX > PERF_SAMPLE_TRANSACTION << 1) {
+	if (PERF_SAMPLE_MAX > PERF_SAMPLE_REGS_INTR << 1) {
 		pr_debug("sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating\n");
 		return -1;
 	}
@@ -297,22 +320,24 @@ int test__sample_parsing(void)
 			}
 			continue;
 		}
+		sample_regs = 0;
 
 		if (sample_type == PERF_SAMPLE_REGS_USER)
-			sample_regs_user = 0x3fff;
-		else
-			sample_regs_user = 0;
+			sample_regs = 0x3fff;
+
+		if (sample_type == PERF_SAMPLE_REGS_INTR)
+			sample_regs = 0xff0fff;
 
-		err = do_test(sample_type, sample_regs_user, 0);
+		err = do_test(sample_type, sample_regs, 0);
 		if (err)
 			return err;
 	}
 
 	/* Test all sample format bits together */
 	sample_type = PERF_SAMPLE_MAX - 1;
-	sample_regs_user = 0x3fff;
+	sample_regs = 0x3fff; /* shared by intr and user regs */
 	for (i = 0; i < ARRAY_SIZE(rf); i++) {
-		err = do_test(sample_type, sample_regs_user, rf[i]);
+		err = do_test(sample_type, sample_regs, rf[i]);
 		if (err)
 			return err;
 	}
-- 
1.7.9.5



* [PATCH v7 5/6] perf record: add new -I option to sample interrupted machine state
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
                   ` (3 preceding siblings ...)
  2014-09-24 11:48 ` [PATCH v7 4/6] perf/tests: add interrupted state sample parsing test Stephane Eranian
@ 2014-09-24 11:48 ` Stephane Eranian
  2014-11-16 12:36   ` [tip:perf/core] perf record: Add " tip-bot for Stephane Eranian
  2014-09-24 11:48 ` [PATCH v7 6/6] perf: improve perf_sample_data struct layout Stephane Eranian
  2014-09-25  9:26 ` [PATCH v7 0/6] perf: add ability to sample interrupted machine state Peter Zijlstra
  6 siblings, 1 reply; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

Add -I/--intr-regs option to capture machine state registers at
interrupt.

Add the corresponding man page description.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/Documentation/perf-record.txt |    6 ++++++
 tools/perf/builtin-record.c              |    2 ++
 2 files changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d460049..1a36259 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -214,6 +214,12 @@ if combined with -a or -C options.
 After starting the program, wait msecs before measuring. This is useful to
 filter out the startup phase of the program, which is often very different.
 
+-I::
+--intr-regs::
+Capture machine state (registers) at interrupt, i.e., on counter overflows for
+each sample. List of captured registers depends on the architecture. This option
+is off by default.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a1b0403..cd58f01 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -883,6 +883,8 @@ const struct option record_options[] = {
 		    "sample transaction flags (special events only)"),
 	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
 		    "use per-thread mmaps"),
+	OPT_BOOLEAN('I', "intr-regs", &record.opts.sample_intr_regs,
+		    "Sample machine registers on interrupt"),
 	OPT_END()
 };
 
-- 
1.7.9.5



* [PATCH v7 6/6] perf: improve perf_sample_data struct layout
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
                   ` (4 preceding siblings ...)
  2014-09-24 11:48 ` [PATCH v7 5/6] perf record: add new -I option to sample interrupted machine state Stephane Eranian
@ 2014-09-24 11:48 ` Stephane Eranian
  2014-11-16 12:37   ` [tip:perf/core] perf: Improve the " tip-bot for Peter Zijlstra
  2014-09-25  9:26 ` [PATCH v7 0/6] perf: add ability to sample interrupted machine state Peter Zijlstra
  6 siblings, 1 reply; 23+ messages in thread
From: Stephane Eranian @ 2014-09-24 11:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, ak, jolsa, acme, cebbert.lkml

From: Peter Zijlstra <peterz@infradead.org>

This patch reorders fields in the perf_sample_data
struct in order to minimize the number of cachelines
touched in perf_sample_data_init(). It also removes
some initializations which are redundant with the
code in kernel/events/core.c.
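
To make the cacheline argument concrete, here is a sketch under stated
assumptions: 64-byte cachelines, and an abbreviated, hypothetical
struct that keeps only the field order of the reordered layout. It is
not the kernel definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64	/* assumed cacheline size in bytes */

/* Hypothetical, abbreviated version of the reordered struct. */
struct sketch_sample_data {
	/* fields written by the init function, grouped at the front */
	_Alignas(SKETCH_CACHELINE) uint64_t addr;
	void		*raw;
	void		*br_stack;
	uint64_t	period;
	uint64_t	weight;
	uint64_t	txn;
	uint64_t	data_src;
	/* remaining fields, filled in later only when needed */
	uint64_t	type;
	uint64_t	ip;
};

/* Number of cachelines covered by the byte range [first, end). */
static size_t sketch_cachelines_touched(size_t first, size_t end)
{
	assert(end > first);
	return (end - 1) / SKETCH_CACHELINE - first / SKETCH_CACHELINE + 1;
}

/* Cachelines written when initializing addr through data_src. */
static size_t sketch_init_cachelines(void)
{
	return sketch_cachelines_touched(
		offsetof(struct sketch_sample_data, addr),
		offsetof(struct sketch_sample_data, data_src) + sizeof(uint64_t));
}
```

In this sketch the initialized fields fit in a single cacheline. By a
reading of the old field order with 8-byte pointers, the same fields
would have spanned offsets 32 through 160, which the helper counts as
three cachelines.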

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 include/linux/perf_event.h |   34 +++++++++++++++++-----------------
 kernel/events/core.c       |    5 ++++-
 2 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 68d46d5..486e84c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -580,35 +580,40 @@ extern u64 perf_event_read_value(struct perf_event *event,
 
 
 struct perf_sample_data {
-	u64				type;
+	/*
+	 * Fields set by perf_sample_data_init(), group so as to
+	 * minimize the cachelines touched.
+	 */
+	u64				addr;
+	struct perf_raw_record		*raw;
+	struct perf_branch_stack	*br_stack;
+	u64				period;
+	u64				weight;
+	u64				txn;
+	union  perf_mem_data_src	data_src;
 
+	/*
+	 * The other fields, optionally {set,used} by
+	 * perf_{prepare,output}_sample().
+	 */
+	u64				type;
 	u64				ip;
 	struct {
 		u32	pid;
 		u32	tid;
 	}				tid_entry;
 	u64				time;
-	u64				addr;
 	u64				id;
 	u64				stream_id;
 	struct {
 		u32	cpu;
 		u32	reserved;
 	}				cpu_entry;
-	u64				period;
-	union  perf_mem_data_src	data_src;
 	struct perf_callchain_entry	*callchain;
-	struct perf_raw_record		*raw;
-	struct perf_branch_stack	*br_stack;
 	struct perf_regs		regs_user;
 	struct perf_regs		regs_intr;
 	u64				stack_user_size;
-	u64				weight;
-	/*
-	 * Transaction flags for abort events:
-	 */
-	u64				txn;
-};
+} ____cacheline_aligned;
 
 /* default value for data source */
 #define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
@@ -625,14 +630,9 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->raw  = NULL;
 	data->br_stack = NULL;
 	data->period = period;
-	data->regs_user.abi = PERF_SAMPLE_REGS_ABI_NONE;
-	data->regs_user.regs = NULL;
-	data->stack_user_size = 0;
 	data->weight = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
-	data->regs_intr.abi = PERF_SAMPLE_REGS_ABI_NONE;
-	data->regs_intr.regs = NULL;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7941343..64a95be 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4441,8 +4441,11 @@ static void perf_sample_regs_user(struct perf_regs *regs_user,
 	}
 
 	if (regs) {
-		regs_user->regs = regs;
 		regs_user->abi  = perf_reg_abi(current);
+		regs_user->regs = regs;
+	} else {
+		regs_user->abi = PERF_SAMPLE_REGS_ABI_NONE;
+		regs_user->regs = NULL;
 	}
 }
 
-- 
1.7.9.5



* Re: [PATCH v7 0/6] perf: add ability to sample interrupted machine state
  2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
                   ` (5 preceding siblings ...)
  2014-09-24 11:48 ` [PATCH v7 6/6] perf: improve perf_sample_data struct layout Stephane Eranian
@ 2014-09-25  9:26 ` Peter Zijlstra
  2014-09-25 10:32   ` Stephane Eranian
  6 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2014-09-25  9:26 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, mingo, ak, jolsa, acme, cebbert.lkml



These do indeed appear to compile just fine, thanks!

Arnaldo, may I gently prod you to pass judgement over the tools bits?


* Re: [PATCH v7 0/6] perf: add ability to sample interrupted machine state
  2014-09-25  9:26 ` [PATCH v7 0/6] perf: add ability to sample interrupted machine state Peter Zijlstra
@ 2014-09-25 10:32   ` Stephane Eranian
  2014-09-25 14:29     ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Stephane Eranian @ 2014-09-25 10:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, mingo, ak, Jiri Olsa, Arnaldo Carvalho de Melo, Chuck Ebbert

Peter,

On Thu, Sep 25, 2014 at 11:26 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
>
> These do indeed appear to compile just fine, thanks!
>
I posted v7 with the x86_32 fixes and the eflags changes.
Is that what you compiled?

> Arnaldo, may I gently prod you to pass judgement over the tools bits?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v7 0/6] perf: add ability to sample interrupted machine state
  2014-09-25 10:32   ` Stephane Eranian
@ 2014-09-25 14:29     ` Peter Zijlstra
  2014-09-25 16:22       ` Stephane Eranian
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2014-09-25 14:29 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: LKML, mingo, ak, Jiri Olsa, Arnaldo Carvalho de Melo, Chuck Ebbert

On Thu, Sep 25, 2014 at 12:32:39PM +0200, Stephane Eranian wrote:
> Peter,
> 
> On Thu, Sep 25, 2014 at 11:26 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> >
> > These do indeed appear to compile just fine, thanks!
> >
> I posted v7 with the x86_32 fixes and the eflags changes.
> Is that what you compiled?

Yep, I grabbed the latest. I've not yet tried the x86_32 compile, but I
did check the 'offending' patch and it has the #ifdef guard in place so
I'll assume it works for now.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v7 0/6] perf: add ability to sample interrupted machine state
  2014-09-25 14:29     ` Peter Zijlstra
@ 2014-09-25 16:22       ` Stephane Eranian
  0 siblings, 0 replies; 23+ messages in thread
From: Stephane Eranian @ 2014-09-25 16:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, mingo, ak, Jiri Olsa, Arnaldo Carvalho de Melo, Chuck Ebbert

On Thu, Sep 25, 2014 at 4:29 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Sep 25, 2014 at 12:32:39PM +0200, Stephane Eranian wrote:
>> Peter,
>>
>> On Thu, Sep 25, 2014 at 11:26 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> >
>> >
>> > These do indeed appear to compile just fine, thanks!
>> >
>> I posted v7 with the x86_32 fixes and the eflags changes.
>> Is that what you compiled?
>
> Yep, I grabbed the latest. I've not yet tried the x86_32 compile, but I
> did check the 'offending' patch and it has the #ifdef guard in place so
> I'll assume it works for now.

Yes, I compiled on my 32-bit Atom laptop. It compiles.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [tip:perf/core] perf: Add ability to sample machine state on interrupt
  2014-09-24 11:48 ` [PATCH v7 1/6] perf: add ability to sample machine state on interrupt Stephane Eranian
@ 2014-11-16 12:35   ` tip-bot for Stephane Eranian
  2014-11-21 21:26   ` [PATCH v7 1/6] perf: add " Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Stephane Eranian @ 2014-11-16 12:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, acme, torvalds, hpa, mingo, jolsa, linux-kernel, peterz, eranian

Commit-ID:  60e2364e60e86e81bc6377f49779779e6120977f
Gitweb:     http://git.kernel.org/tip/60e2364e60e86e81bc6377f49779779e6120977f
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Wed, 24 Sep 2014 13:48:37 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 11:41:57 +0100

perf: Add ability to sample machine state on interrupt

Enable capture of interrupted machine state for each sample.

Registers to sample are passed per event in the sample_regs_intr bitmask.

To sample the interrupted machine state, PERF_SAMPLE_REGS_INTR must be passed in
sample_type.

The list of available registers is arch dependent and provided by asm/perf_regs.h.

Registers are laid out as u64 values, in the bit order of sample_regs_intr.

This patch also adds a new ABI version PERF_ATTR_SIZE_VER4 because we extend
the perf_event_attr struct with a new u64 field.

Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: cebbert.lkml@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/1411559322-16548-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/perf_event.h      |  7 +++++--
 include/uapi/linux/perf_event.h | 15 +++++++++++++-
 kernel/events/core.c            | 46 +++++++++++++++++++++++++++++++++++++++--
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 893a0d0..68d46d5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -79,7 +79,7 @@ struct perf_branch_stack {
 	struct perf_branch_entry	entries[0];
 };
 
-struct perf_regs_user {
+struct perf_regs {
 	__u64		abi;
 	struct pt_regs	*regs;
 };
@@ -600,7 +600,8 @@ struct perf_sample_data {
 	struct perf_callchain_entry	*callchain;
 	struct perf_raw_record		*raw;
 	struct perf_branch_stack	*br_stack;
-	struct perf_regs_user		regs_user;
+	struct perf_regs		regs_user;
+	struct perf_regs		regs_intr;
 	u64				stack_user_size;
 	u64				weight;
 	/*
@@ -630,6 +631,8 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->weight = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
+	data->regs_intr.abi = PERF_SAMPLE_REGS_ABI_NONE;
+	data->regs_intr.regs = NULL;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 9d84540..9b79abb 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -137,8 +137,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_DATA_SRC			= 1U << 15,
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
+	PERF_SAMPLE_REGS_INTR			= 1U << 18,
 
-	PERF_SAMPLE_MAX = 1U << 18,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
 };
 
 /*
@@ -238,6 +239,7 @@ enum perf_event_read_format {
 #define PERF_ATTR_SIZE_VER2	80	/* add: branch_sample_type */
 #define PERF_ATTR_SIZE_VER3	96	/* add: sample_regs_user */
 					/* add: sample_stack_user */
+#define PERF_ATTR_SIZE_VER4	104	/* add: sample_regs_intr */
 
 /*
  * Hardware event_id to monitor via a performance monitoring event:
@@ -334,6 +336,15 @@ struct perf_event_attr {
 
 	/* Align to u64. */
 	__u32	__reserved_2;
+	/*
+	 * Defines set of regs to dump for each sample
+	 * state captured on:
+	 *  - precise = 0: PMU interrupt
+	 *  - precise > 0: sampled instruction
+	 *
+	 * See asm/perf_regs.h for details.
+	 */
+	__u64	sample_regs_intr;
 };
 
 #define perf_flags(attr)	(*(&(attr)->read_format + 1))
@@ -686,6 +697,8 @@ enum perf_event_type {
 	 *	{ u64			weight;   } && PERF_SAMPLE_WEIGHT
 	 *	{ u64			data_src; } && PERF_SAMPLE_DATA_SRC
 	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
+	 *	{ u64			abi; # enum perf_sample_regs_abi
+	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1cd5eef..c2be159 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4460,7 +4460,7 @@ perf_output_sample_regs(struct perf_output_handle *handle,
 	}
 }
 
-static void perf_sample_regs_user(struct perf_regs_user *regs_user,
+static void perf_sample_regs_user(struct perf_regs *regs_user,
 				  struct pt_regs *regs)
 {
 	if (!user_mode(regs)) {
@@ -4476,6 +4476,14 @@ static void perf_sample_regs_user(struct perf_regs_user *regs_user,
 	}
 }
 
+static void perf_sample_regs_intr(struct perf_regs *regs_intr,
+				  struct pt_regs *regs)
+{
+	regs_intr->regs = regs;
+	regs_intr->abi  = perf_reg_abi(current);
+}
+
+
 /*
  * Get remaining task size from user stack pointer.
  *
@@ -4857,6 +4865,23 @@ void perf_output_sample(struct perf_output_handle *handle,
 	if (sample_type & PERF_SAMPLE_TRANSACTION)
 		perf_output_put(handle, data->txn);
 
+	if (sample_type & PERF_SAMPLE_REGS_INTR) {
+		u64 abi = data->regs_intr.abi;
+		/*
+		 * If there are no regs to dump, notice it through
+		 * first u64 being zero (PERF_SAMPLE_REGS_ABI_NONE).
+		 */
+		perf_output_put(handle, abi);
+
+		if (abi) {
+			u64 mask = event->attr.sample_regs_intr;
+
+			perf_output_sample_regs(handle,
+						data->regs_intr.regs,
+						mask);
+		}
+	}
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
@@ -4943,7 +4968,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 		 * in case new sample type is added, because we could eat
 		 * up the rest of the sample size.
 		 */
-		struct perf_regs_user *uregs = &data->regs_user;
+		struct perf_regs *uregs = &data->regs_user;
 		u16 stack_size = event->attr.sample_stack_user;
 		u16 size = sizeof(u64);
 
@@ -4964,6 +4989,21 @@ void perf_prepare_sample(struct perf_event_header *header,
 		data->stack_user_size = stack_size;
 		header->size += size;
 	}
+
+	if (sample_type & PERF_SAMPLE_REGS_INTR) {
+		/* regs dump ABI info */
+		int size = sizeof(u64);
+
+		perf_sample_regs_intr(&data->regs_intr, regs);
+
+		if (data->regs_intr.regs) {
+			u64 mask = event->attr.sample_regs_intr;
+
+			size += hweight64(mask) * sizeof(u64);
+		}
+
+		header->size += size;
+	}
 }
 
 static void perf_event_output(struct perf_event *event,
@@ -7151,6 +7191,8 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
 			ret = -EINVAL;
 	}
 
+	if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
+		ret = perf_reg_validate(attr->sample_regs_intr);
 out:
 	return ret;
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread
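
To make the record layout from the patch above concrete (one u64 ABI
word, followed by one u64 per bit set in sample_regs_intr, in ascending
bit order), the following user-space sketch computes the payload size and
locates a register's slot. It is not kernel code: ABI_NONE,
intr_regs_payload_size() and intr_regs_slot() are hypothetical names for
this example, and __builtin_popcountll assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors PERF_SAMPLE_REGS_ABI_NONE, which the patch treats as zero. */
#define ABI_NONE 0ULL

/*
 * Bytes added to a sample for PERF_SAMPLE_REGS_INTR: the ABI word is
 * always present; the registers follow only when the ABI is not NONE.
 */
static size_t intr_regs_payload_size(uint64_t abi, uint64_t mask)
{
	size_t size = sizeof(uint64_t);

	if (abi != ABI_NONE)
		size += (size_t)__builtin_popcountll(mask) * sizeof(uint64_t);
	return size;
}

/*
 * Index into regs[] of the register selected by 'bit', or -1 if that
 * bit is not set in the mask. Registers are stored in ascending bit
 * order, so the slot is the number of set bits below 'bit'.
 */
static int intr_regs_slot(uint64_t mask, unsigned int bit)
{
	assert(bit < 64);
	if (!(mask & (1ULL << bit)))
		return -1;
	return __builtin_popcountll(mask & ((1ULL << bit) - 1));
}
```

For example, with a mask of 0x5 the register for bit 2 sits in slot 1,
and a query for bit 1 returns -1 because that register was not requested.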

* [tip:perf/core] perf/x86: Add support for sampling PEBS machine state registers
  2014-09-24 11:48 ` [PATCH v7 2/6] perf/x86: add support for sampling PEBS machine state registers Stephane Eranian
@ 2014-11-16 12:35   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Stephane Eranian @ 2014-11-16 12:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, hpa, eranian, linux-kernel, tglx, peterz, acme, mingo

Commit-ID:  aea48559ac454a065244d3eff0c94cc8af9c553e
Gitweb:     http://git.kernel.org/tip/aea48559ac454a065244d3eff0c94cc8af9c553e
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Wed, 24 Sep 2014 13:48:38 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 11:41:58 +0100

perf/x86: Add support for sampling PEBS machine state registers

PEBS can capture machine state regs at retirement of the sampled
instructions. When precise sampling is enabled on an event, PEBS
is used, so substitute the interrupted state with the PEBS state.
Note that not all registers are captured by PEBS. Those missing
are replaced by their interrupt state counterparts.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411559322-16548-3-git-send-email-eranian@google.com
Cc: cebbert.lkml@gmail.com
Cc: jolsa@redhat.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 3c5d5c1..495ae97 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -886,6 +886,29 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
 	regs.bp = pebs->bp;
 	regs.sp = pebs->sp;
 
+	if (sample_type & PERF_SAMPLE_REGS_INTR) {
+		regs.ax = pebs->ax;
+		regs.bx = pebs->bx;
+		regs.cx = pebs->cx;
+		regs.dx = pebs->dx;
+		regs.si = pebs->si;
+		regs.di = pebs->di;
+		regs.bp = pebs->bp;
+		regs.sp = pebs->sp;
+
+		regs.flags = pebs->flags;
+#ifndef CONFIG_X86_32
+		regs.r8 = pebs->r8;
+		regs.r9 = pebs->r9;
+		regs.r10 = pebs->r10;
+		regs.r11 = pebs->r11;
+		regs.r12 = pebs->r12;
+		regs.r13 = pebs->r13;
+		regs.r14 = pebs->r14;
+		regs.r15 = pebs->r15;
+#endif
+	}
+
 	if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
 		regs.ip = pebs->real_ip;
 		regs.flags |= PERF_EFLAGS_EXACT;

^ permalink raw reply related	[flat|nested] 23+ messages in thread
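
The substitution described in the patch above can be pictured as an
overlay: start from the registers seen at PMU interrupt, then overwrite
whatever the PEBS record captured. The sketch below uses trimmed-down,
hypothetical stand-ins (mini_regs, mini_pebs, overlay_pebs) rather than
the kernel's pt_regs and PEBS record types, and it picks cs as an example
of a register assumed to be absent from the PEBS record.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for pt_regs and a PEBS record. */
struct mini_regs {
	uint64_t ax, bx, ip, flags, cs;
};

struct mini_pebs {
	uint64_t ax, bx, ip, flags;	/* no cs in this sketch */
};

/*
 * Start from the state seen at PMU interrupt, then overwrite every
 * register the PEBS record captured. Anything PEBS does not provide
 * (here: cs) keeps its interrupt-time value.
 */
static struct mini_regs overlay_pebs(const struct mini_regs *intr,
				     const struct mini_pebs *pebs)
{
	struct mini_regs out;

	assert(intr && pebs);
	out = *intr;
	out.ax = pebs->ax;
	out.bx = pebs->bx;
	out.ip = pebs->ip;
	out.flags = pebs->flags;
	return out;
}
```

The caller then hands the merged copy to the sampling code, so a consumer
sees one consistent register set without needing to know which values
came from PEBS.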

* [tip:perf/core] perf tools: Add core support for sampling intr machine state regs
  2014-09-24 11:48 ` [PATCH v7 3/6] perf tools: add core support for sampling intr machine state regs Stephane Eranian
@ 2014-11-16 12:36   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Stephane Eranian @ 2014-11-16 12:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dzickus, torvalds, wangnan0, jolsa, acme, peterz, adrian.hunter,
	linux-kernel, jean.pihet, hpa, jolsa, dsahern, ak, tglx,
	namhyung, Waiman.Long, eranian, paulus, mingo

Commit-ID:  6a21c0b5c2abd2fdfa6fff79f11df3d6082c1873
Gitweb:     http://git.kernel.org/tip/6a21c0b5c2abd2fdfa6fff79f11df3d6082c1873
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Wed, 24 Sep 2014 13:48:39 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 11:41:59 +0100

perf tools: Add core support for sampling intr machine state regs

Add the infrastructure to set up, collect and report the interrupt
machine state regs which can be captured by the kernel.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: cebbert.lkml@gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1411559322-16548-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/perf.h         |  1 +
 tools/perf/util/event.h   |  1 +
 tools/perf/util/evsel.c   | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/header.c  |  1 +
 tools/perf/util/session.c | 44 +++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 87 insertions(+), 6 deletions(-)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 511c2831..1dabb85 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -52,6 +52,7 @@ struct record_opts {
 	bool	     sample_weight;
 	bool	     sample_time;
 	bool	     period;
+	bool	     sample_intr_regs;
 	unsigned int freq;
 	unsigned int mmap_pages;
 	unsigned int user_freq;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 7be3897..09b9e8d 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -188,6 +188,7 @@ struct perf_sample {
 	struct ip_callchain *callchain;
 	struct branch_stack *branch_stack;
 	struct regs_dump  user_regs;
+	struct regs_dump  intr_regs;
 	struct stack_dump user_stack;
 	struct sample_read read;
 };
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 12b4396..34344ff 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -661,6 +661,11 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 	if (callchain_param.enabled && !evsel->no_aux_samples)
 		perf_evsel__config_callgraph(evsel);
 
+	if (opts->sample_intr_regs) {
+		attr->sample_regs_intr = PERF_REGS_MASK;
+		perf_evsel__set_sample_bit(evsel, REGS_INTR);
+	}
+
 	if (target__has_cpu(&opts->target))
 		perf_evsel__set_sample_bit(evsel, CPU);
 
@@ -1037,6 +1042,7 @@ static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
 	ret += PRINT_ATTR_X64(branch_sample_type);
 	ret += PRINT_ATTR_X64(sample_regs_user);
 	ret += PRINT_ATTR_U32(sample_stack_user);
+	ret += PRINT_ATTR_X64(sample_regs_intr);
 
 	ret += fprintf(fp, "%.60s\n", graph_dotted_line);
 
@@ -1536,6 +1542,23 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		array++;
 	}
 
+	data->intr_regs.abi = PERF_SAMPLE_REGS_ABI_NONE;
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		OVERFLOW_CHECK_u64(array);
+		data->intr_regs.abi = *array;
+		array++;
+
+		if (data->intr_regs.abi != PERF_SAMPLE_REGS_ABI_NONE) {
+			u64 mask = evsel->attr.sample_regs_intr;
+
+			sz = hweight_long(mask) * sizeof(u64);
+			OVERFLOW_CHECK(array, sz, max_size);
+			data->intr_regs.mask = mask;
+			data->intr_regs.regs = (u64 *)array;
+			array = (void *)array + sz;
+		}
+	}
+
 	return 0;
 }
 
@@ -1631,6 +1654,16 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 	if (type & PERF_SAMPLE_TRANSACTION)
 		result += sizeof(u64);
 
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		if (sample->intr_regs.abi) {
+			result += sizeof(u64);
+			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			result += sz;
+		} else {
+			result += sizeof(u64);
+		}
+	}
+
 	return result;
 }
 
@@ -1809,6 +1842,17 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 		array++;
 	}
 
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		if (sample->intr_regs.abi) {
+			*array++ = sample->intr_regs.abi;
+			sz = hweight_long(sample->intr_regs.mask) * sizeof(u64);
+			memcpy(array, sample->intr_regs.regs, sz);
+			array = (void *)array + sz;
+		} else {
+			*array++ = 0;
+		}
+	}
+
 	return 0;
 }
 
@@ -1938,7 +1982,7 @@ static int sample_type__fprintf(FILE *fp, bool *first, u64 value)
 		bit_name(READ), bit_name(CALLCHAIN), bit_name(ID), bit_name(CPU),
 		bit_name(PERIOD), bit_name(STREAM_ID), bit_name(RAW),
 		bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER),
-		bit_name(IDENTIFIER),
+		bit_name(IDENTIFIER), bit_name(REGS_INTR),
 		{ .name = NULL, }
 	};
 #undef bit_name
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 76442ca..05fab7a 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2143,6 +2143,7 @@ static const int attr_file_abi_sizes[] = {
 	[1] = PERF_ATTR_SIZE_VER1,
 	[2] = PERF_ATTR_SIZE_VER2,
 	[3] = PERF_ATTR_SIZE_VER3,
+	[4] = PERF_ATTR_SIZE_VER4,
 	0,
 };
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f4478ce..6ac62ae 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -592,15 +592,46 @@ static void regs_dump__printf(u64 mask, u64 *regs)
 	}
 }
 
+static const char *regs_abi[] = {
+	[PERF_SAMPLE_REGS_ABI_NONE] = "none",
+	[PERF_SAMPLE_REGS_ABI_32] = "32-bit",
+	[PERF_SAMPLE_REGS_ABI_64] = "64-bit",
+};
+
+static inline const char *regs_dump_abi(struct regs_dump *d)
+{
+	if (d->abi > PERF_SAMPLE_REGS_ABI_64)
+		return "unknown";
+
+	return regs_abi[d->abi];
+}
+
+static void regs__printf(const char *type, struct regs_dump *regs)
+{
+	u64 mask = regs->mask;
+
+	printf("... %s regs: mask 0x%" PRIx64 " ABI %s\n",
+	       type,
+	       mask,
+	       regs_dump_abi(regs));
+
+	regs_dump__printf(mask, regs->regs);
+}
+
 static void regs_user__printf(struct perf_sample *sample)
 {
 	struct regs_dump *user_regs = &sample->user_regs;
 
-	if (user_regs->regs) {
-		u64 mask = user_regs->mask;
-		printf("... user regs: mask 0x%" PRIx64 "\n", mask);
-		regs_dump__printf(mask, user_regs->regs);
-	}
+	if (user_regs->regs)
+		regs__printf("user", user_regs);
+}
+
+static void regs_intr__printf(struct perf_sample *sample)
+{
+	struct regs_dump *intr_regs = &sample->intr_regs;
+
+	if (intr_regs->regs)
+		regs__printf("intr", intr_regs);
 }
 
 static void stack_user__printf(struct stack_dump *dump)
@@ -699,6 +730,9 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
 	if (sample_type & PERF_SAMPLE_REGS_USER)
 		regs_user__printf(sample);
 
+	if (sample_type & PERF_SAMPLE_REGS_INTR)
+		regs_intr__printf(sample);
+
 	if (sample_type & PERF_SAMPLE_STACK_USER)
 		stack_user__printf(&sample->user_stack);
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread
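
The parsing step added to perf_evsel__parse_sample() in the patch above
boils down to: read the ABI word, and if it is not NONE, take
hweight(mask) u64 values after checking that they fit in the sample. A
minimal stand-alone sketch of that logic follows. The regs_view struct
and parse_intr_regs() are hypothetical names for this example, the bounds
check is counted in u64 words rather than bytes, and __builtin_popcountll
assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct regs_view {
	uint64_t abi;		/* 0 means no registers were dumped */
	uint64_t mask;
	const uint64_t *regs;	/* points into the sample buffer, or NULL */
};

/*
 * Parse the REGS_INTR block starting at 'array'; 'nr_left' is how many
 * u64 words remain in the sample. Returns the number of words consumed,
 * or -1 if the block would run past the end of the sample.
 */
static int parse_intr_regs(const uint64_t *array, size_t nr_left,
			   uint64_t mask, struct regs_view *out)
{
	size_t nr_regs;

	assert(array && out);
	out->abi = 0;
	out->mask = 0;
	out->regs = NULL;

	if (nr_left < 1)
		return -1;
	out->abi = array[0];
	if (out->abi == 0)
		return 1;	/* ABI word only, no register payload */

	nr_regs = (size_t)__builtin_popcountll(mask);
	if (nr_left - 1 < nr_regs)
		return -1;
	out->mask = mask;
	out->regs = array + 1;
	return (int)(1 + nr_regs);
}
```

The view only points into the caller's buffer, which matches how the
tool keeps a pointer into the event rather than copying the registers.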

* [tip:perf/core] perf/tests: Add interrupted state sample parsing test
  2014-09-24 11:48 ` [PATCH v7 4/6] perf/tests: add interrupted state sample parsing test Stephane Eranian
@ 2014-11-16 12:36   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Stephane Eranian @ 2014-11-16 12:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: eranian, mingo, rusty, acme, tglx, torvalds, hpa, bp,
	linux-kernel, jean.pihet, peterz, jolsa

Commit-ID:  26ff0f0af79d0a9fc1f783d45e470059b840c7c1
Gitweb:     http://git.kernel.org/tip/26ff0f0af79d0a9fc1f783d45e470059b840c7c1
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Wed, 24 Sep 2014 13:48:40 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 11:42:01 +0100

perf/tests: Add interrupted state sample parsing test

This patch updates the sample parsing test with support
for the sampling of machine interrupted state.

The patch modifies the do_test() code to share the sample
regs bitmask between user and intr regs.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: cebbert.lkml@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Link: http://lkml.kernel.org/r/1411559322-16548-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/tests/sample-parsing.c | 55 ++++++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 15 deletions(-)

diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index ca292f9..4908c64 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -126,16 +126,28 @@ static bool samples_same(const struct perf_sample *s1,
 	if (type & PERF_SAMPLE_TRANSACTION)
 		COMP(transaction);
 
+	if (type & PERF_SAMPLE_REGS_INTR) {
+		size_t sz = hweight_long(s1->intr_regs.mask) * sizeof(u64);
+
+		COMP(intr_regs.mask);
+		COMP(intr_regs.abi);
+		if (s1->intr_regs.abi &&
+		    (!s1->intr_regs.regs || !s2->intr_regs.regs ||
+		     memcmp(s1->intr_regs.regs, s2->intr_regs.regs, sz))) {
+			pr_debug("Samples differ at 'intr_regs'\n");
+			return false;
+		}
+	}
+
 	return true;
 }
 
-static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
+static int do_test(u64 sample_type, u64 sample_regs, u64 read_format)
 {
 	struct perf_evsel evsel = {
 		.needs_swap = false,
 		.attr = {
 			.sample_type = sample_type,
-			.sample_regs_user = sample_regs_user,
 			.read_format = read_format,
 		},
 	};
@@ -154,7 +166,7 @@ static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
 		/* 1 branch_entry */
 		.data = {1, 211, 212, 213},
 	};
-	u64 user_regs[64];
+	u64 regs[64];
 	const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL};
 	const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL};
 	struct perf_sample sample = {
@@ -176,8 +188,8 @@ static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
 		.branch_stack	= &branch_stack.branch_stack,
 		.user_regs	= {
 			.abi	= PERF_SAMPLE_REGS_ABI_64,
-			.mask	= sample_regs_user,
-			.regs	= user_regs,
+			.mask	= sample_regs,
+			.regs	= regs,
 		},
 		.user_stack	= {
 			.size	= sizeof(data),
@@ -187,14 +199,25 @@ static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
 			.time_enabled = 0x030a59d664fca7deULL,
 			.time_running = 0x011b6ae553eb98edULL,
 		},
+		.intr_regs	= {
+			.abi	= PERF_SAMPLE_REGS_ABI_64,
+			.mask	= sample_regs,
+			.regs	= regs,
+		},
 	};
 	struct sample_read_value values[] = {{1, 5}, {9, 3}, {2, 7}, {6, 4},};
 	struct perf_sample sample_out;
 	size_t i, sz, bufsz;
 	int err, ret = -1;
 
-	for (i = 0; i < sizeof(user_regs); i++)
-		*(i + (u8 *)user_regs) = i & 0xfe;
+	if (sample_type & PERF_SAMPLE_REGS_USER)
+		evsel.attr.sample_regs_user = sample_regs;
+
+	if (sample_type & PERF_SAMPLE_REGS_INTR)
+		evsel.attr.sample_regs_intr = sample_regs;
+
+	for (i = 0; i < sizeof(regs); i++)
+		*(i + (u8 *)regs) = i & 0xfe;
 
 	if (read_format & PERF_FORMAT_GROUP) {
 		sample.read.group.nr     = 4;
@@ -271,7 +294,7 @@ int test__sample_parsing(void)
 {
 	const u64 rf[] = {4, 5, 6, 7, 12, 13, 14, 15};
 	u64 sample_type;
-	u64 sample_regs_user;
+	u64 sample_regs;
 	size_t i;
 	int err;
 
@@ -280,7 +303,7 @@ int test__sample_parsing(void)
 	 * were added.  Please actually update the test rather than just change
 	 * the condition below.
 	 */
-	if (PERF_SAMPLE_MAX > PERF_SAMPLE_TRANSACTION << 1) {
+	if (PERF_SAMPLE_MAX > PERF_SAMPLE_REGS_INTR << 1) {
 		pr_debug("sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating\n");
 		return -1;
 	}
@@ -297,22 +320,24 @@ int test__sample_parsing(void)
 			}
 			continue;
 		}
+		sample_regs = 0;
 
 		if (sample_type == PERF_SAMPLE_REGS_USER)
-			sample_regs_user = 0x3fff;
-		else
-			sample_regs_user = 0;
+			sample_regs = 0x3fff;
+
+		if (sample_type == PERF_SAMPLE_REGS_INTR)
+			sample_regs = 0xff0fff;
 
-		err = do_test(sample_type, sample_regs_user, 0);
+		err = do_test(sample_type, sample_regs, 0);
 		if (err)
 			return err;
 	}
 
 	/* Test all sample format bits together */
 	sample_type = PERF_SAMPLE_MAX - 1;
-	sample_regs_user = 0x3fff;
+	sample_regs = 0x3fff; /* shared by intr and user regs */
 	for (i = 0; i < ARRAY_SIZE(rf); i++) {
-		err = do_test(sample_type, sample_regs_user, rf[i]);
+		err = do_test(sample_type, sample_regs, rf[i]);
 		if (err)
 			return err;
 	}

^ permalink raw reply related	[flat|nested] 23+ messages in thread
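
The round trip exercised by the test in the patch above (fill a register
array with a known byte pattern, synthesize the sample, compare the
result) can be sketched in isolation as below. The helpers fill_pattern(),
synth_intr_regs() and dumps_same() are hypothetical simplifications for
this example and only cover the intr_regs part of a sample;
__builtin_popcountll assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill a register array with the byte pattern the test uses: i & 0xfe. */
static void fill_pattern(uint64_t *regs, size_t nr)
{
	unsigned char *p = (unsigned char *)regs;
	size_t i;

	for (i = 0; i < nr * sizeof(*regs); i++)
		p[i] = i & 0xfe;
}

/*
 * Write the ABI word followed by the selected registers into 'out'.
 * Returns the number of u64 words written. 'out' must have room for
 * 1 + popcount(mask) words.
 */
static size_t synth_intr_regs(uint64_t *out, uint64_t abi, uint64_t mask,
			      const uint64_t *regs)
{
	size_t nr = (size_t)__builtin_popcountll(mask);

	assert(out);
	out[0] = abi;
	if (!abi)
		return 1;
	assert(regs);
	memcpy(out + 1, regs, nr * sizeof(uint64_t));
	return 1 + nr;
}

/* Compare two dumps: ABI and mask first, then the register payload. */
static int dumps_same(uint64_t abi1, uint64_t mask1, const uint64_t *r1,
		      uint64_t abi2, uint64_t mask2, const uint64_t *r2)
{
	size_t sz = (size_t)__builtin_popcountll(mask1) * sizeof(uint64_t);

	if (abi1 != abi2 || mask1 != mask2)
		return 0;
	if (abi1 && (!r1 || !r2 || memcmp(r1, r2, sz)))
		return 0;
	return 1;
}
```

With the 0x3fff mask from the test, 14 registers are selected, so the
synthesized block is 15 words long including the ABI word.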

* [tip:perf/core] perf record: Add new -I option to sample interrupted machine state
  2014-09-24 11:48 ` [PATCH v7 5/6] perf record: add new -I option to sample interrupted machine state Stephane Eranian
@ 2014-11-16 12:36   ` tip-bot for Stephane Eranian
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Stephane Eranian @ 2014-11-16 12:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, hpa, mingo, tglx, torvalds, khandual, adrian.hunter,
	eranian, peterz, standby24x7, linux-kernel

Commit-ID:  4b6c51773d86883a2e80cffadbe4f178ac1babd8
Gitweb:     http://git.kernel.org/tip/4b6c51773d86883a2e80cffadbe4f178ac1babd8
Author:     Stephane Eranian <eranian@google.com>
AuthorDate: Wed, 24 Sep 2014 13:48:41 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 11:42:02 +0100

perf record: Add new -I option to sample interrupted machine state

Add -I/--intr-regs option to capture machine state registers at
interrupt.

Add the corresponding man page description.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411559322-16548-6-git-send-email-eranian@google.com
Cc: cebbert.lkml@gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/Documentation/perf-record.txt | 6 ++++++
 tools/perf/builtin-record.c              | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 398f8d5..af9a54e 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -214,6 +214,12 @@ if combined with -a or -C options.
 After starting the program, wait msecs before measuring. This is useful to
 filter out the startup phase of the program, which is often very different.
 
+-I::
+--intr-regs::
+Capture machine state (registers) at interrupt, i.e., on counter overflow, for
+each sample. The list of captured registers depends on the architecture. This
+option is off by default.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 582c4da..8648c6d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -811,6 +811,8 @@ struct option __record_options[] = {
 		    "sample transaction flags (special events only)"),
 	OPT_BOOLEAN(0, "per-thread", &record.opts.target.per_thread,
 		    "use per-thread mmaps"),
+	OPT_BOOLEAN('I', "intr-regs", &record.opts.sample_intr_regs,
+		    "Sample machine registers on interrupt"),
 	OPT_END()
 };
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [tip:perf/core] perf: Improve the perf_sample_data struct layout
  2014-09-24 11:48 ` [PATCH v7 6/6] perf: improve perf_sample_data struct layout Stephane Eranian
@ 2014-11-16 12:37   ` tip-bot for Peter Zijlstra
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Peter Zijlstra @ 2014-11-16 12:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: peterz, linux-kernel, mingo, torvalds, acme, tglx, hpa

Commit-ID:  2565711fb7d7c28e0cd93c8971b520d1b10b857c
Gitweb:     http://git.kernel.org/tip/2565711fb7d7c28e0cd93c8971b520d1b10b857c
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 24 Sep 2014 13:48:42 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 16 Nov 2014 11:42:04 +0100

perf: Improve the perf_sample_data struct layout

This patch reorders fields in the perf_sample_data struct in order to
minimize the number of cachelines touched in perf_sample_data_init().
It also removes some initializations which are redundant with the code
in kernel/events/core.c.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411559322-16548-7-git-send-email-eranian@google.com
Cc: cebbert.lkml@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: jolsa@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/perf_event.h | 34 +++++++++++++++++-----------------
 kernel/events/core.c       | 16 ++++++++--------
 2 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 68d46d5..486e84c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -580,35 +580,40 @@ extern u64 perf_event_read_value(struct perf_event *event,
 
 
 struct perf_sample_data {
-	u64				type;
+	/*
+	 * Fields set by perf_sample_data_init(), grouped so as to
+	 * minimize the cachelines touched.
+	 */
+	u64				addr;
+	struct perf_raw_record		*raw;
+	struct perf_branch_stack	*br_stack;
+	u64				period;
+	u64				weight;
+	u64				txn;
+	union  perf_mem_data_src	data_src;
 
+	/*
+	 * The other fields, optionally {set,used} by
+	 * perf_{prepare,output}_sample().
+	 */
+	u64				type;
 	u64				ip;
 	struct {
 		u32	pid;
 		u32	tid;
 	}				tid_entry;
 	u64				time;
-	u64				addr;
 	u64				id;
 	u64				stream_id;
 	struct {
 		u32	cpu;
 		u32	reserved;
 	}				cpu_entry;
-	u64				period;
-	union  perf_mem_data_src	data_src;
 	struct perf_callchain_entry	*callchain;
-	struct perf_raw_record		*raw;
-	struct perf_branch_stack	*br_stack;
 	struct perf_regs		regs_user;
 	struct perf_regs		regs_intr;
 	u64				stack_user_size;
-	u64				weight;
-	/*
-	 * Transaction flags for abort events:
-	 */
-	u64				txn;
-};
+} ____cacheline_aligned;
 
 /* default value for data source */
 #define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
@@ -625,14 +630,9 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->raw  = NULL;
 	data->br_stack = NULL;
 	data->period = period;
-	data->regs_user.abi = PERF_SAMPLE_REGS_ABI_NONE;
-	data->regs_user.regs = NULL;
-	data->stack_user_size = 0;
 	data->weight = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
-	data->regs_intr.abi = PERF_SAMPLE_REGS_ABI_NONE;
-	data->regs_intr.regs = NULL;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c2be159..3e19d3e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4471,8 +4471,11 @@ static void perf_sample_regs_user(struct perf_regs *regs_user,
 	}
 
 	if (regs) {
-		regs_user->regs = regs;
 		regs_user->abi  = perf_reg_abi(current);
+		regs_user->regs = regs;
+	} else {
+		regs_user->abi = PERF_SAMPLE_REGS_ABI_NONE;
+		regs_user->regs = NULL;
 	}
 }
 
@@ -4947,12 +4950,13 @@ void perf_prepare_sample(struct perf_event_header *header,
 		header->size += size;
 	}
 
+	if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
+		perf_sample_regs_user(&data->regs_user, regs);
+
 	if (sample_type & PERF_SAMPLE_REGS_USER) {
 		/* regs dump ABI info */
 		int size = sizeof(u64);
 
-		perf_sample_regs_user(&data->regs_user, regs);
-
 		if (data->regs_user.regs) {
 			u64 mask = event->attr.sample_regs_user;
 			size += hweight64(mask) * sizeof(u64);
@@ -4968,15 +4972,11 @@ void perf_prepare_sample(struct perf_event_header *header,
 		 * in case new sample type is added, because we could eat
 		 * up the rest of the sample size.
 		 */
-		struct perf_regs *uregs = &data->regs_user;
 		u16 stack_size = event->attr.sample_stack_user;
 		u16 size = sizeof(u64);
 
-		if (!uregs->abi)
-			perf_sample_regs_user(uregs, regs);
-
 		stack_size = perf_sample_ustack_size(stack_size, header->size,
-						     uregs->regs);
+						     data->regs_user.regs);
 
 		/*
 		 * If there is something to dump, add space for the dump

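As a rough check of the layout argument in the commit message above, the fields written by perf_sample_data_init() can be mirrored in a userspace struct and measured with offsetof(). This is only a sketch: `struct sample_data_head`, `init_span_bytes()` and the 64-byte line size are illustrative assumptions, not kernel code, and the pointers stand in for the kernel's `struct perf_raw_record *` and `struct perf_branch_stack *`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, as on common x86-64 parts. */
#define CACHELINE_BYTES 64

static_assert(sizeof(uint64_t) == 8, "u64 fields are 8 bytes wide");

/*
 * Hypothetical userspace mirror of the start of the reordered
 * struct perf_sample_data: the seven fields set by the init helper,
 * followed by the first field that the helper does not touch.
 */
struct sample_data_head {
	uint64_t addr;
	void *raw;          /* struct perf_raw_record * in the kernel */
	void *br_stack;     /* struct perf_branch_stack * in the kernel */
	uint64_t period;
	uint64_t weight;
	uint64_t txn;
	uint64_t data_src;  /* union perf_mem_data_src wraps a u64 */
	uint64_t type;      /* first field outside the init group */
};

/* Bytes covered by the init group, measured from the struct start. */
static size_t init_span_bytes(void)
{
	return offsetof(struct sample_data_head, type);
}

/* True if the init group fits in one line of an aligned struct. */
static int init_fits_one_cacheline(void)
{
	return init_span_bytes() <= CACHELINE_BYTES;
}
```

With 8-byte pointers the init group spans 56 bytes, so combined with ____cacheline_aligned the initialization should touch a single line under these assumptions.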

* Re: [PATCH v7 1/6] perf: add ability to sample machine state on interrupt
  2014-09-24 11:48 ` [PATCH v7 1/6] perf: add ability to sample machine state on interrupt Stephane Eranian
  2014-11-16 12:35   ` [tip:perf/core] perf: Add " tip-bot for Stephane Eranian
@ 2014-11-21 21:26   ` Arnaldo Carvalho de Melo
  2014-12-09 13:30     ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-11-21 21:26 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, peterz, mingo, ak, jolsa, acme

Em Wed, Sep 24, 2014 at 01:48:37PM +0200, Stephane Eranian escreveu:
> Enable capture of interrupted machine state for each
> sample.
> 
> Registers to sample are passed per event in the
> sample_regs_intr bitmask.
> 
> To sample interrupted machine state, the
> PERF_SAMPLE_REGS_INTR flag must be passed in
> sample_type.
> 
> The list of available registers is arch
> dependent and provided by asm/perf_regs.h
> 
> Registers are laid out as u64 values, in the
> bit order of sample_regs_intr.
> 
> This patch also adds a new ABI version
> PERF_ATTR_SIZE_VER4 because we extend
> the perf_event_attr struct with a new u64
> field.

So, trying to bisect a problem with how the TUI hist_entries browser
renders callchains I got stuck with:

[root@zoo acme]# perf report 
incompatible file format (rerun with -v to learn more)

[root@zoo acme]# perf report -v
file uses a more recent and unsupported ABI (8 bytes extra)

Because the perf.data file was generated with HEAD of perf/core and
probably the above warning comes from a simple check against the ABI
version...

I wonder if we can't just check that there are no sample_types that we
don't know and if not, just process the file anyway, i.e. the sample
will be parseable, no?

For now, in this case, I'll just regenerate the perf.data file with an
older tool...

- Arnaldo
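A minimal sketch of the check suggested above, assuming the tool only needs to know whether a file uses sample_type bits beyond the ones it understands. The helper name `sample_type_is_known()` and the `SAMPLE_*` macros are hypothetical; the bit positions are the ones from the uapi enum in the patch (PERF_SAMPLE_REGS_INTR is bit 18, and PERF_SAMPLE_MAX moves from 1U << 18 to 1U << 19).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values taken from the patch; the macro names are illustrative. */
#define SAMPLE_REGS_INTR  (1ULL << 18)
#define SAMPLE_MAX_OLD    (1ULL << 18)  /* PERF_SAMPLE_MAX before the patch */
#define SAMPLE_MAX_NEW    (1ULL << 19)  /* PERF_SAMPLE_MAX after the patch */

/*
 * Return true if every bit set in sample_type is below the highest
 * bit the tool knows about. tool_sample_max must be a power of two,
 * like PERF_SAMPLE_MAX.
 */
static bool sample_type_is_known(uint64_t sample_type,
				 uint64_t tool_sample_max)
{
	assert(tool_sample_max != 0);
	assert((tool_sample_max & (tool_sample_max - 1)) == 0);

	return (sample_type & ~(tool_sample_max - 1)) == 0;
}
```

Under this scheme an older tool would still accept a file whose attr is 8 bytes larger, as long as no event sets bit 18; whether the extra attr bytes can be safely ignored is a separate question this sketch does not answer.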
 
> Reviewed-by: Jiri Olsa <jolsa@redhat.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  include/linux/perf_event.h      |    7 ++++--
>  include/uapi/linux/perf_event.h |   15 ++++++++++++-
>  kernel/events/core.c            |   46 +++++++++++++++++++++++++++++++++++++--
>  3 files changed, 63 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 893a0d0..68d46d5 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -79,7 +79,7 @@ struct perf_branch_stack {
>  	struct perf_branch_entry	entries[0];
>  };
>  
> -struct perf_regs_user {
> +struct perf_regs {
>  	__u64		abi;
>  	struct pt_regs	*regs;
>  };
> @@ -600,7 +600,8 @@ struct perf_sample_data {
>  	struct perf_callchain_entry	*callchain;
>  	struct perf_raw_record		*raw;
>  	struct perf_branch_stack	*br_stack;
> -	struct perf_regs_user		regs_user;
> +	struct perf_regs		regs_user;
> +	struct perf_regs		regs_intr;
>  	u64				stack_user_size;
>  	u64				weight;
>  	/*
> @@ -630,6 +631,8 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
>  	data->weight = 0;
>  	data->data_src.val = PERF_MEM_NA;
>  	data->txn = 0;
> +	data->regs_intr.abi = PERF_SAMPLE_REGS_ABI_NONE;
> +	data->regs_intr.regs = NULL;
>  }
>  
>  extern void perf_output_sample(struct perf_output_handle *handle,
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 9269de2..48d4a01 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -137,8 +137,9 @@ enum perf_event_sample_format {
>  	PERF_SAMPLE_DATA_SRC			= 1U << 15,
>  	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
>  	PERF_SAMPLE_TRANSACTION			= 1U << 17,
> +	PERF_SAMPLE_REGS_INTR			= 1U << 18,
>  
> -	PERF_SAMPLE_MAX = 1U << 18,		/* non-ABI */
> +	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
>  };
>  
>  /*
> @@ -238,6 +239,7 @@ enum perf_event_read_format {
>  #define PERF_ATTR_SIZE_VER2	80	/* add: branch_sample_type */
>  #define PERF_ATTR_SIZE_VER3	96	/* add: sample_regs_user */
>  					/* add: sample_stack_user */
> +#define PERF_ATTR_SIZE_VER4	104	/* add: sample_regs_intr */
>  
>  /*
>   * Hardware event_id to monitor via a performance monitoring event:
> @@ -334,6 +336,15 @@ struct perf_event_attr {
>  
>  	/* Align to u64. */
>  	__u32	__reserved_2;
> +	/*
> +	 * Defines set of regs to dump for each sample
> +	 * state captured on:
> +	 *  - precise = 0: PMU interrupt
> +	 *  - precise > 0: sampled instruction
> +	 *
> +	 * See asm/perf_regs.h for details.
> +	 */
> +	__u64	sample_regs_intr;
>  };
>  
>  #define perf_flags(attr)	(*(&(attr)->read_format + 1))
> @@ -686,6 +697,8 @@ enum perf_event_type {
>  	 *	{ u64			weight;   } && PERF_SAMPLE_WEIGHT
>  	 *	{ u64			data_src; } && PERF_SAMPLE_DATA_SRC
>  	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
> +	 *	{ u64			abi; # enum perf_sample_regs_abi
> +	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
>  	 * };
>  	 */
>  	PERF_RECORD_SAMPLE			= 9,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index eaa636e..7941343 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4430,7 +4430,7 @@ perf_output_sample_regs(struct perf_output_handle *handle,
>  	}
>  }
>  
> -static void perf_sample_regs_user(struct perf_regs_user *regs_user,
> +static void perf_sample_regs_user(struct perf_regs *regs_user,
>  				  struct pt_regs *regs)
>  {
>  	if (!user_mode(regs)) {
> @@ -4446,6 +4446,14 @@ static void perf_sample_regs_user(struct perf_regs_user *regs_user,
>  	}
>  }
>  
> +static void perf_sample_regs_intr(struct perf_regs *regs_intr,
> +				  struct pt_regs *regs)
> +{
> +	regs_intr->regs = regs;
> +	regs_intr->abi  = perf_reg_abi(current);
> +}
> +
> +
>  /*
>   * Get remaining task size from user stack pointer.
>   *
> @@ -4827,6 +4835,23 @@ void perf_output_sample(struct perf_output_handle *handle,
>  	if (sample_type & PERF_SAMPLE_TRANSACTION)
>  		perf_output_put(handle, data->txn);
>  
> +	if (sample_type & PERF_SAMPLE_REGS_INTR) {
> +		u64 abi = data->regs_intr.abi;
> +		/*
> +		 * If there are no regs to dump, notice it through
> +		 * first u64 being zero (PERF_SAMPLE_REGS_ABI_NONE).
> +		 */
> +		perf_output_put(handle, abi);
> +
> +		if (abi) {
> +			u64 mask = event->attr.sample_regs_intr;
> +
> +			perf_output_sample_regs(handle,
> +						data->regs_intr.regs,
> +						mask);
> +		}
> +	}
> +
>  	if (!event->attr.watermark) {
>  		int wakeup_events = event->attr.wakeup_events;
>  
> @@ -4913,7 +4938,7 @@ void perf_prepare_sample(struct perf_event_header *header,
>  		 * in case new sample type is added, because we could eat
>  		 * up the rest of the sample size.
>  		 */
> -		struct perf_regs_user *uregs = &data->regs_user;
> +		struct perf_regs *uregs = &data->regs_user;
>  		u16 stack_size = event->attr.sample_stack_user;
>  		u16 size = sizeof(u64);
>  
> @@ -4934,6 +4959,21 @@ void perf_prepare_sample(struct perf_event_header *header,
>  		data->stack_user_size = stack_size;
>  		header->size += size;
>  	}
> +
> +	if (sample_type & PERF_SAMPLE_REGS_INTR) {
> +		/* regs dump ABI info */
> +		int size = sizeof(u64);
> +
> +		perf_sample_regs_intr(&data->regs_intr, regs);
> +
> +		if (data->regs_intr.regs) {
> +			u64 mask = event->attr.sample_regs_intr;
> +
> +			size += hweight64(mask) * sizeof(u64);
> +		}
> +
> +		header->size += size;
> +	}
>  }
>  
>  static void perf_event_output(struct perf_event *event,
> @@ -7134,6 +7174,8 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
>  			ret = -EINVAL;
>  	}
>  
> +	if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
> +		ret = perf_reg_validate(attr->sample_regs_intr);
>  out:
>  	return ret;
>  
> -- 
> 1.7.9.5

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v7 1/6] perf: add ability to sample machine state on interrupt
  2014-11-21 21:26   ` [PATCH v7 1/6] perf: add " Arnaldo Carvalho de Melo
@ 2014-12-09 13:30     ` Arnaldo Carvalho de Melo
  2014-12-09 13:39       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-09 13:30 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, peterz, mingo, ak, Jiri Olsa, acme

Em Fri, Nov 21, 2014 at 07:26:31PM -0200, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Sep 24, 2014 at 01:48:37PM +0200, Stephane Eranian escreveu:
> > Enable capture of interrupted machine state for each
> > sample.
> > 
> > Registers to sample are passed per event in the
> > sample_regs_intr bitmask.
> > 
> > To sample interrupted machine state, the
> > PERF_SAMPLE_REGS_INTR flag must be passed in
> > sample_type.
> > 
> > The list of available registers is arch
> > dependent and provided by asm/perf_regs.h
> > 
> > Registers are laid out as u64 values, in the
> > bit order of sample_regs_intr.
> > 
> > This patch also adds a new ABI version
> > PERF_ATTR_SIZE_VER4 because we extend
> > the perf_event_attr struct with a new u64
> > field.

I think this problem is also related to this changeset:

[root@zoo ~]# perf test 15
15: struct perf_event_attr setup                           :FAILED
'/home/acme/libexec/perf-core/tests/attr/test-stat-default' - match
failure
 Ok
[root@zoo ~]# perf test -v 15
15: struct perf_event_attr setup                           :
--- start ---
test child forked, pid 17464
running
'/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-hv'
unsupp
'/home/acme/libexec/perf-core/tests/attr/test-record-branch-filter-hv'
running '/home/acme/libexec/perf-core/tests/attr/test-record-count'
unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-count'
running '/home/acme/libexec/perf-core/tests/attr/test-record-basic'
unsupp  '/home/acme/libexec/perf-core/tests/attr/test-record-basic'
running '/home/acme/libexec/perf-core/tests/attr/test-stat-default'
expected size=96, got 104
FAILED '/home/acme/libexec/perf-core/tests/attr/test-stat-default' -
match failure
test child finished with 0
---- end ----
struct perf_event_attr setup: Ok
[root@zoo ~]# uname -r
3.17.3-200.fc20.x86_64
[root@zoo ~]#

Checking if this is just a matter of updating the test entry.

- Arnaldo

> 
> So, trying to bisect a problem with how the TUI hist_entries browser
> renders callchains I got stuck with:
> 
> [root@zoo acme]# perf report 
> incompatible file format (rerun with -v to learn more)
> 
> [root@zoo acme]# perf report -v
> file uses a more recent and unsupported ABI (8 bytes extra)
> 
> Because the perf.data file was generated with HEAD of perf/core and
> probably the above warning comes from a simple check against the ABI
> version...
> 
> I wonder if we can't just check that there are no sample_types that we
> don't know and if not, just process the file anyway, i.e. the sample
> will be parseable, no?
> 
> For now, in this case, I'll just regenerate the perf.data file with an
> older tool...
> 
> - Arnaldo
>  
> > Reviewed-by: Jiri Olsa <jolsa@redhat.com>
> > Reviewed-by: Andi Kleen <ak@linux.intel.com>
> > Signed-off-by: Stephane Eranian <eranian@google.com>
> > ---
> >  include/linux/perf_event.h      |    7 ++++--
> >  include/uapi/linux/perf_event.h |   15 ++++++++++++-
> >  kernel/events/core.c            |   46 +++++++++++++++++++++++++++++++++++++--
> >  3 files changed, 63 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 893a0d0..68d46d5 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -79,7 +79,7 @@ struct perf_branch_stack {
> >  	struct perf_branch_entry	entries[0];
> >  };
> >  
> > -struct perf_regs_user {
> > +struct perf_regs {
> >  	__u64		abi;
> >  	struct pt_regs	*regs;
> >  };
> > @@ -600,7 +600,8 @@ struct perf_sample_data {
> >  	struct perf_callchain_entry	*callchain;
> >  	struct perf_raw_record		*raw;
> >  	struct perf_branch_stack	*br_stack;
> > -	struct perf_regs_user		regs_user;
> > +	struct perf_regs		regs_user;
> > +	struct perf_regs		regs_intr;
> >  	u64				stack_user_size;
> >  	u64				weight;
> >  	/*
> > @@ -630,6 +631,8 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
> >  	data->weight = 0;
> >  	data->data_src.val = PERF_MEM_NA;
> >  	data->txn = 0;
> > +	data->regs_intr.abi = PERF_SAMPLE_REGS_ABI_NONE;
> > +	data->regs_intr.regs = NULL;
> >  }
> >  
> >  extern void perf_output_sample(struct perf_output_handle *handle,
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index 9269de2..48d4a01 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -137,8 +137,9 @@ enum perf_event_sample_format {
> >  	PERF_SAMPLE_DATA_SRC			= 1U << 15,
> >  	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
> >  	PERF_SAMPLE_TRANSACTION			= 1U << 17,
> > +	PERF_SAMPLE_REGS_INTR			= 1U << 18,
> >  
> > -	PERF_SAMPLE_MAX = 1U << 18,		/* non-ABI */
> > +	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
> >  };
> >  
> >  /*
> > @@ -238,6 +239,7 @@ enum perf_event_read_format {
> >  #define PERF_ATTR_SIZE_VER2	80	/* add: branch_sample_type */
> >  #define PERF_ATTR_SIZE_VER3	96	/* add: sample_regs_user */
> >  					/* add: sample_stack_user */
> > +#define PERF_ATTR_SIZE_VER4	104	/* add: sample_regs_intr */
> >  
> >  /*
> >   * Hardware event_id to monitor via a performance monitoring event:
> > @@ -334,6 +336,15 @@ struct perf_event_attr {
> >  
> >  	/* Align to u64. */
> >  	__u32	__reserved_2;
> > +	/*
> > +	 * Defines set of regs to dump for each sample
> > +	 * state captured on:
> > +	 *  - precise = 0: PMU interrupt
> > +	 *  - precise > 0: sampled instruction
> > +	 *
> > +	 * See asm/perf_regs.h for details.
> > +	 */
> > +	__u64	sample_regs_intr;
> >  };
> >  
> >  #define perf_flags(attr)	(*(&(attr)->read_format + 1))
> > @@ -686,6 +697,8 @@ enum perf_event_type {
> >  	 *	{ u64			weight;   } && PERF_SAMPLE_WEIGHT
> >  	 *	{ u64			data_src; } && PERF_SAMPLE_DATA_SRC
> >  	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
> > +	 *	{ u64			abi; # enum perf_sample_regs_abi
> > +	 *	  u64			regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
> >  	 * };
> >  	 */
> >  	PERF_RECORD_SAMPLE			= 9,
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index eaa636e..7941343 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -4430,7 +4430,7 @@ perf_output_sample_regs(struct perf_output_handle *handle,
> >  	}
> >  }
> >  
> > -static void perf_sample_regs_user(struct perf_regs_user *regs_user,
> > +static void perf_sample_regs_user(struct perf_regs *regs_user,
> >  				  struct pt_regs *regs)
> >  {
> >  	if (!user_mode(regs)) {
> > @@ -4446,6 +4446,14 @@ static void perf_sample_regs_user(struct perf_regs_user *regs_user,
> >  	}
> >  }
> >  
> > +static void perf_sample_regs_intr(struct perf_regs *regs_intr,
> > +				  struct pt_regs *regs)
> > +{
> > +	regs_intr->regs = regs;
> > +	regs_intr->abi  = perf_reg_abi(current);
> > +}
> > +
> > +
> >  /*
> >   * Get remaining task size from user stack pointer.
> >   *
> > @@ -4827,6 +4835,23 @@ void perf_output_sample(struct perf_output_handle *handle,
> >  	if (sample_type & PERF_SAMPLE_TRANSACTION)
> >  		perf_output_put(handle, data->txn);
> >  
> > +	if (sample_type & PERF_SAMPLE_REGS_INTR) {
> > +		u64 abi = data->regs_intr.abi;
> > +		/*
> > +		 * If there are no regs to dump, notice it through
> > +		 * first u64 being zero (PERF_SAMPLE_REGS_ABI_NONE).
> > +		 */
> > +		perf_output_put(handle, abi);
> > +
> > +		if (abi) {
> > +			u64 mask = event->attr.sample_regs_intr;
> > +
> > +			perf_output_sample_regs(handle,
> > +						data->regs_intr.regs,
> > +						mask);
> > +		}
> > +	}
> > +
> >  	if (!event->attr.watermark) {
> >  		int wakeup_events = event->attr.wakeup_events;
> >  
> > @@ -4913,7 +4938,7 @@ void perf_prepare_sample(struct perf_event_header *header,
> >  		 * in case new sample type is added, because we could eat
> >  		 * up the rest of the sample size.
> >  		 */
> > -		struct perf_regs_user *uregs = &data->regs_user;
> > +		struct perf_regs *uregs = &data->regs_user;
> >  		u16 stack_size = event->attr.sample_stack_user;
> >  		u16 size = sizeof(u64);
> >  
> > @@ -4934,6 +4959,21 @@ void perf_prepare_sample(struct perf_event_header *header,
> >  		data->stack_user_size = stack_size;
> >  		header->size += size;
> >  	}
> > +
> > +	if (sample_type & PERF_SAMPLE_REGS_INTR) {
> > +		/* regs dump ABI info */
> > +		int size = sizeof(u64);
> > +
> > +		perf_sample_regs_intr(&data->regs_intr, regs);
> > +
> > +		if (data->regs_intr.regs) {
> > +			u64 mask = event->attr.sample_regs_intr;
> > +
> > +			size += hweight64(mask) * sizeof(u64);
> > +		}
> > +
> > +		header->size += size;
> > +	}
> >  }
> >  
> >  static void perf_event_output(struct perf_event *event,
> > @@ -7134,6 +7174,8 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr,
> >  			ret = -EINVAL;
> >  	}
> >  
> > +	if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
> > +		ret = perf_reg_validate(attr->sample_regs_intr);
> >  out:
> >  	return ret;
> >  
> > -- 
> > 1.7.9.5

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v7 1/6] perf: add ability to sample machine state on interrupt
  2014-12-09 13:30     ` Arnaldo Carvalho de Melo
@ 2014-12-09 13:39       ` Arnaldo Carvalho de Melo
  2014-12-09 13:53         ` perf tests: Fix attr tests size values interrupt Jiri Olsa
  0 siblings, 1 reply; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-09 13:39 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, peterz, mingo, ak, Jiri Olsa, acme

Em Tue, Dec 09, 2014 at 11:30:31AM -0200, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Nov 21, 2014 at 07:26:31PM -0200, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Sep 24, 2014 at 01:48:37PM +0200, Stephane Eranian escreveu:
> > > This patch also adds a new ABI version
> > > PERF_ATTR_SIZE_VER4 because we extend
> > > the perf_event_attr struct with a new u64
> > > field.
 
> I think this problem is also related to this changeset:
 
> [root@zoo ~]# perf test -v 15
> 15: struct perf_event_attr setup                           :
<SNIP>
> running '/home/acme/libexec/perf-core/tests/attr/test-stat-default'
> expected size=96, got 104
> FAILED '/home/acme/libexec/perf-core/tests/attr/test-stat-default' -
> match failure
> test child finished with 0
> ---- end ----
> struct perf_event_attr setup: Ok
> [root@zoo ~]# uname -r
> 3.17.3-200.fc20.x86_64
> [root@zoo ~]#
> 
> Checking if this is just a matter of updating the test entry.

Well, here I think the size variable needs to somehow be tested against
the value of another field, the ABI one, so that for each ABI we test
against the right size, that is, after this cset, 8 bytes (sizeof(u64))
bigger. Jiri?

- Arnaldo
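One way to picture the per-ABI size check discussed above is a small lookup from ABI version to attr size. This is a sketch, not what the attr test does: the table and both helper names are hypothetical, the VER2 to VER4 values are the PERF_ATTR_SIZE_VER* defines shown in the patch, and the VER0 and VER1 values (64 and 72) are assumed from the same uapi header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PERF_ATTR_SIZE_VER0 .. PERF_ATTR_SIZE_VER4, indexed by version. */
static const uint32_t attr_size_by_ver[] = { 64, 72, 80, 96, 104 };

#define ATTR_NR_VERS (sizeof(attr_size_by_ver) / sizeof(attr_size_by_ver[0]))

static_assert(ATTR_NR_VERS == 5, "table covers VER0 through VER4");

/* Expected attr size for an ABI version, or 0 if the version is unknown. */
static uint32_t attr_size_for_version(unsigned int ver)
{
	return ver < ATTR_NR_VERS ? attr_size_by_ver[ver] : 0;
}

/* ABI version matching an attr size, or -1 if no version has that size. */
static int attr_version_for_size(uint32_t size)
{
	for (size_t i = 0; i < ATTR_NR_VERS; i++) {
		if (attr_size_by_ver[i] == size)
			return (int)i;
	}
	return -1;
}
```

With this table, "expected size=96, got 104" reads as a VER3 expectation meeting a VER4 attr, the 8-byte difference being the new sample_regs_intr field.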

^ permalink raw reply	[flat|nested] 23+ messages in thread

* perf tests: Fix attr tests size values interrupt
  2014-12-09 13:39       ` Arnaldo Carvalho de Melo
@ 2014-12-09 13:53         ` Jiri Olsa
  2014-12-09 13:59           ` Arnaldo Carvalho de Melo
  2014-12-12  8:18           ` [tip:perf/urgent] perf tests: Fix attr tests size values to cope with machine state on interrupt ABI changes tip-bot for Jiri Olsa
  0 siblings, 2 replies; 23+ messages in thread
From: Jiri Olsa @ 2014-12-09 13:53 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Stephane Eranian, linux-kernel, peterz, mingo, ak, acme

On Tue, Dec 09, 2014 at 11:39:47AM -0200, Arnaldo Carvalho de Melo wrote:
> Em Tue, Dec 09, 2014 at 11:30:31AM -0200, Arnaldo Carvalho de Melo escreveu:
> > Em Fri, Nov 21, 2014 at 07:26:31PM -0200, Arnaldo Carvalho de Melo escreveu:
> > > Em Wed, Sep 24, 2014 at 01:48:37PM +0200, Stephane Eranian escreveu:
> > > > This patch also adds a new ABI version
> > > > PERF_ATTR_SIZE_VER4 because we extend
> > > > the perf_event_attr struct with a new u64
> > > > field.
>  
> > I think this problem is also related to this changeset:
>  
> > [root@zoo ~]# perf test -v 15
> > 15: struct perf_event_attr setup                           :
> <SNIP>
> > running '/home/acme/libexec/perf-core/tests/attr/test-stat-default'
> > expected size=96, got 104
> > FAILED '/home/acme/libexec/perf-core/tests/attr/test-stat-default' -
> > match failure
> > test child finished with 0
> > ---- end ----
> > struct perf_event_attr setup: Ok
> > [root@zoo ~]# uname -r
> > 3.17.3-200.fc20.x86_64
> > [root@zoo ~]#
> > 
> > Checking if this is just a matter of updating the test entry.
> 
> Well, here I think the size variable needs to somehow be tested against
> the value of another field, the ABI one, so that for each ABI we test
> against the right size, that is, after this cset, 8 bytes (sizeof(u64))
> bigger. Jiri?

Well, this patch enlarged struct perf_event_attr,
so the test needs to be adjusted as in the patch below.

[jolsa@krava perf]$ ./perf test attr -vv
15: struct perf_event_attr setup                           :
--- start ---
test child forked, pid 9719
running './tests/attr/test-stat-group1'
  'PERF_TEST_ATTR=/tmp/tmp4drvul ./perf stat -o /tmp/tmp4drvul/perf.data -e '{cycles,instructions}' kill >/dev/null 2>&1' ret 1 
expected size=96, got 104
FAILED './tests/attr/test-stat-group1' - match failure

jirka


---
The following change adjusted 'struct perf_event_attr', but left
the attr test's sizes untouched:
  60e2364e60e8 perf: Add ability to sample machine state on interrupt

Adjusting test size values for attr test.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
diff --git a/tools/perf/tests/attr/base-record b/tools/perf/tests/attr/base-record
index f710b92ccff6..d3095dafed36 100644
--- a/tools/perf/tests/attr/base-record
+++ b/tools/perf/tests/attr/base-record
@@ -5,7 +5,7 @@ group_fd=-1
 flags=0|8
 cpu=*
 type=0|1
-size=96
+size=104
 config=0
 sample_period=4000
 sample_type=263
diff --git a/tools/perf/tests/attr/base-stat b/tools/perf/tests/attr/base-stat
index dc3ada2470c0..872ed7e24c7c 100644
--- a/tools/perf/tests/attr/base-stat
+++ b/tools/perf/tests/attr/base-stat
@@ -5,7 +5,7 @@ group_fd=-1
 flags=0|8
 cpu=*
 type=0
-size=96
+size=104
 config=0
 sample_period=0
 sample_type=0

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: perf tests: Fix attr tests size values interrupt
  2014-12-09 13:53         ` perf tests: Fix attr tests size values interrupt Jiri Olsa
@ 2014-12-09 13:59           ` Arnaldo Carvalho de Melo
  2014-12-12  8:18           ` [tip:perf/urgent] perf tests: Fix attr tests size values to cope with machine state on interrupt ABI changes tip-bot for Jiri Olsa
  1 sibling, 0 replies; 23+ messages in thread
From: Arnaldo Carvalho de Melo @ 2014-12-09 13:59 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Stephane Eranian, linux-kernel, peterz, mingo, ak, acme

Em Tue, Dec 09, 2014 at 02:53:01PM +0100, Jiri Olsa escreveu:
> On Tue, Dec 09, 2014 at 11:39:47AM -0200, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Dec 09, 2014 at 11:30:31AM -0200, Arnaldo Carvalho de Melo escreveu:
> > > Em Fri, Nov 21, 2014 at 07:26:31PM -0200, Arnaldo Carvalho de Melo escreveu:
> > > > Em Wed, Sep 24, 2014 at 01:48:37PM +0200, Stephane Eranian escreveu:
> > > > > This patch also adds a new ABI version
> > > > > PERF_ATTR_SIZE_VER4 because we extend
> > > > > the perf_event_attr struct with a new u64
> > > > > field.
> >  
> > > I think this problem is also related to this changeset:
> >  
> > > [root@zoo ~]# perf test -v 15
> > > 15: struct perf_event_attr setup                           :
> > <SNIP>
> > > running '/home/acme/libexec/perf-core/tests/attr/test-stat-default'
> > > expected size=96, got 104
> > > FAILED '/home/acme/libexec/perf-core/tests/attr/test-stat-default' -
> > > match failure
> > > test child finished with 0
> > > ---- end ----
> > > struct perf_event_attr setup: Ok
> > > [root@zoo ~]# uname -r
> > > 3.17.3-200.fc20.x86_64
> > > [root@zoo ~]#
> > > 
> > > Checking if this is just a matter of updating the test entry.
> > 
> > Well, here I think the size variable needs to somehow be tested against
> > the value of another field, the ABI one, so that for each ABI we test
> > against the right size, that is, after this cset, 8 bytes (sizeof(u64))
> > bigger. Jiri?
> 
> Well, this patch enlarged struct perf_event_attr,
> so the test needs to be adjusted as in the patch below.

Ok, got it: since the perf.data file is generated in the test, it will
have the current value, which is now 104. Applying your patch.

Thanks,

- Arnaldo
 
> [jolsa@krava perf]$ ./perf test attr -vv
> 15: struct perf_event_attr setup                           :
> --- start ---
> test child forked, pid 9719
> running './tests/attr/test-stat-group1'
>   'PERF_TEST_ATTR=/tmp/tmp4drvul ./perf stat -o /tmp/tmp4drvul/perf.data -e '{cycles,instructions}' kill >/dev/null 2>&1' ret 1 
> expected size=96, got 104
> FAILED './tests/attr/test-stat-group1' - match failure
> 
> jirka
> 
> 
> ---
> The following change adjusted 'struct perf_event_attr', but left
> the attr test's sizes untouched:
>   60e2364e60e8 perf: Add ability to sample machine state on interrupt
> 
> Adjusting test size values for attr test.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> diff --git a/tools/perf/tests/attr/base-record b/tools/perf/tests/attr/base-record
> index f710b92ccff6..d3095dafed36 100644
> --- a/tools/perf/tests/attr/base-record
> +++ b/tools/perf/tests/attr/base-record
> @@ -5,7 +5,7 @@ group_fd=-1
>  flags=0|8
>  cpu=*
>  type=0|1
> -size=96
> +size=104
>  config=0
>  sample_period=4000
>  sample_type=263
> diff --git a/tools/perf/tests/attr/base-stat b/tools/perf/tests/attr/base-stat
> index dc3ada2470c0..872ed7e24c7c 100644
> --- a/tools/perf/tests/attr/base-stat
> +++ b/tools/perf/tests/attr/base-stat
> @@ -5,7 +5,7 @@ group_fd=-1
>  flags=0|8
>  cpu=*
>  type=0
> -size=96
> +size=104
>  config=0
>  sample_period=0
>  sample_type=0


* [tip:perf/urgent] perf tests: Fix attr tests size values to cope with machine state on interrupt ABI changes
  2014-12-09 13:53         ` perf tests: Fix attr tests size values interrupt Jiri Olsa
  2014-12-09 13:59           ` Arnaldo Carvalho de Melo
@ 2014-12-12  8:18           ` tip-bot for Jiri Olsa
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Jiri Olsa @ 2014-12-12  8:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, linux-kernel, ak, mingo, tglx, jolsa, hpa, eranian, acme

Commit-ID:  75226c577c8869ae1449cf92d781edda0177f1cf
Gitweb:     http://git.kernel.org/tip/75226c577c8869ae1449cf92d781edda0177f1cf
Author:     Jiri Olsa <jolsa@redhat.com>
AuthorDate: Tue, 9 Dec 2014 14:53:01 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 9 Dec 2014 11:02:43 -0300

perf tests: Fix attr tests size values to cope with machine state on interrupt ABI changes

The following change adjusted 'struct perf_event_attr', but left
the attr test's sizes untouched:
  60e2364e60e8 perf: Add ability to sample machine state on interrupt

  [jolsa@krava perf]$ ./perf test attr -vv
  --- start ---
  test child forked, pid 9719
  running './tests/attr/test-stat-group1'
    'PERF_TEST_ATTR=/tmp/tmp4drvul ./perf stat -o /tmp/tmp4drvul/perf.data -e '{cycles,instructions}' kill >/dev/null 2>&1' ret 1
  expected size=96, got 104
  FAILED './tests/attr/test-stat-group1' - match failure

Adjusting test size values for attr test.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20141209135301.GC6784@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/attr/base-record | 2 +-
 tools/perf/tests/attr/base-stat   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/attr/base-record b/tools/perf/tests/attr/base-record
index f710b92..d3095da 100644
--- a/tools/perf/tests/attr/base-record
+++ b/tools/perf/tests/attr/base-record
@@ -5,7 +5,7 @@ group_fd=-1
 flags=0|8
 cpu=*
 type=0|1
-size=96
+size=104
 config=0
 sample_period=4000
 sample_type=263
diff --git a/tools/perf/tests/attr/base-stat b/tools/perf/tests/attr/base-stat
index dc3ada2..872ed7e 100644
--- a/tools/perf/tests/attr/base-stat
+++ b/tools/perf/tests/attr/base-stat
@@ -5,7 +5,7 @@ group_fd=-1
 flags=0|8
 cpu=*
 type=0
-size=96
+size=104
 config=0
 sample_period=0
 sample_type=0


end of thread, other threads:[~2014-12-12  8:19 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-24 11:48 [PATCH v7 0/6] perf: add ability to sample interrupted machine state Stephane Eranian
2014-09-24 11:48 ` [PATCH v7 1/6] perf: add ability to sample machine state on interrupt Stephane Eranian
2014-11-16 12:35   ` [tip:perf/core] perf: Add " tip-bot for Stephane Eranian
2014-11-21 21:26   ` [PATCH v7 1/6] perf: add " Arnaldo Carvalho de Melo
2014-12-09 13:30     ` Arnaldo Carvalho de Melo
2014-12-09 13:39       ` Arnaldo Carvalho de Melo
2014-12-09 13:53         ` perf tests: Fix attr tests size values interrupt Jiri Olsa
2014-12-09 13:59           ` Arnaldo Carvalho de Melo
2014-12-12  8:18           ` [tip:perf/urgent] perf tests: Fix attr tests size values to cope with machine state on interrupt ABI changes tip-bot for Jiri Olsa
2014-09-24 11:48 ` [PATCH v7 2/6] perf/x86: add support for sampling PEBS machine state registers Stephane Eranian
2014-11-16 12:35   ` [tip:perf/core] perf/x86: Add " tip-bot for Stephane Eranian
2014-09-24 11:48 ` [PATCH v7 3/6] perf tools: add core support for sampling intr machine state regs Stephane Eranian
2014-11-16 12:36   ` [tip:perf/core] perf tools: Add " tip-bot for Stephane Eranian
2014-09-24 11:48 ` [PATCH v7 4/6] perf/tests: add interrupted state sample parsing test Stephane Eranian
2014-11-16 12:36   ` [tip:perf/core] perf/tests: Add " tip-bot for Stephane Eranian
2014-09-24 11:48 ` [PATCH v7 5/6] perf record: add new -I option to sample interrupted machine state Stephane Eranian
2014-11-16 12:36   ` [tip:perf/core] perf record: Add " tip-bot for Stephane Eranian
2014-09-24 11:48 ` [PATCH v7 6/6] perf: improve perf_sample_data struct layout Stephane Eranian
2014-11-16 12:37   ` [tip:perf/core] perf: Improve the " tip-bot for Peter Zijlstra
2014-09-25  9:26 ` [PATCH v7 0/6] perf: add ability to sample interrupted machine state Peter Zijlstra
2014-09-25 10:32   ` Stephane Eranian
2014-09-25 14:29     ` Peter Zijlstra
2014-09-25 16:22       ` Stephane Eranian
