linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Fix Skylake PEBS data source for perf v5
@ 2017-08-16 22:21 Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Andi Kleen @ 2017-08-16 22:21 UTC (permalink / raw)
  To: peterz, acme; +Cc: jolsa, linux-kernel

Fix data source reporting for Skylake and Skylake Server.
The encodings have changed to express support for L4 and persistent
memory. 

The first patch is a (independent) cleanup.

The second is for the kernel and the third/fourth for perf/tools.
The kernel part and perf tools will compile independently.

Also available in 
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skx-data-src-7

v1:
Initial post
v2:
Merged some patches.
Change encoding to use special bit for each combination instead
of modifiers.
v3: 
Switch to new generic lvlnum indication
v4:
Repost. No changes.
v5:
ported to latest tree. Retested remote HITM.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v5 1/4] perf/x86: Move Nehalem PEBS code to flag
  2017-08-16 22:21 Fix Skylake PEBS data source for perf v5 Andi Kleen
@ 2017-08-16 22:21 ` Andi Kleen
  2017-08-25 11:53   ` [tip:perf/core] " tip-bot for Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2017-08-16 22:21 UTC (permalink / raw)
  To: peterz, acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Minor cleanup: use an explicit x86_pmu flag to handle the
missing Lock / TLB information on Nehalem, instead of always
checking the model number for each PEBS sample.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/events/intel/core.c | 1 +
 arch/x86/events/intel/ds.c   | 5 +----
 arch/x86/events/perf_event.h | 3 ++-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 98b0f0729527..c3439a36dcf9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3905,6 +3905,7 @@ __init int intel_pmu_init(void)
 
 		intel_pmu_pebs_data_source_nhm();
 		x86_add_quirk(intel_nehalem_quirk);
+		x86_pmu.pebs_no_tlb = 1;
 
 		pr_cont("Nehalem events, ");
 		break;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a322fed5f8ed..3ccdf8cb4495 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -149,8 +149,6 @@ static u64 load_latency_data(u64 status)
 {
 	union intel_x86_pebs_dse dse;
 	u64 val;
-	int model = boot_cpu_data.x86_model;
-	int fam = boot_cpu_data.x86;
 
 	dse.val = status;
 
@@ -162,8 +160,7 @@ static u64 load_latency_data(u64 status)
 	/*
 	 * Nehalem models do not support TLB, Lock infos
 	 */
-	if (fam == 0x6 && (model == 26 || model == 30
-	    || model == 31 || model == 46)) {
+	if (x86_pmu.pebs_no_tlb) {
 		val |= P(TLB, NA) | P(LOCK, NA);
 		return val;
 	}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 476aec3a4cab..2e9636e4068f 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -591,7 +591,8 @@ struct x86_pmu {
 			pebs		:1,
 			pebs_active	:1,
 			pebs_broken	:1,
-			pebs_prec_dist	:1;
+			pebs_prec_dist	:1,
+			pebs_no_tlb	:1;
 	int		pebs_record_size;
 	int		pebs_buffer_size;
 	void		(*drain_pebs)(struct pt_regs *regs);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 2/4] perf/x86: Fix data source decoding for Skylake
  2017-08-16 22:21 Fix Skylake PEBS data source for perf v5 Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
@ 2017-08-16 22:21 ` Andi Kleen
  2017-08-25 11:53   ` [tip:perf/core] " tip-bot for Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2017-08-16 22:21 UTC (permalink / raw)
  To: peterz, acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Skylake changed the encoding of the PEBS data source field.
Some combinations are not available anymore, but some new cases
e.g. for L4 cache hit are added.

Fix up the conversion table for Skylake, similar as had been done
for Nehalem.

On Skylake server the encoding for L4 actually means persistent
memory. Handle this case too.

To properly describe it in the abstracted perf format I had to add
some new fields. Since a hit can have only one level add a new
field that is an enumeration, not a bit field to describe
the level. It can describe any level. Some numbers are also
used to describe PMEM and LFB.

Also add a new generic remote flag that can be combined with
the generic level to signify a remote cache.

And there is an extension field for the snoop indication to handle
the Forward state.

I didn't add a generic flag for hops because it's not needed
for Skylake.

I changed the existing encodings for older CPUs to also fill in the
new level and remote fields.

v2: Merge with persistent memory patch.
Add explicit bit for each case instead of using generic modifier.
v3: Rework with new lvlnum and remote fields.
Change older CPUs to report the new fields too.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/events/intel/core.c    |  2 ++
 arch/x86/events/intel/ds.c      | 51 ++++++++++++++++++++++++++---------------
 arch/x86/events/perf_event.h    |  2 ++
 include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++--
 4 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c3439a36dcf9..6f342001ec6a 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4208,6 +4208,8 @@ __init int intel_pmu_init(void)
 						  skl_format_attr);
 		WARN_ON(!x86_pmu.format_attrs);
 		x86_pmu.cpu_events = hsw_events_attrs;
+		intel_pmu_pebs_data_source_skl(
+			boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X);
 		pr_cont("Skylake events, ");
 		break;
 
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 3ccdf8cb4495..98e36e0c791c 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -49,34 +49,47 @@ union intel_x86_pebs_dse {
  */
 #define P(a, b) PERF_MEM_S(a, b)
 #define OP_LH (P(OP, LOAD) | P(LVL, HIT))
+#define LEVEL(x) P(LVLNUM, x)
+#define REM P(REMOTE, REMOTE)
 #define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS))
 
 /* Version for Sandy Bridge and later */
 static u64 pebs_data_source[] = {
-	P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
-	OP_LH | P(LVL, L1)  | P(SNOOP, NONE),	/* 0x01: L1 local */
-	OP_LH | P(LVL, LFB) | P(SNOOP, NONE),	/* 0x02: LFB hit */
-	OP_LH | P(LVL, L2)  | P(SNOOP, NONE),	/* 0x03: L2 hit */
-	OP_LH | P(LVL, L3)  | P(SNOOP, NONE),	/* 0x04: L3 hit */
-	OP_LH | P(LVL, L3)  | P(SNOOP, MISS),	/* 0x05: L3 hit, snoop miss */
-	OP_LH | P(LVL, L3)  | P(SNOOP, HIT),	/* 0x06: L3 hit, snoop hit */
-	OP_LH | P(LVL, L3)  | P(SNOOP, HITM),	/* 0x07: L3 hit, snoop hitm */
-	OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HIT),  /* 0x08: L3 miss snoop hit */
-	OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
-	OP_LH | P(LVL, LOC_RAM)  | P(SNOOP, HIT),  /* 0x0a: L3 miss, shared */
-	OP_LH | P(LVL, REM_RAM1) | P(SNOOP, HIT),  /* 0x0b: L3 miss, shared */
-	OP_LH | P(LVL, LOC_RAM)  | SNOOP_NONE_MISS,/* 0x0c: L3 miss, excl */
-	OP_LH | P(LVL, REM_RAM1) | SNOOP_NONE_MISS,/* 0x0d: L3 miss, excl */
-	OP_LH | P(LVL, IO)  | P(SNOOP, NONE), /* 0x0e: I/O */
-	OP_LH | P(LVL, UNC) | P(SNOOP, NONE), /* 0x0f: uncached */
+	P(OP, LOAD) | P(LVL, MISS) | LEVEL(L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
+	OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),  /* 0x01: L1 local */
+	OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),  /* 0x03: L2 hit */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, NONE),  /* 0x04: L3 hit */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, MISS),  /* 0x05: L3 hit, snoop miss */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, HIT),   /* 0x06: L3 hit, snoop hit */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, HITM),  /* 0x07: L3 hit, snoop hitm */
+	OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HIT),  /* 0x08: L3 miss snoop hit */
+	OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
+	OP_LH | P(LVL, LOC_RAM)  | LEVEL(RAM) | P(SNOOP, HIT),       /* 0x0a: L3 miss, shared */
+	OP_LH | P(LVL, REM_RAM1) | REM | LEVEL(L3) | P(SNOOP, HIT),  /* 0x0b: L3 miss, shared */
+	OP_LH | P(LVL, LOC_RAM)  | LEVEL(RAM) | SNOOP_NONE_MISS,     /* 0x0c: L3 miss, excl */
+	OP_LH | P(LVL, REM_RAM1) | LEVEL(RAM) | REM | SNOOP_NONE_MISS, /* 0x0d: L3 miss, excl */
+	OP_LH | P(LVL, IO)  | LEVEL(NA) | P(SNOOP, NONE), /* 0x0e: I/O */
+	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0f: uncached */
 };
 
 /* Patch up minor differences in the bits */
 void __init intel_pmu_pebs_data_source_nhm(void)
 {
-	pebs_data_source[0x05] = OP_LH | P(LVL, L3)  | P(SNOOP, HIT);
-	pebs_data_source[0x06] = OP_LH | P(LVL, L3)  | P(SNOOP, HITM);
-	pebs_data_source[0x07] = OP_LH | P(LVL, L3)  | P(SNOOP, HITM);
+	pebs_data_source[0x05] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HIT);
+	pebs_data_source[0x06] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
+	pebs_data_source[0x07] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
+}
+
+void __init intel_pmu_pebs_data_source_skl(bool pmem)
+{
+	u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4);
+
+	pebs_data_source[0x08] = OP_LH | pmem_or_l4 | P(SNOOP, HIT);
+	pebs_data_source[0x09] = OP_LH | pmem_or_l4 | REM | P(SNOOP, HIT);
+	pebs_data_source[0x0b] = OP_LH | LEVEL(RAM) | REM | P(SNOOP, NONE);
+	pebs_data_source[0x0c] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOPX, FWD);
+	pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM);
 }
 
 static u64 precise_store_data(u64 status)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 2e9636e4068f..0f7dad8bd358 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -948,6 +948,8 @@ void intel_pmu_lbr_init_knl(void);
 
 void intel_pmu_pebs_data_source_nhm(void);
 
+void intel_pmu_pebs_data_source_skl(bool pmem);
+
 int intel_pmu_setup_lbr_filter(struct perf_event *event);
 
 void intel_pt_interrupt(void);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 642db5fa3286..2a37ae925d85 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -954,14 +954,20 @@ union perf_mem_data_src {
 			mem_snoop:5,	/* snoop mode */
 			mem_lock:2,	/* lock instr */
 			mem_dtlb:7,	/* tlb access */
-			mem_rsvd:31;
+			mem_lvl_num:4,	/* memory hierarchy level number */
+			mem_remote:1,   /* remote */
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_rsvd:24;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd:31,
+		__u64	mem_rsvd:24,
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_remote:1,   /* remote */
+			mem_lvl_num:4,	/* memory hierarchy level number */
 			mem_dtlb:7,	/* tlb access */
 			mem_lock:2,	/* lock instr */
 			mem_snoop:5,	/* snoop mode */
@@ -998,6 +1004,22 @@ union perf_mem_data_src {
 #define PERF_MEM_LVL_UNC	0x2000 /* Uncached memory */
 #define PERF_MEM_LVL_SHIFT	5
 
+#define PERF_MEM_REMOTE_REMOTE	0x01  /* Remote */
+#define PERF_MEM_REMOTE_SHIFT	37
+
+#define PERF_MEM_LVLNUM_L1	0x01 /* L1 */
+#define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
+#define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
+#define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
+/* 5-0xa available */
+#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
+#define PERF_MEM_LVLNUM_LFB	0x0c /* LFB */
+#define PERF_MEM_LVLNUM_RAM	0x0d /* RAM */
+#define PERF_MEM_LVLNUM_PMEM	0x0e /* PMEM */
+#define PERF_MEM_LVLNUM_NA	0x0f /* N/A */
+
+#define PERF_MEM_LVLNUM_SHIFT	33
+
 /* snoop mode */
 #define PERF_MEM_SNOOP_NA	0x01 /* not available */
 #define PERF_MEM_SNOOP_NONE	0x02 /* no snoop */
@@ -1006,6 +1028,10 @@ union perf_mem_data_src {
 #define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
 #define PERF_MEM_SNOOP_SHIFT	19
 
+#define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
+/* 1 free */
+#define PERF_MEM_SNOOPX_SHIFT	37
+
 /* locked instruction */
 #define PERF_MEM_LOCK_NA	0x01 /* not available */
 #define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings
  2017-08-16 22:21 Fix Skylake PEBS data source for perf v5 Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
@ 2017-08-16 22:21 ` Andi Kleen
  2017-08-23 13:01   ` Jiri Olsa
  2017-08-24  8:23   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  2017-08-16 22:21 ` [PATCH v5 4/4] perf, tools: Add test cases for new data source encoding Andi Kleen
  2017-08-22 15:37 ` Fix Skylake PEBS data source for perf v5 Arnaldo Carvalho de Melo
  4 siblings, 2 replies; 15+ messages in thread
From: Andi Kleen @ 2017-08-16 22:21 UTC (permalink / raw)
  To: peterz, acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add decoding for the new lvlx and snoopx field meminfo field
added earlier to the kernel so that "perf mem report" and
other tools can print it properly.

v2: Merge with persistent memory patch.
Switch to new bit encoding for each combination.
v3: Switch to generic lvlnum field.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++--
 tools/perf/util/mem-events.c          | 43 ++++++++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 642db5fa3286..2a37ae925d85 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -954,14 +954,20 @@ union perf_mem_data_src {
 			mem_snoop:5,	/* snoop mode */
 			mem_lock:2,	/* lock instr */
 			mem_dtlb:7,	/* tlb access */
-			mem_rsvd:31;
+			mem_lvl_num:4,	/* memory hierarchy level number */
+			mem_remote:1,   /* remote */
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_rsvd:24;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd:31,
+		__u64	mem_rsvd:24,
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_remote:1,   /* remote */
+			mem_lvl_num:4,	/* memory hierarchy level number */
 			mem_dtlb:7,	/* tlb access */
 			mem_lock:2,	/* lock instr */
 			mem_snoop:5,	/* snoop mode */
@@ -998,6 +1004,22 @@ union perf_mem_data_src {
 #define PERF_MEM_LVL_UNC	0x2000 /* Uncached memory */
 #define PERF_MEM_LVL_SHIFT	5
 
+#define PERF_MEM_REMOTE_REMOTE	0x01  /* Remote */
+#define PERF_MEM_REMOTE_SHIFT	37
+
+#define PERF_MEM_LVLNUM_L1	0x01 /* L1 */
+#define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
+#define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
+#define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
+/* 5-0xa available */
+#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
+#define PERF_MEM_LVLNUM_LFB	0x0c /* LFB */
+#define PERF_MEM_LVLNUM_RAM	0x0d /* RAM */
+#define PERF_MEM_LVLNUM_PMEM	0x0e /* PMEM */
+#define PERF_MEM_LVLNUM_NA	0x0f /* N/A */
+
+#define PERF_MEM_LVLNUM_SHIFT	33
+
 /* snoop mode */
 #define PERF_MEM_SNOOP_NA	0x01 /* not available */
 #define PERF_MEM_SNOOP_NONE	0x02 /* no snoop */
@@ -1006,6 +1028,10 @@ union perf_mem_data_src {
 #define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
 #define PERF_MEM_SNOOP_SHIFT	19
 
+#define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
+/* 1 free */
+#define PERF_MEM_SNOOPX_SHIFT	37
+
 /* locked instruction */
 #define PERF_MEM_LOCK_NA	0x01 /* not available */
 #define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 06f5a3a4295c..ced4f3fff035 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -166,11 +166,20 @@ static const char * const mem_lvl[] = {
 	"Uncached",
 };
 
+static const char * const mem_lvlnum[] = {
+	[PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
+	[PERF_MEM_LVLNUM_LFB] = "LFB",
+	[PERF_MEM_LVLNUM_RAM] = "RAM",
+	[PERF_MEM_LVLNUM_PMEM] = "PMEM",
+	[PERF_MEM_LVLNUM_NA] = "N/A",
+};
+
 int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 {
 	size_t i, l = 0;
 	u64 m =  PERF_MEM_LVL_NA;
 	u64 hit, miss;
+	int printed;
 
 	if (mem_info)
 		m  = mem_info->data_src.mem_lvl;
@@ -184,17 +193,37 @@ int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 	/* already taken care of */
 	m &= ~(PERF_MEM_LVL_HIT|PERF_MEM_LVL_MISS);
 
+
+	if (mem_info && mem_info->data_src.mem_remote) {
+		strcat(out, "Remote ");
+		l += 7;
+	}
+
+	printed = 0;
 	for (i = 0; m && i < ARRAY_SIZE(mem_lvl); i++, m >>= 1) {
 		if (!(m & 0x1))
 			continue;
-		if (l) {
+		if (printed++) {
 			strcat(out, " or ");
 			l += 4;
 		}
 		l += scnprintf(out + l, sz - l, mem_lvl[i]);
 	}
-	if (*out == '\0')
-		l += scnprintf(out, sz - l, "N/A");
+
+	if (mem_info && mem_info->data_src.mem_lvl_num) {
+		int lvl = mem_info->data_src.mem_lvl_num;
+		if (printed++) {
+			strcat(out, " or ");
+			l += 4;
+		}
+		if (mem_lvlnum[lvl])
+			l += scnprintf(out + l, sz - l, mem_lvlnum[lvl]);
+		else
+			l += scnprintf(out + l, sz - l, "L%d", lvl);
+	}
+
+	if (l == 0)
+		l += scnprintf(out + l, sz - l, "N/A");
 	if (hit)
 		l += scnprintf(out + l, sz - l, " hit");
 	if (miss)
@@ -231,6 +260,14 @@ int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 		}
 		l += scnprintf(out + l, sz - l, snoop_access[i]);
 	}
+	if (mem_info &&
+	     (mem_info->data_src.mem_snoopx & PERF_MEM_SNOOPX_FWD)) {
+		if (l) {
+			strcat(out, " or ");
+			l += 4;
+		}
+		l += scnprintf(out + l, sz - l, "Fwd");
+	}
 
 	if (*out == '\0')
 		l += scnprintf(out, sz - l, "N/A");
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v5 4/4] perf, tools: Add test cases for new data source encoding
  2017-08-16 22:21 Fix Skylake PEBS data source for perf v5 Andi Kleen
                   ` (2 preceding siblings ...)
  2017-08-16 22:21 ` [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
@ 2017-08-16 22:21 ` Andi Kleen
  2017-08-24  8:23   ` [tip:perf/core] perf test: " tip-bot for Andi Kleen
  2017-08-22 15:37 ` Fix Skylake PEBS data source for perf v5 Arnaldo Carvalho de Melo
  4 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2017-08-16 22:21 UTC (permalink / raw)
  To: peterz, acme; +Cc: jolsa, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add some simple tests to perf test to test data source printing.

v2: Make the tests actually checked for the correct name of Forward
v3: Adjust to new encoding
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/tests/Build          |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/mem.c          | 53 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |  1 +
 4 files changed, 59 insertions(+)
 create mode 100644 tools/perf/tests/mem.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 84222bdb8689..87bf3edb037c 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -34,6 +34,7 @@ perf-y += thread-map.o
 perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o
 perf-y += bpf.o
 perf-y += topology.o
+perf-y += mem.o
 perf-y += cpumap.o
 perf-y += stat.o
 perf-y += event_update.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 9ecc44e68990..377bea009163 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -48,6 +48,10 @@ static struct test generic_tests[] = {
 		.func = test__basic_mmap,
 	},
 	{
+		.desc = "Test data source output",
+		.func = test__mem,
+	},
+	{
 		.desc = "Parse event definition strings",
 		.func = test__parse_events,
 	},
diff --git a/tools/perf/tests/mem.c b/tools/perf/tests/mem.c
new file mode 100644
index 000000000000..ee203b9ff44c
--- /dev/null
+++ b/tools/perf/tests/mem.c
@@ -0,0 +1,53 @@
+#include "util/mem-events.h"
+#include "util/symbol.h"
+#include "linux/perf_event.h"
+#include "util/debug.h"
+#include "tests.h"
+#include <string.h>
+
+static int check(union perf_mem_data_src data_src,
+		  const char *string)
+{
+	char out[100];
+	char failure[100];
+	struct mem_info mi = { .data_src = data_src };
+
+	int n;
+
+	n = perf_mem__snp_scnprintf(out, sizeof out, &mi);
+	n += perf_mem__lvl_scnprintf(out + n, sizeof out - n, &mi);
+	snprintf(failure, sizeof failure, "unexpected %s", out);
+	TEST_ASSERT_VAL(failure, !strcmp(string, out));
+	return 0;
+}
+
+int test__mem(struct test *text __maybe_unused, int subtest __maybe_unused)
+{
+	int ret = 0;
+
+	ret |= check(((union perf_mem_data_src) {
+				.mem_lvl = PERF_MEM_LVL_HIT,
+				.mem_lvl_num = 4 }), "N/AL4 hit");
+
+	ret |= check(((union perf_mem_data_src) {
+				.mem_lvl = PERF_MEM_LVL_HIT,
+				.mem_lvl_num = 4,
+				.mem_remote = 1	}), "N/ARemote L4 hit");
+
+	ret |= check(((union perf_mem_data_src) {
+				.mem_lvl = PERF_MEM_LVL_MISS,
+				.mem_lvl_num = PERF_MEM_LVLNUM_PMEM }), "N/APMEM miss");
+
+	ret |= check(((union perf_mem_data_src) {
+				.mem_lvl = PERF_MEM_LVL_MISS,
+				.mem_lvl_num = PERF_MEM_LVLNUM_PMEM,
+				.mem_remote =1 }), "N/ARemote PMEM miss");
+
+	ret |= check(((union perf_mem_data_src) {
+				.mem_snoopx = PERF_MEM_SNOOPX_FWD,
+				.mem_lvl = PERF_MEM_LVL_MISS,
+				.mem_lvl_num = PERF_MEM_LVLNUM_RAM,
+				.mem_remote = 1	}), "FwdRemote RAM miss");
+
+	return ret;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index c46ae818aac8..921412a6a880 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -58,6 +58,7 @@ int test__python_use(struct test *test, int subtest);
 int test__bp_signal(struct test *test, int subtest);
 int test__bp_signal_overflow(struct test *test, int subtest);
 int test__task_exit(struct test *test, int subtest);
+int test__mem(struct test *test, int subtest);
 int test__sw_clock_freq(struct test *test, int subtest);
 int test__code_reading(struct test *test, int subtest);
 int test__sample_parsing(struct test *test, int subtest);
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: Fix Skylake PEBS data source for perf v5
  2017-08-16 22:21 Fix Skylake PEBS data source for perf v5 Andi Kleen
                   ` (3 preceding siblings ...)
  2017-08-16 22:21 ` [PATCH v5 4/4] perf, tools: Add test cases for new data source encoding Andi Kleen
@ 2017-08-22 15:37 ` Arnaldo Carvalho de Melo
  4 siblings, 0 replies; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-08-22 15:37 UTC (permalink / raw)
  To: Andi Kleen; +Cc: peterz, jolsa, linux-kernel

Em Wed, Aug 16, 2017 at 03:21:52PM -0700, Andi Kleen escreveu:
> Fix data source reporting for Skylake and Skylake Server.
> The encodings have changed to express support for L4 and persistent
> memory. 
> 
> The first patch is a (independent) cleanup.
> 
> The second is for the kernel and the third/fourth for perf/tools.
> The kernel part and perf tools will compile independently.

I got the tools part, Peter had some trouble applying it, will get all
merged soon in tip, thanks,

- Arnaldo
 
> Also available in 
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/skx-data-src-7
> 
> v1:
> Initial post
> v2:
> Merged some patches.
> Change encoding to use special bit for each combination instead
> of modifiers.
> v3: 
> Switch to new generic lvlnum indication
> v4:
> Repost. No changes.
> v5:
> ported to latest tree. Retested remote HITM.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings
  2017-08-16 22:21 ` [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
@ 2017-08-23 13:01   ` Jiri Olsa
  2017-08-23 14:00     ` Andi Kleen
  2017-08-24  8:23   ` [tip:perf/core] perf " tip-bot for Andi Kleen
  1 sibling, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2017-08-23 13:01 UTC (permalink / raw)
  To: Andi Kleen; +Cc: peterz, acme, jolsa, linux-kernel, Andi Kleen, Joe Mario

On Wed, Aug 16, 2017 at 03:21:55PM -0700, Andi Kleen wrote:

SNIP

>  int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
>  {
>  	size_t i, l = 0;
>  	u64 m =  PERF_MEM_LVL_NA;
>  	u64 hit, miss;
> +	int printed;
>  
>  	if (mem_info)
>  		m  = mem_info->data_src.mem_lvl;
> @@ -184,17 +193,37 @@ int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
>  	/* already taken care of */
>  	m &= ~(PERF_MEM_LVL_HIT|PERF_MEM_LVL_MISS);
>  
> +
> +	if (mem_info && mem_info->data_src.mem_remote) {
> +		strcat(out, "Remote ");
> +		l += 7;
> +	}

Andi,
how is this 'Remote' different from the remote levels in mem_lvl?

        "Remote RAM (1 hop)",
        "Remote RAM (2 hops)",
        "Remote Cache (1 hop)",
        "Remote Cache (2 hops)",

thanks,
jirka

> +
> +	printed = 0;
>  	for (i = 0; m && i < ARRAY_SIZE(mem_lvl); i++, m >>= 1) {
>  		if (!(m & 0x1))
>  			continue;
> -		if (l) {
> +		if (printed++) {
>  			strcat(out, " or ");
>  			l += 4;
>  		}
>  		l += scnprintf(out + l, sz - l, mem_lvl[i]);
>  	}
> -	if (*out == '\0')
> -		l += scnprintf(out, sz - l, "N/A");
> +
> +	if (mem_info && mem_info->data_src.mem_lvl_num) {
> +		int lvl = mem_info->data_src.mem_lvl_num;
> +		if (printed++) {
> +			strcat(out, " or ");
> +			l += 4;
> +		}
> +		if (mem_lvlnum[lvl])
> +			l += scnprintf(out + l, sz - l, mem_lvlnum[lvl]);
> +		else
> +			l += scnprintf(out + l, sz - l, "L%d", lvl);
> +	}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings
  2017-08-23 13:01   ` Jiri Olsa
@ 2017-08-23 14:00     ` Andi Kleen
  2017-08-23 14:14       ` Jiri Olsa
  0 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2017-08-23 14:00 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, peterz, acme, jolsa, linux-kernel, Joe Mario

> Andi,
> how is this 'Remote' different from the remote levels in mem_lvl?
> 
>         "Remote RAM (1 hop)",
>         "Remote RAM (2 hops)",
>         "Remote Cache (1 hop)",
>         "Remote Cache (2 hops)",

It applies to any other level. This is needed to express
"Remote unknown level", as is reported by Skylake.

-Andi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings
  2017-08-23 14:00     ` Andi Kleen
@ 2017-08-23 14:14       ` Jiri Olsa
  2017-08-23 15:59         ` Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Jiri Olsa @ 2017-08-23 14:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andi Kleen, peterz, acme, jolsa, linux-kernel, Joe Mario

On Wed, Aug 23, 2017 at 07:00:32AM -0700, Andi Kleen wrote:
> > Andi,
> > how is this 'Remote' different from the remote levels in mem_lvl?
> > 
> >         "Remote RAM (1 hop)",
> >         "Remote RAM (2 hops)",
> >         "Remote Cache (1 hop)",
> >         "Remote Cache (2 hops)",
> 
> It applies to any other level. This is needed to express
> "Remote unknown level", as is reported by Skylake.

so if I find HITM with this flag set I should count it
as remote HITM then? something like attached.. untested

thanks,
jirka

---
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 06f5a3a4295c..65e22b9e59f9 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -279,6 +279,7 @@ int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi)
 	u64 lvl    = data_src->mem_lvl;
 	u64 snoop  = data_src->mem_snoop;
 	u64 lock   = data_src->mem_lock;
+	bool mr    = data_src->mem_remote;
 	int err = 0;
 
 #define HITM_INC(__f)		\
@@ -324,7 +325,7 @@ do {				\
 			}
 
 			if ((lvl & P(LVL, REM_RAM1)) ||
-			    (lvl & P(LVL, REM_RAM2))) {
+			    (lvl & P(LVL, REM_RAM2)) || mr) {
 				stats->rmt_dram++;
 				if (snoop & P(SNOOP, HIT))
 					stats->ld_shared++;
@@ -334,7 +335,7 @@ do {				\
 		}
 
 		if ((lvl & P(LVL, REM_CCE1)) ||
-		    (lvl & P(LVL, REM_CCE2))) {
+		    (lvl & P(LVL, REM_CCE2)) || mr) {
 			if (snoop & P(SNOOP, HIT))
 				stats->rmt_hit++;
 			else if (snoop & P(SNOOP, HITM))

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings
  2017-08-23 14:14       ` Jiri Olsa
@ 2017-08-23 15:59         ` Andi Kleen
  2017-08-24  8:23           ` Jiri Olsa
  0 siblings, 1 reply; 15+ messages in thread
From: Andi Kleen @ 2017-08-23 15:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andi Kleen, Andi Kleen, peterz, acme, jolsa, linux-kernel, Joe Mario

> so if I find HITM with this flag set I should count it
> as remote HITM then? something like attached.. untested

You mean for c2c? Yes looks reasonable.

-Andi

> 
> thanks,
> jirka
> 
> ---
> diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> index 06f5a3a4295c..65e22b9e59f9 100644
> --- a/tools/perf/util/mem-events.c
> +++ b/tools/perf/util/mem-events.c
> @@ -279,6 +279,7 @@ int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi)
>  	u64 lvl    = data_src->mem_lvl;
>  	u64 snoop  = data_src->mem_snoop;
>  	u64 lock   = data_src->mem_lock;
> +	bool mr    = data_src->mem_remote;
>  	int err = 0;
>  
>  #define HITM_INC(__f)		\
> @@ -324,7 +325,7 @@ do {				\
>  			}
>  
>  			if ((lvl & P(LVL, REM_RAM1)) ||
> -			    (lvl & P(LVL, REM_RAM2))) {
> +			    (lvl & P(LVL, REM_RAM2)) || mr) {
>  				stats->rmt_dram++;
>  				if (snoop & P(SNOOP, HIT))
>  					stats->ld_shared++;
> @@ -334,7 +335,7 @@ do {				\
>  		}
>  
>  		if ((lvl & P(LVL, REM_CCE1)) ||
> -		    (lvl & P(LVL, REM_CCE2))) {
> +		    (lvl & P(LVL, REM_CCE2)) || mr) {
>  			if (snoop & P(SNOOP, HIT))
>  				stats->rmt_hit++;
>  			else if (snoop & P(SNOOP, HITM))
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [tip:perf/core] perf tools: Add support for printing new mem_info encodings
  2017-08-16 22:21 ` [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
  2017-08-23 13:01   ` Jiri Olsa
@ 2017-08-24  8:23   ` tip-bot for Andi Kleen
  1 sibling, 0 replies; 15+ messages in thread
From: tip-bot for Andi Kleen @ 2017-08-24  8:23 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, acme, mingo, peterz, hpa, ak, tglx, jolsa

Commit-ID:  52839e653b5629bd237ad2ecdd8fdd897fcd5712
Gitweb:     http://git.kernel.org/tip/52839e653b5629bd237ad2ecdd8fdd897fcd5712
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 16 Aug 2017 15:21:55 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 22 Aug 2017 12:30:25 -0300

perf tools: Add support for printing new mem_info encodings

Add decoding for the new "lvlx" and "snoopx" meminfo fields added
earlier to the kernel so that "perf mem report" and other tools can
print it properly.

v2: Merge with persistent memory patch.
Switch to new bit encoding for each combination.

v3: Switch to generic lvlnum field.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170816222156.19953-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++--
 tools/perf/util/mem-events.c          | 43 ++++++++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 5 deletions(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 642db5f..2a37ae9 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -954,14 +954,20 @@ union perf_mem_data_src {
 			mem_snoop:5,	/* snoop mode */
 			mem_lock:2,	/* lock instr */
 			mem_dtlb:7,	/* tlb access */
-			mem_rsvd:31;
+			mem_lvl_num:4,	/* memory hierarchy level number */
+			mem_remote:1,   /* remote */
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_rsvd:24;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd:31,
+		__u64	mem_rsvd:24,
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_remote:1,   /* remote */
+			mem_lvl_num:4,	/* memory hierarchy level number */
 			mem_dtlb:7,	/* tlb access */
 			mem_lock:2,	/* lock instr */
 			mem_snoop:5,	/* snoop mode */
@@ -998,6 +1004,22 @@ union perf_mem_data_src {
 #define PERF_MEM_LVL_UNC	0x2000 /* Uncached memory */
 #define PERF_MEM_LVL_SHIFT	5
 
+#define PERF_MEM_REMOTE_REMOTE	0x01  /* Remote */
+#define PERF_MEM_REMOTE_SHIFT	37
+
+#define PERF_MEM_LVLNUM_L1	0x01 /* L1 */
+#define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
+#define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
+#define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
+/* 5-0xa available */
+#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
+#define PERF_MEM_LVLNUM_LFB	0x0c /* LFB */
+#define PERF_MEM_LVLNUM_RAM	0x0d /* RAM */
+#define PERF_MEM_LVLNUM_PMEM	0x0e /* PMEM */
+#define PERF_MEM_LVLNUM_NA	0x0f /* N/A */
+
+#define PERF_MEM_LVLNUM_SHIFT	33
+
 /* snoop mode */
 #define PERF_MEM_SNOOP_NA	0x01 /* not available */
 #define PERF_MEM_SNOOP_NONE	0x02 /* no snoop */
@@ -1006,6 +1028,10 @@ union perf_mem_data_src {
 #define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
 #define PERF_MEM_SNOOP_SHIFT	19
 
+#define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
+/* 1 free */
+#define PERF_MEM_SNOOPX_SHIFT	37
+
 /* locked instruction */
 #define PERF_MEM_LOCK_NA	0x01 /* not available */
 #define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 06f5a3a..ced4f3f 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -166,11 +166,20 @@ static const char * const mem_lvl[] = {
 	"Uncached",
 };
 
+static const char * const mem_lvlnum[] = {
+	[PERF_MEM_LVLNUM_ANY_CACHE] = "Any cache",
+	[PERF_MEM_LVLNUM_LFB] = "LFB",
+	[PERF_MEM_LVLNUM_RAM] = "RAM",
+	[PERF_MEM_LVLNUM_PMEM] = "PMEM",
+	[PERF_MEM_LVLNUM_NA] = "N/A",
+};
+
 int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 {
 	size_t i, l = 0;
 	u64 m =  PERF_MEM_LVL_NA;
 	u64 hit, miss;
+	int printed;
 
 	if (mem_info)
 		m  = mem_info->data_src.mem_lvl;
@@ -184,17 +193,37 @@ int perf_mem__lvl_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 	/* already taken care of */
 	m &= ~(PERF_MEM_LVL_HIT|PERF_MEM_LVL_MISS);
 
+
+	if (mem_info && mem_info->data_src.mem_remote) {
+		strcat(out, "Remote ");
+		l += 7;
+	}
+
+	printed = 0;
 	for (i = 0; m && i < ARRAY_SIZE(mem_lvl); i++, m >>= 1) {
 		if (!(m & 0x1))
 			continue;
-		if (l) {
+		if (printed++) {
 			strcat(out, " or ");
 			l += 4;
 		}
 		l += scnprintf(out + l, sz - l, mem_lvl[i]);
 	}
-	if (*out == '\0')
-		l += scnprintf(out, sz - l, "N/A");
+
+	if (mem_info && mem_info->data_src.mem_lvl_num) {
+		int lvl = mem_info->data_src.mem_lvl_num;
+		if (printed++) {
+			strcat(out, " or ");
+			l += 4;
+		}
+		if (mem_lvlnum[lvl])
+			l += scnprintf(out + l, sz - l, mem_lvlnum[lvl]);
+		else
+			l += scnprintf(out + l, sz - l, "L%d", lvl);
+	}
+
+	if (l == 0)
+		l += scnprintf(out + l, sz - l, "N/A");
 	if (hit)
 		l += scnprintf(out + l, sz - l, " hit");
 	if (miss)
@@ -231,6 +260,14 @@ int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 		}
 		l += scnprintf(out + l, sz - l, snoop_access[i]);
 	}
+	if (mem_info &&
+	     (mem_info->data_src.mem_snoopx & PERF_MEM_SNOOPX_FWD)) {
+		if (l) {
+			strcat(out, " or ");
+			l += 4;
+		}
+		l += scnprintf(out + l, sz - l, "Fwd");
+	}
 
 	if (*out == '\0')
 		l += scnprintf(out, sz - l, "N/A");

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings
  2017-08-23 15:59         ` Andi Kleen
@ 2017-08-24  8:23           ` Jiri Olsa
  0 siblings, 0 replies; 15+ messages in thread
From: Jiri Olsa @ 2017-08-24  8:23 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andi Kleen, peterz, acme, jolsa, linux-kernel, Joe Mario

On Wed, Aug 23, 2017 at 08:59:00AM -0700, Andi Kleen wrote:
> > so if I find HITM with this flag set I should count it
> > as remote HITM then? something like attached.. untested
> 
> You mean for c2c? Yes looks reasonable.

yes, it seems to fix c2c to find remote HITMs again
on skylake.. I'll post the full patch soon

thanks,
jirka

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [tip:perf/core] perf test: Add test cases for new data source encoding
  2017-08-16 22:21 ` [PATCH v5 4/4] perf, tools: Add test cases for new data source encoding Andi Kleen
@ 2017-08-24  8:23   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Andi Kleen @ 2017-08-24  8:23 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: tglx, ak, jolsa, hpa, acme, mingo, peterz, linux-kernel

Commit-ID:  3067eaa7ce2dbcde89d87277cdbc91c211480060
Gitweb:     http://git.kernel.org/tip/3067eaa7ce2dbcde89d87277cdbc91c211480060
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 16 Aug 2017 15:21:56 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 22 Aug 2017 13:23:10 -0300

perf test: Add test cases for new data source encoding

Add some simple tests to perf test to test data source printing.

v2: Make the tests actually checked for the correct name of Forward
v3: Adjust to new encoding

Committer notes:

Avoid the in place declaration to make this build with older compilers,
for instance, in Debian 7 we get:

  tests/mem.c: In function 'test__mem':
  tests/mem.c:30:5: error: missing initializer [-Werror=missing-field-initializers]
  tests/mem.c:30:5: error: (near initialization for '(anonymous).<anonymous>.mem_snoop') [-Werror=missing-field-initializers]

So just zero a struct, then go on building the unions as needed,
reusing settings from the previous test, i.e. local -> remote, etc.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170816222156.19953-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/tests/Build          |  1 +
 tools/perf/tests/builtin-test.c |  4 +++
 tools/perf/tests/mem.c          | 56 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |  1 +
 4 files changed, 62 insertions(+)

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 84222bd..87bf3ed 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -34,6 +34,7 @@ perf-y += thread-map.o
 perf-y += llvm.o llvm-src-base.o llvm-src-kbuild.o llvm-src-prologue.o llvm-src-relocation.o
 perf-y += bpf.o
 perf-y += topology.o
+perf-y += mem.o
 perf-y += cpumap.o
 perf-y += stat.o
 perf-y += event_update.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 9ecc44e..377bea0 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -48,6 +48,10 @@ static struct test generic_tests[] = {
 		.func = test__basic_mmap,
 	},
 	{
+		.desc = "Test data source output",
+		.func = test__mem,
+	},
+	{
 		.desc = "Parse event definition strings",
 		.func = test__parse_events,
 	},
diff --git a/tools/perf/tests/mem.c b/tools/perf/tests/mem.c
new file mode 100644
index 0000000..21952e1
--- /dev/null
+++ b/tools/perf/tests/mem.c
@@ -0,0 +1,56 @@
+#include "util/mem-events.h"
+#include "util/symbol.h"
+#include "linux/perf_event.h"
+#include "util/debug.h"
+#include "tests.h"
+#include <string.h>
+
+static int check(union perf_mem_data_src data_src,
+		  const char *string)
+{
+	char out[100];
+	char failure[100];
+	struct mem_info mi = { .data_src = data_src };
+
+	int n;
+
+	n = perf_mem__snp_scnprintf(out, sizeof out, &mi);
+	n += perf_mem__lvl_scnprintf(out + n, sizeof out - n, &mi);
+	snprintf(failure, sizeof failure, "unexpected %s", out);
+	TEST_ASSERT_VAL(failure, !strcmp(string, out));
+	return 0;
+}
+
+int test__mem(struct test *text __maybe_unused, int subtest __maybe_unused)
+{
+	int ret = 0;
+	union perf_mem_data_src src;
+
+	memset(&src, 0, sizeof(src));
+
+	src.mem_lvl = PERF_MEM_LVL_HIT;
+	src.mem_lvl_num = 4;
+
+	ret |= check(src, "N/AL4 hit");
+
+	src.mem_remote = 1;
+
+	ret |= check(src, "N/ARemote L4 hit");
+
+	src.mem_lvl = PERF_MEM_LVL_MISS;
+	src.mem_lvl_num = PERF_MEM_LVLNUM_PMEM;
+	src.mem_remote = 0;
+
+	ret |= check(src, "N/APMEM miss");
+
+	src.mem_remote = 1;
+
+	ret |= check(src, "N/ARemote PMEM miss");
+
+	src.mem_snoopx = PERF_MEM_SNOOPX_FWD;
+	src.mem_lvl_num = PERF_MEM_LVLNUM_RAM;
+
+	ret |= check(src , "FwdRemote RAM miss");
+
+	return ret;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index c46ae81..921412a 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -58,6 +58,7 @@ int test__python_use(struct test *test, int subtest);
 int test__bp_signal(struct test *test, int subtest);
 int test__bp_signal_overflow(struct test *test, int subtest);
 int test__task_exit(struct test *test, int subtest);
+int test__mem(struct test *test, int subtest);
 int test__sw_clock_freq(struct test *test, int subtest);
 int test__code_reading(struct test *test, int subtest);
 int test__sample_parsing(struct test *test, int subtest);

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [tip:perf/core] perf/x86: Move Nehalem PEBS code to flag
  2017-08-16 22:21 ` [PATCH v5 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
@ 2017-08-25 11:53   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Andi Kleen @ 2017-08-25 11:53 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: mingo, tglx, linux-kernel, peterz, hpa, ak, torvalds

Commit-ID:  95298355143f9765f0d40ed57dce7fa6571cc623
Gitweb:     http://git.kernel.org/tip/95298355143f9765f0d40ed57dce7fa6571cc623
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 16 Aug 2017 15:21:53 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 25 Aug 2017 11:04:16 +0200

perf/x86: Move Nehalem PEBS code to flag

Minor cleanup: use an explicit x86_pmu flag to handle the
missing Lock / TLB information on Nehalem, instead of always
checking the model number for each PEBS sample.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/20170816222156.19953-2-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/core.c | 1 +
 arch/x86/events/intel/ds.c   | 5 +----
 arch/x86/events/perf_event.h | 3 ++-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 98b0f07..c3439a3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3905,6 +3905,7 @@ __init int intel_pmu_init(void)
 
 		intel_pmu_pebs_data_source_nhm();
 		x86_add_quirk(intel_nehalem_quirk);
+		x86_pmu.pebs_no_tlb = 1;
 
 		pr_cont("Nehalem events, ");
 		break;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a322fed..3ccdf8c 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -149,8 +149,6 @@ static u64 load_latency_data(u64 status)
 {
 	union intel_x86_pebs_dse dse;
 	u64 val;
-	int model = boot_cpu_data.x86_model;
-	int fam = boot_cpu_data.x86;
 
 	dse.val = status;
 
@@ -162,8 +160,7 @@ static u64 load_latency_data(u64 status)
 	/*
 	 * Nehalem models do not support TLB, Lock infos
 	 */
-	if (fam == 0x6 && (model == 26 || model == 30
-	    || model == 31 || model == 46)) {
+	if (x86_pmu.pebs_no_tlb) {
 		val |= P(TLB, NA) | P(LOCK, NA);
 		return val;
 	}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 476aec3..2e9636e 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -591,7 +591,8 @@ struct x86_pmu {
 			pebs		:1,
 			pebs_active	:1,
 			pebs_broken	:1,
-			pebs_prec_dist	:1;
+			pebs_prec_dist	:1,
+			pebs_no_tlb	:1;
 	int		pebs_record_size;
 	int		pebs_buffer_size;
 	void		(*drain_pebs)(struct pt_regs *regs);

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [tip:perf/core] perf/x86: Fix data source decoding for Skylake
  2017-08-16 22:21 ` [PATCH v5 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
@ 2017-08-25 11:53   ` tip-bot for Andi Kleen
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Andi Kleen @ 2017-08-25 11:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mpe, mingo, peterz, tglx, ak, hpa, torvalds, linux-kernel, maddy

Commit-ID:  6ae5fa61d27dcb055f4198bcf6c8dbbf1bb33f52
Gitweb:     http://git.kernel.org/tip/6ae5fa61d27dcb055f4198bcf6c8dbbf1bb33f52
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 16 Aug 2017 15:21:54 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 25 Aug 2017 11:04:17 +0200

perf/x86: Fix data source decoding for Skylake

Skylake changed the encoding of the PEBS data source field.
Some combinations are not available anymore, but some new cases
e.g. for L4 cache hit are added.

Fix up the conversion table for Skylake, similar as had been done
for Nehalem.

On Skylake server the encoding for L4 actually means persistent
memory. Handle this case too.

To properly describe it in the abstracted perf format I had to add
some new fields. Since a hit can have only one level add a new
field that is an enumeration, not a bit field to describe
the level. It can describe any level. Some numbers are also
used to describe PMEM and LFB.

Also add a new generic remote flag that can be combined with
the generic level to signify a remote cache.

And there is an extension field for the snoop indication to handle
the Forward state.

I didn't add a generic flag for hops because it's not needed
for Skylake.

I changed the existing encodings for older CPUs to also fill in the
new level and remote fields.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/20170816222156.19953-3-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/events/intel/core.c    |  2 ++
 arch/x86/events/intel/ds.c      | 51 ++++++++++++++++++++++++++---------------
 arch/x86/events/perf_event.h    |  2 ++
 include/uapi/linux/perf_event.h | 30 ++++++++++++++++++++++--
 4 files changed, 64 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index c3439a3..6f34200 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4208,6 +4208,8 @@ __init int intel_pmu_init(void)
 						  skl_format_attr);
 		WARN_ON(!x86_pmu.format_attrs);
 		x86_pmu.cpu_events = hsw_events_attrs;
+		intel_pmu_pebs_data_source_skl(
+			boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X);
 		pr_cont("Skylake events, ");
 		break;
 
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 3ccdf8c..98e36e0 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -49,34 +49,47 @@ union intel_x86_pebs_dse {
  */
 #define P(a, b) PERF_MEM_S(a, b)
 #define OP_LH (P(OP, LOAD) | P(LVL, HIT))
+#define LEVEL(x) P(LVLNUM, x)
+#define REM P(REMOTE, REMOTE)
 #define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS))
 
 /* Version for Sandy Bridge and later */
 static u64 pebs_data_source[] = {
-	P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
-	OP_LH | P(LVL, L1)  | P(SNOOP, NONE),	/* 0x01: L1 local */
-	OP_LH | P(LVL, LFB) | P(SNOOP, NONE),	/* 0x02: LFB hit */
-	OP_LH | P(LVL, L2)  | P(SNOOP, NONE),	/* 0x03: L2 hit */
-	OP_LH | P(LVL, L3)  | P(SNOOP, NONE),	/* 0x04: L3 hit */
-	OP_LH | P(LVL, L3)  | P(SNOOP, MISS),	/* 0x05: L3 hit, snoop miss */
-	OP_LH | P(LVL, L3)  | P(SNOOP, HIT),	/* 0x06: L3 hit, snoop hit */
-	OP_LH | P(LVL, L3)  | P(SNOOP, HITM),	/* 0x07: L3 hit, snoop hitm */
-	OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HIT),  /* 0x08: L3 miss snoop hit */
-	OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
-	OP_LH | P(LVL, LOC_RAM)  | P(SNOOP, HIT),  /* 0x0a: L3 miss, shared */
-	OP_LH | P(LVL, REM_RAM1) | P(SNOOP, HIT),  /* 0x0b: L3 miss, shared */
-	OP_LH | P(LVL, LOC_RAM)  | SNOOP_NONE_MISS,/* 0x0c: L3 miss, excl */
-	OP_LH | P(LVL, REM_RAM1) | SNOOP_NONE_MISS,/* 0x0d: L3 miss, excl */
-	OP_LH | P(LVL, IO)  | P(SNOOP, NONE), /* 0x0e: I/O */
-	OP_LH | P(LVL, UNC) | P(SNOOP, NONE), /* 0x0f: uncached */
+	P(OP, LOAD) | P(LVL, MISS) | LEVEL(L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
+	OP_LH | P(LVL, L1)  | LEVEL(L1) | P(SNOOP, NONE),  /* 0x01: L1 local */
+	OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x02: LFB hit */
+	OP_LH | P(LVL, L2)  | LEVEL(L2) | P(SNOOP, NONE),  /* 0x03: L2 hit */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, NONE),  /* 0x04: L3 hit */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, MISS),  /* 0x05: L3 hit, snoop miss */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, HIT),   /* 0x06: L3 hit, snoop hit */
+	OP_LH | P(LVL, L3)  | LEVEL(L3) | P(SNOOP, HITM),  /* 0x07: L3 hit, snoop hitm */
+	OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HIT),  /* 0x08: L3 miss snoop hit */
+	OP_LH | P(LVL, REM_CCE1) | REM | LEVEL(L3) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/
+	OP_LH | P(LVL, LOC_RAM)  | LEVEL(RAM) | P(SNOOP, HIT),       /* 0x0a: L3 miss, shared */
+	OP_LH | P(LVL, REM_RAM1) | REM | LEVEL(L3) | P(SNOOP, HIT),  /* 0x0b: L3 miss, shared */
+	OP_LH | P(LVL, LOC_RAM)  | LEVEL(RAM) | SNOOP_NONE_MISS,     /* 0x0c: L3 miss, excl */
+	OP_LH | P(LVL, REM_RAM1) | LEVEL(RAM) | REM | SNOOP_NONE_MISS, /* 0x0d: L3 miss, excl */
+	OP_LH | P(LVL, IO)  | LEVEL(NA) | P(SNOOP, NONE), /* 0x0e: I/O */
+	OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0f: uncached */
 };
 
 /* Patch up minor differences in the bits */
 void __init intel_pmu_pebs_data_source_nhm(void)
 {
-	pebs_data_source[0x05] = OP_LH | P(LVL, L3)  | P(SNOOP, HIT);
-	pebs_data_source[0x06] = OP_LH | P(LVL, L3)  | P(SNOOP, HITM);
-	pebs_data_source[0x07] = OP_LH | P(LVL, L3)  | P(SNOOP, HITM);
+	pebs_data_source[0x05] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HIT);
+	pebs_data_source[0x06] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
+	pebs_data_source[0x07] = OP_LH | P(LVL, L3) | LEVEL(L3) | P(SNOOP, HITM);
+}
+
+void __init intel_pmu_pebs_data_source_skl(bool pmem)
+{
+	u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4);
+
+	pebs_data_source[0x08] = OP_LH | pmem_or_l4 | P(SNOOP, HIT);
+	pebs_data_source[0x09] = OP_LH | pmem_or_l4 | REM | P(SNOOP, HIT);
+	pebs_data_source[0x0b] = OP_LH | LEVEL(RAM) | REM | P(SNOOP, NONE);
+	pebs_data_source[0x0c] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOPX, FWD);
+	pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM);
 }
 
 static u64 precise_store_data(u64 status)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 2e9636e..0f7dad8 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -948,6 +948,8 @@ void intel_pmu_lbr_init_knl(void);
 
 void intel_pmu_pebs_data_source_nhm(void);
 
+void intel_pmu_pebs_data_source_skl(bool pmem);
+
 int intel_pmu_setup_lbr_filter(struct perf_event *event);
 
 void intel_pt_interrupt(void);
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 642db5f..2a37ae9 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -954,14 +954,20 @@ union perf_mem_data_src {
 			mem_snoop:5,	/* snoop mode */
 			mem_lock:2,	/* lock instr */
 			mem_dtlb:7,	/* tlb access */
-			mem_rsvd:31;
+			mem_lvl_num:4,	/* memory hierarchy level number */
+			mem_remote:1,   /* remote */
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_rsvd:24;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd:31,
+		__u64	mem_rsvd:24,
+			mem_snoopx:2,	/* snoop mode, ext */
+			mem_remote:1,   /* remote */
+			mem_lvl_num:4,	/* memory hierarchy level number */
 			mem_dtlb:7,	/* tlb access */
 			mem_lock:2,	/* lock instr */
 			mem_snoop:5,	/* snoop mode */
@@ -998,6 +1004,22 @@ union perf_mem_data_src {
 #define PERF_MEM_LVL_UNC	0x2000 /* Uncached memory */
 #define PERF_MEM_LVL_SHIFT	5
 
+#define PERF_MEM_REMOTE_REMOTE	0x01  /* Remote */
+#define PERF_MEM_REMOTE_SHIFT	37
+
+#define PERF_MEM_LVLNUM_L1	0x01 /* L1 */
+#define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
+#define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
+#define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
+/* 5-0xa available */
+#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
+#define PERF_MEM_LVLNUM_LFB	0x0c /* LFB */
+#define PERF_MEM_LVLNUM_RAM	0x0d /* RAM */
+#define PERF_MEM_LVLNUM_PMEM	0x0e /* PMEM */
+#define PERF_MEM_LVLNUM_NA	0x0f /* N/A */
+
+#define PERF_MEM_LVLNUM_SHIFT	33
+
 /* snoop mode */
 #define PERF_MEM_SNOOP_NA	0x01 /* not available */
 #define PERF_MEM_SNOOP_NONE	0x02 /* no snoop */
@@ -1006,6 +1028,10 @@ union perf_mem_data_src {
 #define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
 #define PERF_MEM_SNOOP_SHIFT	19
 
+#define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
+/* 1 free */
+#define PERF_MEM_SNOOPX_SHIFT	37
+
 /* locked instruction */
 #define PERF_MEM_LOCK_NA	0x01 /* not available */
 #define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-08-25 11:58 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-16 22:21 Fix Skylake PEBS data source for perf v5 Andi Kleen
2017-08-16 22:21 ` [PATCH v5 1/4] perf/x86: Move Nehalem PEBS code to flag Andi Kleen
2017-08-25 11:53   ` [tip:perf/core] " tip-bot for Andi Kleen
2017-08-16 22:21 ` [PATCH v5 2/4] perf/x86: Fix data source decoding for Skylake Andi Kleen
2017-08-25 11:53   ` [tip:perf/core] " tip-bot for Andi Kleen
2017-08-16 22:21 ` [PATCH v5 3/4] perf, tools: Add support for printing new mem_info encodings Andi Kleen
2017-08-23 13:01   ` Jiri Olsa
2017-08-23 14:00     ` Andi Kleen
2017-08-23 14:14       ` Jiri Olsa
2017-08-23 15:59         ` Andi Kleen
2017-08-24  8:23           ` Jiri Olsa
2017-08-24  8:23   ` [tip:perf/core] perf " tip-bot for Andi Kleen
2017-08-16 22:21 ` [PATCH v5 4/4] perf, tools: Add test cases for new data source encoding Andi Kleen
2017-08-24  8:23   ` [tip:perf/core] perf test: " tip-bot for Andi Kleen
2017-08-22 15:37 ` Fix Skylake PEBS data source for perf v5 Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).