linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/7] Cavium ThunderX uncore PMU support
@ 2016-02-12 16:55 Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX Jan Glauber
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Hi,

this patch series provides access to various performance counters on the ThunderX SoC.

For details of the implementation see patch #1.

Patches #2-7 add the various ThunderX-specific PMUs.

I did not want to put these files directly into arch/arm64/kernel, so I
added an "uncore" directory. Maybe this should live somewhere under
drivers/ instead.

Feedback welcome!

Jan


Jan Glauber (7):
  arm64/perf: Basic uncore counter support for Cavium ThunderX
  arm64/perf: Cavium ThunderX L2C TAD uncore support
  arm64/perf: Cavium ThunderX L2C CBC uncore support
  arm64/perf: Cavium ThunderX LMC uncore support
  arm64/perf: Cavium ThunderX OCX LNE uncore support
  arm64/perf: Cavium ThunderX OCX FRC uncore support
  arm64/perf: Cavium ThunderX OCX TLK uncore support

 arch/arm64/kernel/Makefile                       |   1 +
 arch/arm64/kernel/uncore/Makefile                |   7 +
 arch/arm64/kernel/uncore/uncore_cavium.c         | 229 +++++++++
 arch/arm64/kernel/uncore/uncore_cavium.h         |  97 ++++
 arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c | 239 +++++++++
 arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c | 600 +++++++++++++++++++++++
 arch/arm64/kernel/uncore/uncore_cavium_lmc.c     | 201 ++++++++
 arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c | 248 ++++++++++
 arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c | 270 ++++++++++
 arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c | 366 ++++++++++++++
 10 files changed, 2258 insertions(+)
 create mode 100644 arch/arm64/kernel/uncore/Makefile
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium.h
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_lmc.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c

-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 17:36   ` Mark Rutland
  2016-02-12 16:55 ` [RFC PATCH 2/7] arm64/perf: Cavium ThunderX L2C TAD uncore support Jan Glauber
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Provide uncore facilities for non-CPU performance counter units.
Based on the Intel/AMD uncore PMU support.

The uncore PMUs can be found under /sys/bus/event_source/devices.
All counters are exported via sysfs in the corresponding events
files under the PMU directory so the perf tool can list the event names.

Two points are special in this implementation:

1) The PMU detection solely relies on PCI device detection. If a
   matching PCI device is found the PMU is created. The code can deal
   with multiple units of the same type, e.g. more than one memory
   controller.

2) Counters are summarized across the different units of the same type,
   e.g. L2C TAD 0..7 is presented as a single counter (adding the
   values from TAD 0 to 7). Although this loses the ability to read a
   single unit's value, the merged values are easier to use and yield
   enough information.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/Makefile               |   1 +
 arch/arm64/kernel/uncore/Makefile        |   1 +
 arch/arm64/kernel/uncore/uncore_cavium.c | 210 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/uncore/uncore_cavium.h |  73 +++++++++++
 4 files changed, 285 insertions(+)
 create mode 100644 arch/arm64/kernel/uncore/Makefile
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium.c
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium.h

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e6..c2d2810 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -42,6 +42,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
+arm64-obj-$(CONFIG_ARCH_THUNDER)	+= uncore/
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
new file mode 100644
index 0000000..b9c72c2
--- /dev/null
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
new file mode 100644
index 0000000..0cfcc83
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -0,0 +1,210 @@
+/*
+ * Cavium Thunder uncore PMU support. Derived from Intel and AMD uncore code.
+ *
+ * Copyright (C) 2015,2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+int thunder_uncore_version;
+
+struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
+{
+	return NULL;
+}
+
+void thunder_uncore_read(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new = 0;
+	s64 delta;
+	int i;
+
+	/*
+	 * since we do not enable counter overflow interrupts,
+	 * we do not have to worry about prev_count changing on us
+	 */
+
+	prev = local64_read(&hwc->prev_count);
+
+	/* read counter values from all units */
+	for (i = 0; i < uncore->nr_units; i++)
+		new += readq(map_offset(hwc->event_base, uncore, i));
+
+	local64_set(&hwc->prev_count, new);
+	delta = new - prev;
+	local64_add(delta, &event->count);
+}
+
+void thunder_uncore_del(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	event->pmu->stop(event, PERF_EF_UPDATE);
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (cmpxchg(&uncore->events[i], event, NULL) == event)
+			break;
+	}
+	hwc->idx = -1;
+}
+
+int thunder_uncore_event_init(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore *uncore;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* we do not support sampling */
+	if (is_sampling_event(event))
+		return -EINVAL;
+
+	/* counters do not have these bits */
+	if (event->attr.exclude_user	||
+	    event->attr.exclude_kernel	||
+	    event->attr.exclude_host	||
+	    event->attr.exclude_guest	||
+	    event->attr.exclude_hv	||
+	    event->attr.exclude_idle)
+		return -EINVAL;
+
+	/* and we do not enable counter overflow interrupts */
+
+	uncore = event_to_thunder_uncore(event);
+	if (!uncore)
+		return -ENODEV;
+	if (!uncore->event_valid(event->attr.config))
+		return -EINVAL;
+
+	hwc->config = event->attr.config;
+	hwc->idx = -1;
+
+	/* and we don't care about CPU */
+
+	return 0;
+}
+
+static cpumask_t thunder_active_mask;
+
+static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
+						struct device_attribute *attr,
+						char *buf)
+{
+	cpumask_t *active_mask = &thunder_active_mask;
+
+	/*
+	 * Thunder uncore events are independent from CPUs. Provide a cpumask
+	 * nevertheless to prevent perf from adding the event per-cpu and just
+	 * set the mask to one online CPU.
+	 */
+	cpumask_set_cpu(cpumask_first(cpu_online_mask), active_mask);
+
+	return cpumap_print_to_pagebuf(true, buf, active_mask);
+}
+static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
+
+static struct attribute *thunder_uncore_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+struct attribute_group thunder_uncore_attr_group = {
+	.attrs = thunder_uncore_attrs,
+};
+
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr =
+		container_of(attr, struct perf_pmu_events_attr, attr);
+
+	if (pmu_attr->event_str)
+		return sprintf(page, "%s", pmu_attr->event_str);
+
+	return 0;
+}
+
+int __init thunder_uncore_setup(struct thunder_uncore *uncore, int id,
+			 unsigned long offset, unsigned long size,
+			 struct pmu *pmu)
+{
+	struct pci_dev *pdev = NULL;
+	pci_bus_addr_t start;
+	int ret, node = 0;
+
+	/* detect PCI devices */
+	do {
+		pdev = pci_get_device(PCI_VENDOR_ID_CAVIUM, id, pdev);
+		if (!pdev)
+			break;
+		start = pci_resource_start(pdev, 0);
+		uncore->pdevs[node].pdev = pdev;
+		uncore->pdevs[node].base = start;
+		uncore->pdevs[node].map = ioremap(start + offset, size);
+		node++;
+		if (node >= MAX_NR_UNCORE_PDEVS) {
+			pr_err("reached pdev limit\n");
+			break;
+		}
+	} while (1);
+
+	if (!node)
+		return -ENODEV;
+
+	uncore->nr_units = node;
+
+	ret = perf_pmu_register(pmu, pmu->name, -1);
+	if (ret)
+		goto fail;
+
+	uncore->pmu = pmu;
+	return 0;
+
+fail:
+	for (node = 0; node < MAX_NR_UNCORE_PDEVS; node++) {
+		pdev = uncore->pdevs[node].pdev;
+		if (!pdev)
+			break;
+		iounmap(uncore->pdevs[node].map);
+		pci_dev_put(pdev);
+	}
+	return ret;
+}
+
+static int __init thunder_uncore_init(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	unsigned long part_number = read_cpuid_part_number();
+	u32 variant;
+
+	if (implementor != ARM_CPU_IMP_CAVIUM ||
+	    part_number != CAVIUM_CPU_PART_THUNDERX)
+		return -ENODEV;
+
+	/* detect pass2 which contains different counters */
+	variant = MIDR_VARIANT(read_cpuid_id());
+	if (variant == 1)
+		thunder_uncore_version = 1;
+	pr_info("PMU version: %d\n", thunder_uncore_version);
+
+	return 0;
+}
+late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
new file mode 100644
index 0000000..acd121d
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -0,0 +1,73 @@
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#undef pr_fmt
+#define pr_fmt(fmt)     "thunderx_uncore: " fmt
+
+enum uncore_type {
+	NOP_TYPE,
+};
+
+extern int thunder_uncore_version;
+
+#define MAX_NR_UNCORE_PDEVS		16
+
+/* maximum number of parallel hardware counters for all uncore parts */
+#define MAX_COUNTERS			64
+
+/* generic uncore struct for different pmu types */
+struct thunder_uncore {
+	int num_counters;
+	int nr_units;
+	int type;
+	struct pmu *pmu;
+	int (*event_valid)(u64);
+	struct {
+		unsigned long base;
+		void __iomem *map;
+		struct pci_dev *pdev;
+	} pdevs[MAX_NR_UNCORE_PDEVS];
+	struct perf_event *events[MAX_COUNTERS];
+};
+
+#define EVENT_PTR(_id) (&event_attr_##_id.attr.attr)
+
+#define EVENT_ATTR(_name, _val)						   \
+static struct perf_pmu_events_attr event_attr_##_name = {		   \
+	.attr	   = __ATTR(_name, 0444, thunder_events_sysfs_show, NULL), \
+	.event_str = "event=" __stringify(_val),			   \
+};
+
+#define EVENT_ATTR_STR(_name, _str)					   \
+static struct perf_pmu_events_attr event_attr_##_name = {		   \
+	.attr	   = __ATTR(_name, 0444, thunder_events_sysfs_show, NULL), \
+	.event_str = _str,						   \
+};
+
+static inline void __iomem *map_offset(unsigned long addr,
+				struct thunder_uncore *uncore, int unit)
+{
+	return (void __iomem *) (addr + uncore->pdevs[unit].map);
+}
+
+extern struct attribute_group thunder_uncore_attr_group;
+
+/* Prototypes */
+struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
+void thunder_uncore_del(struct perf_event *event, int flags);
+int thunder_uncore_event_init(struct perf_event *event);
+void thunder_uncore_read(struct perf_event *event);
+int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
+			 unsigned long offset, unsigned long size,
+			 struct pmu *pmu);
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page);
-- 
1.9.1


* [RFC PATCH 2/7] arm64/perf: Cavium ThunderX L2C TAD uncore support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 3/7] arm64/perf: Cavium ThunderX L2C CBC " Jan Glauber
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support the counters of the L2 cache tag-and-data (TAD) units.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/uncore/Makefile                |   3 +-
 arch/arm64/kernel/uncore/uncore_cavium.c         |   6 +-
 arch/arm64/kernel/uncore/uncore_cavium.h         |   6 +-
 arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c | 600 +++++++++++++++++++++++
 4 files changed, 612 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c

diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
index b9c72c2..6a16caf 100644
--- a/arch/arm64/kernel/uncore/Makefile
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -1 +1,2 @@
-obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o
+obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o		\
+			      uncore_cavium_l2c_tad.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
index 0cfcc83..b625caf 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.c
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -22,7 +22,10 @@ int thunder_uncore_version;
 
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
 {
-	return NULL;
+	if (event->pmu->type == thunder_l2c_tad_pmu.type)
+		return thunder_uncore_l2c_tad;
+	else
+		return NULL;
 }
 
 void thunder_uncore_read(struct perf_event *event)
@@ -205,6 +208,7 @@ static int __init thunder_uncore_init(void)
 		thunder_uncore_version = 1;
 	pr_info("PMU version: %d\n", thunder_uncore_version);
 
+	thunder_uncore_l2c_tad_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
index acd121d..90e6a2d 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.h
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -13,7 +13,7 @@
 #define pr_fmt(fmt)     "thunderx_uncore: " fmt
 
 enum uncore_type {
-	NOP_TYPE,
+	L2C_TAD_TYPE,
 };
 
 extern int thunder_uncore_version;
@@ -59,6 +59,8 @@ static inline void __iomem *map_offset(unsigned long addr,
 }
 
 extern struct attribute_group thunder_uncore_attr_group;
+extern struct thunder_uncore *thunder_uncore_l2c_tad;
+extern struct pmu thunder_l2c_tad_pmu;
 
 /* Prototypes */
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
@@ -71,3 +73,5 @@ int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
 ssize_t thunder_events_sysfs_show(struct device *dev,
 				  struct device_attribute *attr,
 				  char *page);
+
+int thunder_uncore_l2c_tad_setup(void);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c b/arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c
new file mode 100644
index 0000000..bf45b4a
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium_l2c_tad.c
@@ -0,0 +1,600 @@
+/*
+ * Cavium Thunder uncore PMU support, L2C TAD counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+#ifndef PCI_DEVICE_ID_THUNDER_L2C_TAD
+#define PCI_DEVICE_ID_THUNDER_L2C_TAD	0xa02e
+#endif
+
+#define L2C_TAD_NR_COUNTERS             4
+#define L2C_TAD_CONTROL_OFFSET		0x10000
+#define L2C_TAD_COUNTER_OFFSET		0x100
+
+/* L2C TAD event list */
+#define L2C_TAD_EVENTS_DISABLED		0x00
+
+#define L2C_TAD_EVENT_L2T_HIT		0x01
+#define L2C_TAD_EVENT_L2T_MISS		0x02
+#define L2C_TAD_EVENT_L2T_NOALLOC	0x03
+#define L2C_TAD_EVENT_L2_VIC		0x04
+#define L2C_TAD_EVENT_SC_FAIL		0x05
+#define L2C_TAD_EVENT_SC_PASS		0x06
+#define L2C_TAD_EVENT_LFB_OCC		0x07
+#define L2C_TAD_EVENT_WAIT_LFB		0x08
+#define L2C_TAD_EVENT_WAIT_VAB		0x09
+
+#define L2C_TAD_EVENT_RTG_HIT		0x41
+#define L2C_TAD_EVENT_RTG_MISS		0x42
+#define L2C_TAD_EVENT_L2_RTG_VIC	0x44
+#define L2C_TAD_EVENT_L2_OPEN_OCI	0x48
+
+#define L2C_TAD_EVENT_QD0_IDX		0x80
+#define L2C_TAD_EVENT_QD0_RDAT		0x81
+#define L2C_TAD_EVENT_QD0_BNKS		0x82
+#define L2C_TAD_EVENT_QD0_WDAT		0x83
+
+#define L2C_TAD_EVENT_QD1_IDX		0x90
+#define L2C_TAD_EVENT_QD1_RDAT		0x91
+#define L2C_TAD_EVENT_QD1_BNKS		0x92
+#define L2C_TAD_EVENT_QD1_WDAT		0x93
+
+#define L2C_TAD_EVENT_QD2_IDX		0xa0
+#define L2C_TAD_EVENT_QD2_RDAT		0xa1
+#define L2C_TAD_EVENT_QD2_BNKS		0xa2
+#define L2C_TAD_EVENT_QD2_WDAT		0xa3
+
+#define L2C_TAD_EVENT_QD3_IDX		0xb0
+#define L2C_TAD_EVENT_QD3_RDAT		0xb1
+#define L2C_TAD_EVENT_QD3_BNKS		0xb2
+#define L2C_TAD_EVENT_QD3_WDAT		0xb3
+
+#define L2C_TAD_EVENT_QD4_IDX		0xc0
+#define L2C_TAD_EVENT_QD4_RDAT		0xc1
+#define L2C_TAD_EVENT_QD4_BNKS		0xc2
+#define L2C_TAD_EVENT_QD4_WDAT		0xc3
+
+#define L2C_TAD_EVENT_QD5_IDX		0xd0
+#define L2C_TAD_EVENT_QD5_RDAT		0xd1
+#define L2C_TAD_EVENT_QD5_BNKS		0xd2
+#define L2C_TAD_EVENT_QD5_WDAT		0xd3
+
+#define L2C_TAD_EVENT_QD6_IDX		0xe0
+#define L2C_TAD_EVENT_QD6_RDAT		0xe1
+#define L2C_TAD_EVENT_QD6_BNKS		0xe2
+#define L2C_TAD_EVENT_QD6_WDAT		0xe3
+
+#define L2C_TAD_EVENT_QD7_IDX		0xf0
+#define L2C_TAD_EVENT_QD7_RDAT		0xf1
+#define L2C_TAD_EVENT_QD7_BNKS		0xf2
+#define L2C_TAD_EVENT_QD7_WDAT		0xf3
+
+/* pass2 added/changed event list */
+#define L2C_TAD_EVENT_OPEN_CCPI			0x0a
+#define L2C_TAD_EVENT_LOOKUP			0x40
+#define L2C_TAD_EVENT_LOOKUP_XMC_LCL		0x41
+#define L2C_TAD_EVENT_LOOKUP_XMC_RMT		0x42
+#define L2C_TAD_EVENT_LOOKUP_MIB		0x43
+#define L2C_TAD_EVENT_LOOKUP_ALL		0x44
+#define L2C_TAD_EVENT_TAG_ALC_HIT		0x48
+#define L2C_TAD_EVENT_TAG_ALC_MISS		0x49
+#define L2C_TAD_EVENT_TAG_ALC_NALC		0x4a
+#define L2C_TAD_EVENT_TAG_NALC_HIT		0x4b
+#define L2C_TAD_EVENT_TAG_NALC_MISS		0x4c
+#define L2C_TAD_EVENT_LMC_WR			0x4e
+#define L2C_TAD_EVENT_LMC_SBLKDTY		0x4f
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HIT		0x50
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HITE		0x51
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HITS		0x52
+#define L2C_TAD_EVENT_TAG_ALC_RTG_MISS		0x53
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HIT		0x54
+#define L2C_TAD_EVENT_TAG_NALC_RTG_MISS		0x55
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HITE		0x56
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HITS		0x57
+#define L2C_TAD_EVENT_TAG_ALC_LCL_EVICT		0x58
+#define L2C_TAD_EVENT_TAG_ALC_LCL_CLNVIC	0x59
+#define L2C_TAD_EVENT_TAG_ALC_LCL_DTYVIC	0x5a
+#define L2C_TAD_EVENT_TAG_ALC_RMT_EVICT		0x5b
+#define L2C_TAD_EVENT_TAG_ALC_RMT_VIC		0x5c
+#define L2C_TAD_EVENT_RTG_ALC			0x5d
+#define L2C_TAD_EVENT_RTG_ALC_HIT		0x5e
+#define L2C_TAD_EVENT_RTG_ALC_HITWB		0x5f
+#define L2C_TAD_EVENT_STC_TOTAL			0x60
+#define L2C_TAD_EVENT_STC_TOTAL_FAIL		0x61
+#define L2C_TAD_EVENT_STC_RMT			0x62
+#define L2C_TAD_EVENT_STC_RMT_FAIL		0x63
+#define L2C_TAD_EVENT_STC_LCL			0x64
+#define L2C_TAD_EVENT_STC_LCL_FAIL		0x65
+#define L2C_TAD_EVENT_OCI_RTG_WAIT		0x68
+#define L2C_TAD_EVENT_OCI_FWD_CYC_HIT		0x69
+#define L2C_TAD_EVENT_OCI_FWD_RACE		0x6a
+#define L2C_TAD_EVENT_OCI_HAKS			0x6b
+#define L2C_TAD_EVENT_OCI_FLDX_TAG_E_NODAT	0x6c
+#define L2C_TAD_EVENT_OCI_FLDX_TAG_E_DAT	0x6d
+#define L2C_TAD_EVENT_OCI_RLDD			0x6e
+#define L2C_TAD_EVENT_OCI_RLDD_PEMD		0x6f
+#define L2C_TAD_EVENT_OCI_RRQ_DAT_CNT		0x70
+#define L2C_TAD_EVENT_OCI_RRQ_DAT_DMASK		0x71
+#define L2C_TAD_EVENT_OCI_RSP_DAT_CNT		0x72
+#define L2C_TAD_EVENT_OCI_RSP_DAT_DMASK		0x73
+#define L2C_TAD_EVENT_OCI_RSP_DAT_VICD_CNT	0x74
+#define L2C_TAD_EVENT_OCI_RSP_DAT_VICD_DMASK	0x75
+#define L2C_TAD_EVENT_OCI_RTG_ALC_EVICT		0x76
+#define L2C_TAD_EVENT_OCI_RTG_ALC_VIC		0x77
+
+struct thunder_uncore *thunder_uncore_l2c_tad;
+
+static void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev;
+	int i;
+
+	/* restore counter value divided by units into all counters */
+	if (flags & PERF_EF_RELOAD) {
+		prev = local64_read(&hwc->prev_count);
+		prev = prev / uncore->nr_units;
+		for (i = 0; i < uncore->nr_units; i++)
+			writeq(prev, map_offset(hwc->event_base, uncore, i));
+	}
+
+	hwc->state = 0;
+
+	/* write byte in control registers for all units */
+	for (i = 0; i < uncore->nr_units; i++)
+		writeb(hwc->config, map_offset(hwc->config_base, uncore, i));
+
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	/* reset selection value for all units */
+	for (i = 0; i < uncore->nr_units; i++)
+		writeb(L2C_TAD_EVENTS_DISABLED,
+		       map_offset(hwc->config_base, uncore, i));
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	WARN_ON_ONCE(!uncore);
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (uncore->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* if not take the first available counter */
+	hwc->idx = -1;
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (cmpxchg(&uncore->events[i], NULL, event) == NULL) {
+			hwc->idx = i;
+			break;
+		}
+	}
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = hwc->idx;
+	hwc->event_base = L2C_TAD_COUNTER_OFFSET +
+			  hwc->idx * sizeof(unsigned long long);
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-7");
+
+static struct attribute *thunder_l2c_tad_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_tad_format_group = {
+	.name = "format",
+	.attrs = thunder_l2c_tad_format_attr,
+};
+
+EVENT_ATTR(l2t_hit,	L2C_TAD_EVENT_L2T_HIT);
+EVENT_ATTR(l2t_miss,	L2C_TAD_EVENT_L2T_MISS);
+EVENT_ATTR(l2t_noalloc,	L2C_TAD_EVENT_L2T_NOALLOC);
+EVENT_ATTR(l2_vic,	L2C_TAD_EVENT_L2_VIC);
+EVENT_ATTR(sc_fail,	L2C_TAD_EVENT_SC_FAIL);
+EVENT_ATTR(sc_pass,	L2C_TAD_EVENT_SC_PASS);
+EVENT_ATTR(lfb_occ,	L2C_TAD_EVENT_LFB_OCC);
+EVENT_ATTR(wait_lfb,	L2C_TAD_EVENT_WAIT_LFB);
+EVENT_ATTR(wait_vab,	L2C_TAD_EVENT_WAIT_VAB);
+EVENT_ATTR(rtg_hit,	L2C_TAD_EVENT_RTG_HIT);
+EVENT_ATTR(rtg_miss,	L2C_TAD_EVENT_RTG_MISS);
+EVENT_ATTR(l2_rtg_vic,	L2C_TAD_EVENT_L2_RTG_VIC);
+EVENT_ATTR(l2_open_oci,	L2C_TAD_EVENT_L2_OPEN_OCI);
+
+EVENT_ATTR(qd0_idx,	L2C_TAD_EVENT_QD0_IDX);
+EVENT_ATTR(qd0_rdat,	L2C_TAD_EVENT_QD0_RDAT);
+EVENT_ATTR(qd0_bnks,	L2C_TAD_EVENT_QD0_BNKS);
+EVENT_ATTR(qd0_wdat,	L2C_TAD_EVENT_QD0_WDAT);
+
+EVENT_ATTR(qd1_idx,	L2C_TAD_EVENT_QD1_IDX);
+EVENT_ATTR(qd1_rdat,	L2C_TAD_EVENT_QD1_RDAT);
+EVENT_ATTR(qd1_bnks,	L2C_TAD_EVENT_QD1_BNKS);
+EVENT_ATTR(qd1_wdat,	L2C_TAD_EVENT_QD1_WDAT);
+
+EVENT_ATTR(qd2_idx,	L2C_TAD_EVENT_QD2_IDX);
+EVENT_ATTR(qd2_rdat,	L2C_TAD_EVENT_QD2_RDAT);
+EVENT_ATTR(qd2_bnks,	L2C_TAD_EVENT_QD2_BNKS);
+EVENT_ATTR(qd2_wdat,	L2C_TAD_EVENT_QD2_WDAT);
+
+EVENT_ATTR(qd3_idx,	L2C_TAD_EVENT_QD3_IDX);
+EVENT_ATTR(qd3_rdat,	L2C_TAD_EVENT_QD3_RDAT);
+EVENT_ATTR(qd3_bnks,	L2C_TAD_EVENT_QD3_BNKS);
+EVENT_ATTR(qd3_wdat,	L2C_TAD_EVENT_QD3_WDAT);
+
+EVENT_ATTR(qd4_idx,	L2C_TAD_EVENT_QD4_IDX);
+EVENT_ATTR(qd4_rdat,	L2C_TAD_EVENT_QD4_RDAT);
+EVENT_ATTR(qd4_bnks,	L2C_TAD_EVENT_QD4_BNKS);
+EVENT_ATTR(qd4_wdat,	L2C_TAD_EVENT_QD4_WDAT);
+
+EVENT_ATTR(qd5_idx,	L2C_TAD_EVENT_QD5_IDX);
+EVENT_ATTR(qd5_rdat,	L2C_TAD_EVENT_QD5_RDAT);
+EVENT_ATTR(qd5_bnks,	L2C_TAD_EVENT_QD5_BNKS);
+EVENT_ATTR(qd5_wdat,	L2C_TAD_EVENT_QD5_WDAT);
+
+EVENT_ATTR(qd6_idx,	L2C_TAD_EVENT_QD6_IDX);
+EVENT_ATTR(qd6_rdat,	L2C_TAD_EVENT_QD6_RDAT);
+EVENT_ATTR(qd6_bnks,	L2C_TAD_EVENT_QD6_BNKS);
+EVENT_ATTR(qd6_wdat,	L2C_TAD_EVENT_QD6_WDAT);
+
+EVENT_ATTR(qd7_idx,	L2C_TAD_EVENT_QD7_IDX);
+EVENT_ATTR(qd7_rdat,	L2C_TAD_EVENT_QD7_RDAT);
+EVENT_ATTR(qd7_bnks,	L2C_TAD_EVENT_QD7_BNKS);
+EVENT_ATTR(qd7_wdat,	L2C_TAD_EVENT_QD7_WDAT);
+
+static struct attribute *thunder_l2c_tad_events_attr[] = {
+	EVENT_PTR(l2t_hit),
+	EVENT_PTR(l2t_miss),
+	EVENT_PTR(l2t_noalloc),
+	EVENT_PTR(l2_vic),
+	EVENT_PTR(sc_fail),
+	EVENT_PTR(sc_pass),
+	EVENT_PTR(lfb_occ),
+	EVENT_PTR(wait_lfb),
+	EVENT_PTR(wait_vab),
+	EVENT_PTR(rtg_hit),
+	EVENT_PTR(rtg_miss),
+	EVENT_PTR(l2_rtg_vic),
+	EVENT_PTR(l2_open_oci),
+
+	EVENT_PTR(qd0_idx),
+	EVENT_PTR(qd0_rdat),
+	EVENT_PTR(qd0_bnks),
+	EVENT_PTR(qd0_wdat),
+
+	EVENT_PTR(qd1_idx),
+	EVENT_PTR(qd1_rdat),
+	EVENT_PTR(qd1_bnks),
+	EVENT_PTR(qd1_wdat),
+
+	EVENT_PTR(qd2_idx),
+	EVENT_PTR(qd2_rdat),
+	EVENT_PTR(qd2_bnks),
+	EVENT_PTR(qd2_wdat),
+
+	EVENT_PTR(qd3_idx),
+	EVENT_PTR(qd3_rdat),
+	EVENT_PTR(qd3_bnks),
+	EVENT_PTR(qd3_wdat),
+
+	EVENT_PTR(qd4_idx),
+	EVENT_PTR(qd4_rdat),
+	EVENT_PTR(qd4_bnks),
+	EVENT_PTR(qd4_wdat),
+
+	EVENT_PTR(qd5_idx),
+	EVENT_PTR(qd5_rdat),
+	EVENT_PTR(qd5_bnks),
+	EVENT_PTR(qd5_wdat),
+
+	EVENT_PTR(qd6_idx),
+	EVENT_PTR(qd6_rdat),
+	EVENT_PTR(qd6_bnks),
+	EVENT_PTR(qd6_wdat),
+
+	EVENT_PTR(qd7_idx),
+	EVENT_PTR(qd7_rdat),
+	EVENT_PTR(qd7_bnks),
+	EVENT_PTR(qd7_wdat),
+	NULL,
+};
+
+/* pass2 added/changed events */
+EVENT_ATTR(open_ccpi,		L2C_TAD_EVENT_OPEN_CCPI);
+EVENT_ATTR(lookup,		L2C_TAD_EVENT_LOOKUP);
+EVENT_ATTR(lookup_xmc_lcl,	L2C_TAD_EVENT_LOOKUP_XMC_LCL);
+EVENT_ATTR(lookup_xmc_rmt,	L2C_TAD_EVENT_LOOKUP_XMC_RMT);
+EVENT_ATTR(lookup_mib,		L2C_TAD_EVENT_LOOKUP_MIB);
+EVENT_ATTR(lookup_all,		L2C_TAD_EVENT_LOOKUP_ALL);
+
+EVENT_ATTR(tag_alc_hit,		L2C_TAD_EVENT_TAG_ALC_HIT);
+EVENT_ATTR(tag_alc_miss,	L2C_TAD_EVENT_TAG_ALC_MISS);
+EVENT_ATTR(tag_alc_nalc,	L2C_TAD_EVENT_TAG_ALC_NALC);
+EVENT_ATTR(tag_nalc_hit,	L2C_TAD_EVENT_TAG_NALC_HIT);
+EVENT_ATTR(tag_nalc_miss,	L2C_TAD_EVENT_TAG_NALC_MISS);
+
+EVENT_ATTR(lmc_wr,		L2C_TAD_EVENT_LMC_WR);
+EVENT_ATTR(lmc_sblkdty,		L2C_TAD_EVENT_LMC_SBLKDTY);
+
+EVENT_ATTR(tag_alc_rtg_hit,	L2C_TAD_EVENT_TAG_ALC_RTG_HIT);
+EVENT_ATTR(tag_alc_rtg_hite,	L2C_TAD_EVENT_TAG_ALC_RTG_HITE);
+EVENT_ATTR(tag_alc_rtg_hits,	L2C_TAD_EVENT_TAG_ALC_RTG_HITS);
+EVENT_ATTR(tag_alc_rtg_miss,	L2C_TAD_EVENT_TAG_ALC_RTG_MISS);
+EVENT_ATTR(tag_alc_nalc_rtg_hit, L2C_TAD_EVENT_TAG_NALC_RTG_HIT);
+EVENT_ATTR(tag_nalc_rtg_miss,	L2C_TAD_EVENT_TAG_NALC_RTG_MISS);
+EVENT_ATTR(tag_nalc_rtg_hite,	L2C_TAD_EVENT_TAG_NALC_RTG_HITE);
+EVENT_ATTR(tag_nalc_rtg_hits,	L2C_TAD_EVENT_TAG_NALC_RTG_HITS);
+EVENT_ATTR(tag_alc_lcl_evict,	L2C_TAD_EVENT_TAG_ALC_LCL_EVICT);
+EVENT_ATTR(tag_alc_lcl_clnvic,	L2C_TAD_EVENT_TAG_ALC_LCL_CLNVIC);
+EVENT_ATTR(tag_alc_lcl_dtyvic,	L2C_TAD_EVENT_TAG_ALC_LCL_DTYVIC);
+EVENT_ATTR(tag_alc_rmt_evict,	L2C_TAD_EVENT_TAG_ALC_RMT_EVICT);
+EVENT_ATTR(tag_alc_rmt_vic,	L2C_TAD_EVENT_TAG_ALC_RMT_VIC);
+
+EVENT_ATTR(rtg_alc,		L2C_TAD_EVENT_RTG_ALC);
+EVENT_ATTR(rtg_alc_hit,		L2C_TAD_EVENT_RTG_ALC_HIT);
+EVENT_ATTR(rtg_alc_hitwb,	L2C_TAD_EVENT_RTG_ALC_HITWB);
+
+EVENT_ATTR(stc_total,		L2C_TAD_EVENT_STC_TOTAL);
+EVENT_ATTR(stc_total_fail,	L2C_TAD_EVENT_STC_TOTAL_FAIL);
+EVENT_ATTR(stc_rmt,		L2C_TAD_EVENT_STC_RMT);
+EVENT_ATTR(stc_rmt_fail,	L2C_TAD_EVENT_STC_RMT_FAIL);
+EVENT_ATTR(stc_lcl,		L2C_TAD_EVENT_STC_LCL);
+EVENT_ATTR(stc_lcl_fail,	L2C_TAD_EVENT_STC_LCL_FAIL);
+
+EVENT_ATTR(oci_rtg_wait,	L2C_TAD_EVENT_OCI_RTG_WAIT);
+EVENT_ATTR(oci_fwd_cyc_hit,	L2C_TAD_EVENT_OCI_FWD_CYC_HIT);
+EVENT_ATTR(oci_fwd_race,	L2C_TAD_EVENT_OCI_FWD_RACE);
+EVENT_ATTR(oci_haks,		L2C_TAD_EVENT_OCI_HAKS);
+EVENT_ATTR(oci_fldx_tag_e_nodat, L2C_TAD_EVENT_OCI_FLDX_TAG_E_NODAT);
+EVENT_ATTR(oci_fldx_tag_e_dat,	L2C_TAD_EVENT_OCI_FLDX_TAG_E_DAT);
+EVENT_ATTR(oci_rldd,		L2C_TAD_EVENT_OCI_RLDD);
+EVENT_ATTR(oci_rldd_pemd,	L2C_TAD_EVENT_OCI_RLDD_PEMD);
+EVENT_ATTR(oci_rrq_dat_cnt,	L2C_TAD_EVENT_OCI_RRQ_DAT_CNT);
+EVENT_ATTR(oci_rrq_dat_dmask,	L2C_TAD_EVENT_OCI_RRQ_DAT_DMASK);
+EVENT_ATTR(oci_rsp_dat_cnt,	L2C_TAD_EVENT_OCI_RSP_DAT_CNT);
+EVENT_ATTR(oci_rsp_dat_dmask,	L2C_TAD_EVENT_OCI_RSP_DAT_DMASK);
+EVENT_ATTR(oci_rsp_dat_vicd_cnt, L2C_TAD_EVENT_OCI_RSP_DAT_VICD_CNT);
+EVENT_ATTR(oci_rsp_dat_vicd_dmask, L2C_TAD_EVENT_OCI_RSP_DAT_VICD_DMASK);
+EVENT_ATTR(oci_rtg_alc_evict,	L2C_TAD_EVENT_OCI_RTG_ALC_EVICT);
+EVENT_ATTR(oci_rtg_alc_vic,	L2C_TAD_EVENT_OCI_RTG_ALC_VIC);
+
+static struct attribute *thunder_l2c_tad_pass2_events_attr[] = {
+	EVENT_PTR(l2t_hit),
+	EVENT_PTR(l2t_miss),
+	EVENT_PTR(l2t_noalloc),
+	EVENT_PTR(l2_vic),
+	EVENT_PTR(sc_fail),
+	EVENT_PTR(sc_pass),
+	EVENT_PTR(lfb_occ),
+	EVENT_PTR(wait_lfb),
+	EVENT_PTR(wait_vab),
+	EVENT_PTR(open_ccpi),
+
+	EVENT_PTR(lookup),
+	EVENT_PTR(lookup_xmc_lcl),
+	EVENT_PTR(lookup_xmc_rmt),
+	EVENT_PTR(lookup_mib),
+	EVENT_PTR(lookup_all),
+
+	EVENT_PTR(tag_alc_hit),
+	EVENT_PTR(tag_alc_miss),
+	EVENT_PTR(tag_alc_nalc),
+	EVENT_PTR(tag_nalc_hit),
+	EVENT_PTR(tag_nalc_miss),
+
+	EVENT_PTR(lmc_wr),
+	EVENT_PTR(lmc_sblkdty),
+
+	EVENT_PTR(tag_alc_rtg_hit),
+	EVENT_PTR(tag_alc_rtg_hite),
+	EVENT_PTR(tag_alc_rtg_hits),
+	EVENT_PTR(tag_alc_rtg_miss),
+	EVENT_PTR(tag_alc_nalc_rtg_hit),
+	EVENT_PTR(tag_nalc_rtg_miss),
+	EVENT_PTR(tag_nalc_rtg_hite),
+	EVENT_PTR(tag_nalc_rtg_hits),
+	EVENT_PTR(tag_alc_lcl_evict),
+	EVENT_PTR(tag_alc_lcl_clnvic),
+	EVENT_PTR(tag_alc_lcl_dtyvic),
+	EVENT_PTR(tag_alc_rmt_evict),
+	EVENT_PTR(tag_alc_rmt_vic),
+
+	EVENT_PTR(rtg_alc),
+	EVENT_PTR(rtg_alc_hit),
+	EVENT_PTR(rtg_alc_hitwb),
+
+	EVENT_PTR(stc_total),
+	EVENT_PTR(stc_total_fail),
+	EVENT_PTR(stc_rmt),
+	EVENT_PTR(stc_rmt_fail),
+	EVENT_PTR(stc_lcl),
+	EVENT_PTR(stc_lcl_fail),
+
+	EVENT_PTR(oci_rtg_wait),
+	EVENT_PTR(oci_fwd_cyc_hit),
+	EVENT_PTR(oci_fwd_race),
+	EVENT_PTR(oci_haks),
+	EVENT_PTR(oci_fldx_tag_e_nodat),
+	EVENT_PTR(oci_fldx_tag_e_dat),
+	EVENT_PTR(oci_rldd),
+	EVENT_PTR(oci_rldd_pemd),
+	EVENT_PTR(oci_rrq_dat_cnt),
+	EVENT_PTR(oci_rrq_dat_dmask),
+	EVENT_PTR(oci_rsp_dat_cnt),
+	EVENT_PTR(oci_rsp_dat_dmask),
+	EVENT_PTR(oci_rsp_dat_vicd_cnt),
+	EVENT_PTR(oci_rsp_dat_vicd_dmask),
+	EVENT_PTR(oci_rtg_alc_evict),
+	EVENT_PTR(oci_rtg_alc_vic),
+
+	EVENT_PTR(qd0_idx),
+	EVENT_PTR(qd0_rdat),
+	EVENT_PTR(qd0_bnks),
+	EVENT_PTR(qd0_wdat),
+
+	EVENT_PTR(qd1_idx),
+	EVENT_PTR(qd1_rdat),
+	EVENT_PTR(qd1_bnks),
+	EVENT_PTR(qd1_wdat),
+
+	EVENT_PTR(qd2_idx),
+	EVENT_PTR(qd2_rdat),
+	EVENT_PTR(qd2_bnks),
+	EVENT_PTR(qd2_wdat),
+
+	EVENT_PTR(qd3_idx),
+	EVENT_PTR(qd3_rdat),
+	EVENT_PTR(qd3_bnks),
+	EVENT_PTR(qd3_wdat),
+
+	EVENT_PTR(qd4_idx),
+	EVENT_PTR(qd4_rdat),
+	EVENT_PTR(qd4_bnks),
+	EVENT_PTR(qd4_wdat),
+
+	EVENT_PTR(qd5_idx),
+	EVENT_PTR(qd5_rdat),
+	EVENT_PTR(qd5_bnks),
+	EVENT_PTR(qd5_wdat),
+
+	EVENT_PTR(qd6_idx),
+	EVENT_PTR(qd6_rdat),
+	EVENT_PTR(qd6_bnks),
+	EVENT_PTR(qd6_wdat),
+
+	EVENT_PTR(qd7_idx),
+	EVENT_PTR(qd7_rdat),
+	EVENT_PTR(qd7_bnks),
+	EVENT_PTR(qd7_wdat),
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_tad_events_group = {
+	.name = "events",
+	.attrs = NULL,
+};
+
+static const struct attribute_group *thunder_l2c_tad_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_l2c_tad_format_group,
+	&thunder_l2c_tad_events_group,
+	NULL,
+};
+
+struct pmu thunder_l2c_tad_pmu = {
+	.attr_groups	= thunder_l2c_tad_attr_groups,
+	.name		= "thunder_l2c_tad",
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+};
+
+static int event_valid(u64 config)
+{
+	if ((config > 0 && config <= L2C_TAD_EVENT_WAIT_VAB) ||
+	    config == L2C_TAD_EVENT_RTG_HIT ||
+	    config == L2C_TAD_EVENT_RTG_MISS ||
+	    config == L2C_TAD_EVENT_L2_RTG_VIC ||
+	    config == L2C_TAD_EVENT_L2_OPEN_OCI ||
+	    ((config & 0x80) && ((config & 0xf) <= 3)))
+		return 1;
+
+	if (thunder_uncore_version == 1)
+		if (config == L2C_TAD_EVENT_OPEN_CCPI ||
+		    (config >= L2C_TAD_EVENT_LOOKUP &&
+		     config <= L2C_TAD_EVENT_LOOKUP_ALL) ||
+		    (config >= L2C_TAD_EVENT_TAG_ALC_HIT &&
+		     config <= L2C_TAD_EVENT_OCI_RTG_ALC_VIC &&
+		     /* skip the holes in the pass 2 event numbering */
+		     config != 0x4d &&
+		     config != 0x66 &&
+		     config != 0x67))
+			return 1;
+
+	return 0;
+}
+
+int __init thunder_uncore_l2c_tad_setup(void)
+{
+	int ret;
+
+	thunder_uncore_l2c_tad = kzalloc(sizeof(struct thunder_uncore),
+					 GFP_KERNEL);
+	if (!thunder_uncore_l2c_tad) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	if (thunder_uncore_version == 0)
+		thunder_l2c_tad_events_group.attrs = thunder_l2c_tad_events_attr;
+	else /* default */
+		thunder_l2c_tad_events_group.attrs = thunder_l2c_tad_pass2_events_attr;
+
+	ret = thunder_uncore_setup(thunder_uncore_l2c_tad,
+			   PCI_DEVICE_ID_THUNDER_L2C_TAD,
+			   L2C_TAD_CONTROL_OFFSET,
+			   L2C_TAD_COUNTER_OFFSET + L2C_TAD_NR_COUNTERS
+				* sizeof(unsigned long long),
+			   &thunder_l2c_tad_pmu);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_l2c_tad->type = L2C_TAD_TYPE;
+	thunder_uncore_l2c_tad->num_counters = L2C_TAD_NR_COUNTERS;
+	thunder_uncore_l2c_tad->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_l2c_tad);
+fail_nomem:
+	return ret;
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 3/7] arm64/perf: Cavium ThunderX L2C CBC uncore support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 2/7] arm64/perf: Cavium ThunderX L2C TAD uncore support Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 4/7] arm64/perf: Cavium ThunderX LMC " Jan Glauber
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support the counters of the L2 cache crossbar connect (CBC).

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/uncore/Makefile                |   3 +-
 arch/arm64/kernel/uncore/uncore_cavium.c         |   3 +
 arch/arm64/kernel/uncore/uncore_cavium.h         |   4 +
 arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c | 239 +++++++++++++++++++++++
 4 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c

diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
index 6a16caf..d52ecc9 100644
--- a/arch/arm64/kernel/uncore/Makefile
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o		\
-			      uncore_cavium_l2c_tad.o
+			      uncore_cavium_l2c_tad.o	\
+			      uncore_cavium_l2c_cbc.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
index b625caf..0304c60 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.c
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -24,6 +24,8 @@ struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
 {
 	if (event->pmu->type == thunder_l2c_tad_pmu.type)
 		return thunder_uncore_l2c_tad;
+	else if (event->pmu->type == thunder_l2c_cbc_pmu.type)
+		return thunder_uncore_l2c_cbc;
 	else
 		return NULL;
 }
@@ -209,6 +211,7 @@ static int __init thunder_uncore_init(void)
 	pr_info("PMU version: %d\n", thunder_uncore_version);
 
 	thunder_uncore_l2c_tad_setup();
+	thunder_uncore_l2c_cbc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
index 90e6a2d..74f44d7 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.h
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -14,6 +14,7 @@
 
 enum uncore_type {
 	L2C_TAD_TYPE,
+	L2C_CBC_TYPE,
 };
 
 extern int thunder_uncore_version;
@@ -60,7 +61,9 @@ static inline void __iomem *map_offset(unsigned long addr,
 
 extern struct attribute_group thunder_uncore_attr_group;
 extern struct thunder_uncore *thunder_uncore_l2c_tad;
+extern struct thunder_uncore *thunder_uncore_l2c_cbc;
 extern struct pmu thunder_l2c_tad_pmu;
+extern struct pmu thunder_l2c_cbc_pmu;
 
 /* Prototypes */
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
@@ -75,3 +78,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 				  char *page);
 
 int thunder_uncore_l2c_tad_setup(void);
+int thunder_uncore_l2c_cbc_setup(void);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c b/arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c
new file mode 100644
index 0000000..f1ba9be
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium_l2c_cbc.c
@@ -0,0 +1,239 @@
+/*
+ * Cavium Thunder uncore PMU support, L2C CBC counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+#ifndef PCI_DEVICE_ID_THUNDER_L2C_CBC
+#define PCI_DEVICE_ID_THUNDER_L2C_CBC	0xa02f
+#endif
+
+#define L2C_CBC_NR_COUNTERS             16
+
+/* L2C CBC event list */
+#define L2C_CBC_EVENT_XMC0		0x00
+#define L2C_CBC_EVENT_XMD0		0x01
+#define L2C_CBC_EVENT_RSC0		0x02
+#define L2C_CBC_EVENT_RSD0		0x03
+#define L2C_CBC_EVENT_INV0		0x04
+#define L2C_CBC_EVENT_IOC0		0x05
+#define L2C_CBC_EVENT_IOR0		0x06
+
+#define L2C_CBC_EVENT_XMC1		0x08	/* 0x40 */
+#define L2C_CBC_EVENT_XMD1		0x09
+#define L2C_CBC_EVENT_RSC1		0x0a
+#define L2C_CBC_EVENT_RSD1		0x0b
+#define L2C_CBC_EVENT_INV1		0x0c
+
+#define L2C_CBC_EVENT_XMC2		0x10	/* 0x80 */
+#define L2C_CBC_EVENT_XMD2		0x11
+#define L2C_CBC_EVENT_RSC2		0x12
+#define L2C_CBC_EVENT_RSD2		0x13
+
+struct thunder_uncore *thunder_uncore_l2c_cbc;
+
+static const int l2c_cbc_events[L2C_CBC_NR_COUNTERS] = {
+	0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06,
+	0x08, 0x09, 0x0a, 0x0b, 0x0c,
+	0x10, 0x11, 0x12, 0x13
+};
+
+static void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev;
+	int i;
+
+	/* restore counter value divided by units into all counters */
+	if (flags & PERF_EF_RELOAD) {
+		prev = local64_read(&hwc->prev_count);
+		prev = prev / uncore->nr_units;
+		for (i = 0; i < uncore->nr_units; i++)
+			writeq(prev, map_offset(hwc->event_base, uncore, i));
+	}
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	WARN_ON_ONCE(!uncore);
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (uncore->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* each counter is tied to one event, so idx must match the counter! */
+	hwc->idx = -1;
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (l2c_cbc_events[i] == hwc->config) {
+			if (cmpxchg(&uncore->events[i], NULL, event) == NULL) {
+				hwc->idx = i;
+				break;
+			}
+		}
+	}
+
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->event_base = hwc->config * sizeof(unsigned long long);
+
+	/* the counter cannot be stopped, so avoid setting PERF_HES_STOPPED */
+	hwc->state = PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, 0);
+
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-4");
+
+static struct attribute *thunder_l2c_cbc_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_cbc_format_group = {
+	.name = "format",
+	.attrs = thunder_l2c_cbc_format_attr,
+};
+
+EVENT_ATTR(xmc0,	L2C_CBC_EVENT_XMC0);
+EVENT_ATTR(xmd0,	L2C_CBC_EVENT_XMD0);
+EVENT_ATTR(rsc0,	L2C_CBC_EVENT_RSC0);
+EVENT_ATTR(rsd0,	L2C_CBC_EVENT_RSD0);
+EVENT_ATTR(inv0,	L2C_CBC_EVENT_INV0);
+EVENT_ATTR(ioc0,	L2C_CBC_EVENT_IOC0);
+EVENT_ATTR(ior0,	L2C_CBC_EVENT_IOR0);
+EVENT_ATTR(xmc1,	L2C_CBC_EVENT_XMC1);
+EVENT_ATTR(xmd1,	L2C_CBC_EVENT_XMD1);
+EVENT_ATTR(rsc1,	L2C_CBC_EVENT_RSC1);
+EVENT_ATTR(rsd1,	L2C_CBC_EVENT_RSD1);
+EVENT_ATTR(inv1,	L2C_CBC_EVENT_INV1);
+EVENT_ATTR(xmc2,	L2C_CBC_EVENT_XMC2);
+EVENT_ATTR(xmd2,	L2C_CBC_EVENT_XMD2);
+EVENT_ATTR(rsc2,	L2C_CBC_EVENT_RSC2);
+EVENT_ATTR(rsd2,	L2C_CBC_EVENT_RSD2);
+
+static struct attribute *thunder_l2c_cbc_events_attr[] = {
+	EVENT_PTR(xmc0),
+	EVENT_PTR(xmd0),
+	EVENT_PTR(rsc0),
+	EVENT_PTR(rsd0),
+	EVENT_PTR(inv0),
+	EVENT_PTR(ioc0),
+	EVENT_PTR(ior0),
+	EVENT_PTR(xmc1),
+	EVENT_PTR(xmd1),
+	EVENT_PTR(rsc1),
+	EVENT_PTR(rsd1),
+	EVENT_PTR(inv1),
+	EVENT_PTR(xmc2),
+	EVENT_PTR(xmd2),
+	EVENT_PTR(rsc2),
+	EVENT_PTR(rsd2),
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_cbc_events_group = {
+	.name = "events",
+	.attrs = thunder_l2c_cbc_events_attr,
+};
+
+static const struct attribute_group *thunder_l2c_cbc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_l2c_cbc_format_group,
+	&thunder_l2c_cbc_events_group,
+	NULL,
+};
+
+struct pmu thunder_l2c_cbc_pmu = {
+	.attr_groups	= thunder_l2c_cbc_attr_groups,
+	.name		= "thunder_l2c_cbc",
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+};
+
+static int event_valid(u64 config)
+{
+	if (config <= L2C_CBC_EVENT_IOR0 ||
+	    (config >= L2C_CBC_EVENT_XMC1 && config <= L2C_CBC_EVENT_INV1) ||
+	    (config >= L2C_CBC_EVENT_XMC2 && config <= L2C_CBC_EVENT_RSD2))
+		return 1;
+	else
+		return 0;
+}
+
+int __init thunder_uncore_l2c_cbc_setup(void)
+{
+	int ret;
+
+	thunder_uncore_l2c_cbc = kzalloc(sizeof(struct thunder_uncore),
+					 GFP_KERNEL);
+	if (!thunder_uncore_l2c_cbc) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	ret = thunder_uncore_setup(thunder_uncore_l2c_cbc,
+				   PCI_DEVICE_ID_THUNDER_L2C_CBC,
+				   0,
+				   0x100,
+				   &thunder_l2c_cbc_pmu);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_l2c_cbc->type = L2C_CBC_TYPE;
+	thunder_uncore_l2c_cbc->num_counters = L2C_CBC_NR_COUNTERS;
+	thunder_uncore_l2c_cbc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_l2c_cbc);
+fail_nomem:
+	return ret;
+}
-- 
1.9.1


* [RFC PATCH 4/7] arm64/perf: Cavium ThunderX LMC uncore support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
                   ` (2 preceding siblings ...)
  2016-02-12 16:55 ` [RFC PATCH 3/7] arm64/perf: Cavium ThunderX L2C CBC " Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 5/7] arm64/perf: Cavium ThunderX OCX LNE " Jan Glauber
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support counters on the DRAM controllers.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/uncore/Makefile            |   3 +-
 arch/arm64/kernel/uncore/uncore_cavium.c     |   3 +
 arch/arm64/kernel/uncore/uncore_cavium.h     |   4 +
 arch/arm64/kernel/uncore/uncore_cavium_lmc.c | 201 +++++++++++++++++++++++++++
 4 files changed, 210 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_lmc.c

diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
index d52ecc9..81479e8 100644
--- a/arch/arm64/kernel/uncore/Makefile
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o		\
 			      uncore_cavium_l2c_tad.o	\
-			      uncore_cavium_l2c_cbc.o
+			      uncore_cavium_l2c_cbc.o	\
+			      uncore_cavium_lmc.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
index 0304c60..a972418 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.c
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -26,6 +26,8 @@ struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
 		return thunder_uncore_l2c_tad;
 	else if (event->pmu->type == thunder_l2c_cbc_pmu.type)
 		return thunder_uncore_l2c_cbc;
+	else if (event->pmu->type == thunder_lmc_pmu.type)
+		return thunder_uncore_lmc;
 	else
 		return NULL;
 }
@@ -212,6 +214,7 @@ static int __init thunder_uncore_init(void)
 
 	thunder_uncore_l2c_tad_setup();
 	thunder_uncore_l2c_cbc_setup();
+	thunder_uncore_lmc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
index 74f44d7..6e3beba 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.h
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -15,6 +15,7 @@
 enum uncore_type {
 	L2C_TAD_TYPE,
 	L2C_CBC_TYPE,
+	LMC_TYPE,
 };
 
 extern int thunder_uncore_version;
@@ -62,8 +63,10 @@ static inline void __iomem *map_offset(unsigned long addr,
 extern struct attribute_group thunder_uncore_attr_group;
 extern struct thunder_uncore *thunder_uncore_l2c_tad;
 extern struct thunder_uncore *thunder_uncore_l2c_cbc;
+extern struct thunder_uncore *thunder_uncore_lmc;
 extern struct pmu thunder_l2c_tad_pmu;
 extern struct pmu thunder_l2c_cbc_pmu;
+extern struct pmu thunder_lmc_pmu;
 
 /* Prototypes */
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
@@ -79,3 +82,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 
 int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
+int thunder_uncore_lmc_setup(void);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium_lmc.c b/arch/arm64/kernel/uncore/uncore_cavium_lmc.c
new file mode 100644
index 0000000..9667819
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium_lmc.c
@@ -0,0 +1,201 @@
+/*
+ * Cavium Thunder uncore PMU support, LMC counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+#ifndef PCI_DEVICE_ID_THUNDER_LMC
+#define PCI_DEVICE_ID_THUNDER_LMC	0xa022
+#endif
+
+#define LMC_NR_COUNTERS			3
+#define LMC_PASS2_NR_COUNTERS		5
+#define LMC_MAX_NR_COUNTERS		LMC_PASS2_NR_COUNTERS
+
+/* LMC event list */
+#define LMC_EVENT_IFB_CNT		0
+#define LMC_EVENT_OPS_CNT		1
+#define LMC_EVENT_DCLK_CNT		2
+
+/* pass 2 added counters */
+#define LMC_EVENT_BANK_CONFLICT1	3
+#define LMC_EVENT_BANK_CONFLICT2	4
+
+/* map from the start of the register block to the end of the last counter */
+#define LMC_COUNTER_START		0
+#define LMC_COUNTER_END			(0x368 + sizeof(unsigned long long))
+
+struct thunder_uncore *thunder_uncore_lmc;
+
+static const int lmc_events[LMC_MAX_NR_COUNTERS] = { 0x1d0, 0x1d8, 0x1e0, 0x360, 0x368 };
+
+static void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	WARN_ON_ONCE(!uncore);
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (uncore->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* each counter is tied to one event, so idx must match the counter! */
+	hwc->idx = -1;
+	if (cmpxchg(&uncore->events[hwc->config], NULL, event) == NULL)
+		hwc->idx = hwc->config;
+
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->event_base = lmc_events[hwc->config];
+	hwc->state = PERF_HES_UPTODATE;
+
+	/* counters are read-only, so avoid PERF_EF_RELOAD */
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, 0);
+
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-2");
+
+static struct attribute *thunder_lmc_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_lmc_format_group = {
+	.name = "format",
+	.attrs = thunder_lmc_format_attr,
+};
+
+EVENT_ATTR(ifb_cnt,		LMC_EVENT_IFB_CNT);
+EVENT_ATTR(ops_cnt,		LMC_EVENT_OPS_CNT);
+EVENT_ATTR(dclk_cnt,		LMC_EVENT_DCLK_CNT);
+EVENT_ATTR(bank_conflict1,	LMC_EVENT_BANK_CONFLICT1);
+EVENT_ATTR(bank_conflict2,	LMC_EVENT_BANK_CONFLICT2);
+
+static struct attribute *thunder_lmc_events_attr[] = {
+	EVENT_PTR(ifb_cnt),
+	EVENT_PTR(ops_cnt),
+	EVENT_PTR(dclk_cnt),
+	NULL,
+};
+
+static struct attribute *thunder_lmc_pass2_events_attr[] = {
+	EVENT_PTR(ifb_cnt),
+	EVENT_PTR(ops_cnt),
+	EVENT_PTR(dclk_cnt),
+	EVENT_PTR(bank_conflict1),
+	EVENT_PTR(bank_conflict2),
+	NULL,
+};
+
+static struct attribute_group thunder_lmc_events_group = {
+	.name = "events",
+	.attrs = NULL,
+};
+
+static const struct attribute_group *thunder_lmc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_lmc_format_group,
+	&thunder_lmc_events_group,
+	NULL,
+};
+
+struct pmu thunder_lmc_pmu = {
+	.attr_groups	= thunder_lmc_attr_groups,
+	.name		= "thunder_lmc",
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+};
+
+static int event_valid(u64 config)
+{
+	if (config <= LMC_EVENT_DCLK_CNT)
+		return 1;
+
+	if (thunder_uncore_version == 1)
+		if (config == LMC_EVENT_BANK_CONFLICT1 ||
+		    config == LMC_EVENT_BANK_CONFLICT2)
+			return 1;
+	return 0;
+}
+
+int __init thunder_uncore_lmc_setup(void)
+{
+	int ret;
+
+	thunder_uncore_lmc = kzalloc(sizeof(struct thunder_uncore), GFP_KERNEL);
+	if (!thunder_uncore_lmc) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	thunder_lmc_events_group.attrs = (thunder_uncore_version == 1) ?
+		thunder_lmc_pass2_events_attr : thunder_lmc_events_attr;
+
+	ret = thunder_uncore_setup(thunder_uncore_lmc,
+				   PCI_DEVICE_ID_THUNDER_LMC,
+				   LMC_COUNTER_START,
+				   LMC_COUNTER_END - LMC_COUNTER_START,
+				   &thunder_lmc_pmu);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_lmc->type = LMC_TYPE;
+	thunder_uncore_lmc->num_counters = (thunder_uncore_version == 1) ?
+		LMC_PASS2_NR_COUNTERS : LMC_NR_COUNTERS;
+	thunder_uncore_lmc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_lmc);
+fail_nomem:
+	return ret;
+}
-- 
1.9.1


* [RFC PATCH 5/7] arm64/perf: Cavium ThunderX OCX LNE uncore support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
                   ` (3 preceding siblings ...)
  2016-02-12 16:55 ` [RFC PATCH 4/7] arm64/perf: Cavium ThunderX LMC " Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 6/7] arm64/perf: Cavium ThunderX OCX FRC " Jan Glauber
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support the counters of the CCPI interface controller (OCX) lanes.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/uncore/Makefile                |   3 +-
 arch/arm64/kernel/uncore/uncore_cavium.c         |   3 +
 arch/arm64/kernel/uncore/uncore_cavium.h         |   4 +
 arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c | 270 +++++++++++++++++++++++
 4 files changed, 279 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c

diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
index 81479e8..da39f452 100644
--- a/arch/arm64/kernel/uncore/Makefile
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o		\
 			      uncore_cavium_l2c_tad.o	\
 			      uncore_cavium_l2c_cbc.o	\
-			      uncore_cavium_lmc.o
+			      uncore_cavium_lmc.o	\
+			      uncore_cavium_ocx_lne.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
index a972418..f2fbdea 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.c
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -28,6 +28,8 @@ struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
 		return thunder_uncore_l2c_cbc;
 	else if (event->pmu->type == thunder_lmc_pmu.type)
 		return thunder_uncore_lmc;
+	else if (event->pmu->type == thunder_ocx_lne_pmu.type)
+		return thunder_uncore_ocx_lne;
 	else
 		return NULL;
 }
@@ -215,6 +217,7 @@ static int __init thunder_uncore_init(void)
 	thunder_uncore_l2c_tad_setup();
 	thunder_uncore_l2c_cbc_setup();
 	thunder_uncore_lmc_setup();
+	thunder_uncore_ocx_lne_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
index 6e3beba..b9bcb42 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.h
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -16,6 +16,7 @@ enum uncore_type {
 	L2C_TAD_TYPE,
 	L2C_CBC_TYPE,
 	LMC_TYPE,
+	OCX_LNE_TYPE,
 };
 
 extern int thunder_uncore_version;
@@ -64,9 +65,11 @@ extern struct attribute_group thunder_uncore_attr_group;
 extern struct thunder_uncore *thunder_uncore_l2c_tad;
 extern struct thunder_uncore *thunder_uncore_l2c_cbc;
 extern struct thunder_uncore *thunder_uncore_lmc;
+extern struct thunder_uncore *thunder_uncore_ocx_lne;
 extern struct pmu thunder_l2c_tad_pmu;
 extern struct pmu thunder_l2c_cbc_pmu;
 extern struct pmu thunder_lmc_pmu;
+extern struct pmu thunder_ocx_lne_pmu;
 
 /* Prototypes */
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
@@ -83,3 +86,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
 int thunder_uncore_lmc_setup(void);
+int thunder_uncore_ocx_lne_setup(void);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c b/arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c
new file mode 100644
index 0000000..c2981b9
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium_ocx_lne.c
@@ -0,0 +1,270 @@
+/*
+ * Cavium Thunder uncore PMU support, OCX LNE counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+#ifndef PCI_DEVICE_ID_THUNDER_OCX
+#define PCI_DEVICE_ID_THUNDER_OCX		0xa013
+#endif
+
+#define OCX_LNE_NR_COUNTERS			15
+#define OCX_LNE_NR_UNITS			24
+#define OCX_LNE_UNIT_OFFSET			0x100
+#define OCX_LNE_CONTROL_OFFSET			0x8000
+#define OCX_LNE_COUNTER_OFFSET			0x40
+
+#define OCX_LNE_STAT_DISABLE			0
+#define OCX_LNE_STAT_ENABLE			1
+
+/* OCX LNE event list */
+#define OCX_LNE_EVENT_STAT00			0x00
+#define OCX_LNE_EVENT_STAT01			0x01
+#define OCX_LNE_EVENT_STAT02			0x02
+#define OCX_LNE_EVENT_STAT03			0x03
+#define OCX_LNE_EVENT_STAT04			0x04
+#define OCX_LNE_EVENT_STAT05			0x05
+#define OCX_LNE_EVENT_STAT06			0x06
+#define OCX_LNE_EVENT_STAT07			0x07
+#define OCX_LNE_EVENT_STAT08			0x08
+#define OCX_LNE_EVENT_STAT09			0x09
+#define OCX_LNE_EVENT_STAT10			0x0a
+#define OCX_LNE_EVENT_STAT11			0x0b
+#define OCX_LNE_EVENT_STAT12			0x0c
+#define OCX_LNE_EVENT_STAT13			0x0d
+#define OCX_LNE_EVENT_STAT14			0x0e
+
+struct thunder_uncore *thunder_uncore_ocx_lne;
+
+static inline void __iomem *map_offset_ocx_lne(unsigned long addr,
+				struct thunder_uncore *uncore, int unit)
+{
+	return (void __iomem *) (addr +
+				 uncore->pdevs[0].map +
+				 unit * OCX_LNE_UNIT_OFFSET);
+}
+
+/*
+ * Summarize counters across all LNEs. This differs from the other uncore
+ * PMUs because all LNEs sit behind a single PCI device.
+ */
+static void thunder_uncore_read_ocx_lne(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new = 0;
+	s64 delta;
+	int i;
+
+	/*
+	 * since we do not enable counter overflow interrupts,
+	 * we do not have to worry about prev_count changing on us
+	 */
+
+	prev = local64_read(&hwc->prev_count);
+
+	/* read counter values from all units */
+	for (i = 0; i < OCX_LNE_NR_UNITS; i++)
+		new += readq(map_offset_ocx_lne(hwc->event_base, uncore, i));
+
+	local64_set(&hwc->prev_count, new);
+	delta = new - prev;
+	local64_add(delta, &event->count);
+}
+
+static void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	hwc->state = 0;
+
+	/* enable counters on all units */
+	for (i = 0; i < OCX_LNE_NR_UNITS; i++)
+		writeb(OCX_LNE_STAT_ENABLE,
+		       map_offset_ocx_lne(hwc->config_base, uncore, i));
+
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	/* disable counters on all units */
+	for (i = 0; i < OCX_LNE_NR_UNITS; i++)
+		writeb(OCX_LNE_STAT_DISABLE,
+		       map_offset_ocx_lne(hwc->config_base, uncore, i));
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read_ocx_lne(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	WARN_ON_ONCE(!uncore);
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (uncore->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* counters are 1:1 */
+	hwc->idx = -1;
+	if (cmpxchg(&uncore->events[hwc->config], NULL, event) == NULL)
+		hwc->idx = hwc->config;
+
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = 0;
+	hwc->event_base = OCX_LNE_COUNTER_OFFSET +
+			hwc->idx * sizeof(unsigned long long);
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		/* counters are read-only, so avoid PERF_EF_RELOAD */
+		thunder_uncore_start(event, 0);
+
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-3");
+
+static struct attribute *thunder_ocx_lne_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_lne_format_group = {
+	.name = "format",
+	.attrs = thunder_ocx_lne_format_attr,
+};
+
+EVENT_ATTR(stat00,	OCX_LNE_EVENT_STAT00);
+EVENT_ATTR(stat01,	OCX_LNE_EVENT_STAT01);
+EVENT_ATTR(stat02,	OCX_LNE_EVENT_STAT02);
+EVENT_ATTR(stat03,	OCX_LNE_EVENT_STAT03);
+EVENT_ATTR(stat04,	OCX_LNE_EVENT_STAT04);
+EVENT_ATTR(stat05,	OCX_LNE_EVENT_STAT05);
+EVENT_ATTR(stat06,	OCX_LNE_EVENT_STAT06);
+EVENT_ATTR(stat07,	OCX_LNE_EVENT_STAT07);
+EVENT_ATTR(stat08,	OCX_LNE_EVENT_STAT08);
+EVENT_ATTR(stat09,	OCX_LNE_EVENT_STAT09);
+EVENT_ATTR(stat10,	OCX_LNE_EVENT_STAT10);
+EVENT_ATTR(stat11,	OCX_LNE_EVENT_STAT11);
+EVENT_ATTR(stat12,	OCX_LNE_EVENT_STAT12);
+EVENT_ATTR(stat13,	OCX_LNE_EVENT_STAT13);
+EVENT_ATTR(stat14,	OCX_LNE_EVENT_STAT14);
+
+static struct attribute *thunder_ocx_lne_events_attr[] = {
+	EVENT_PTR(stat00),
+	EVENT_PTR(stat01),
+	EVENT_PTR(stat02),
+	EVENT_PTR(stat03),
+	EVENT_PTR(stat04),
+	EVENT_PTR(stat05),
+	EVENT_PTR(stat06),
+	EVENT_PTR(stat07),
+	EVENT_PTR(stat08),
+	EVENT_PTR(stat09),
+	EVENT_PTR(stat10),
+	EVENT_PTR(stat11),
+	EVENT_PTR(stat12),
+	EVENT_PTR(stat13),
+	EVENT_PTR(stat14),
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_lne_events_group = {
+	.name = "events",
+	.attrs = thunder_ocx_lne_events_attr,
+};
+
+static const struct attribute_group *thunder_ocx_lne_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_ocx_lne_format_group,
+	&thunder_ocx_lne_events_group,
+	NULL,
+};
+
+struct pmu thunder_ocx_lne_pmu = {
+	.attr_groups	= thunder_ocx_lne_attr_groups,
+	.name		= "thunder_ocx_lne",
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read_ocx_lne,
+};
+
+static int event_valid(u64 config)
+{
+	if (config <= OCX_LNE_EVENT_STAT14)
+		return 1;
+	else
+		return 0;
+}
+
+int __init thunder_uncore_ocx_lne_setup(void)
+{
+	int ret;
+
+	thunder_uncore_ocx_lne = kzalloc(sizeof(struct thunder_uncore),
+					 GFP_KERNEL);
+	if (!thunder_uncore_ocx_lne) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	ret = thunder_uncore_setup(thunder_uncore_ocx_lne,
+				   PCI_DEVICE_ID_THUNDER_OCX,
+				   OCX_LNE_CONTROL_OFFSET,
+				   OCX_LNE_COUNTER_OFFSET + OCX_LNE_NR_COUNTERS
+					* sizeof(unsigned long long),
+				   &thunder_ocx_lne_pmu);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_ocx_lne->type = OCX_LNE_TYPE;
+	thunder_uncore_ocx_lne->num_counters = OCX_LNE_NR_COUNTERS;
+	thunder_uncore_ocx_lne->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_ocx_lne);
+fail_nomem:
+	return ret;
+}
-- 
1.9.1


* [RFC PATCH 6/7] arm64/perf: Cavium ThunderX OCX FRC uncore support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
                   ` (4 preceding siblings ...)
  2016-02-12 16:55 ` [RFC PATCH 5/7] arm64/perf: Cavium ThunderX OCX LNE " Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 16:55 ` [RFC PATCH 7/7] arm64/perf: Cavium ThunderX OCX TLK " Jan Glauber
  2016-02-12 17:00 ` [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Mark Rutland
  7 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support for the OCX alignment counters.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/uncore/Makefile                |   3 +-
 arch/arm64/kernel/uncore/uncore_cavium.c         |   3 +
 arch/arm64/kernel/uncore/uncore_cavium.h         |   4 +
 arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c | 248 +++++++++++++++++++++++
 4 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c

diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
index da39f452..752af39 100644
--- a/arch/arm64/kernel/uncore/Makefile
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -2,4 +2,5 @@ obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o		\
 			      uncore_cavium_l2c_tad.o	\
 			      uncore_cavium_l2c_cbc.o	\
 			      uncore_cavium_lmc.o	\
-			      uncore_cavium_ocx_lne.o
+			      uncore_cavium_ocx_lne.o	\
+			      uncore_cavium_ocx_frc.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
index f2fbdea..9fed1d6 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.c
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -30,6 +30,8 @@ struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
 		return thunder_uncore_lmc;
 	else if (event->pmu->type == thunder_ocx_lne_pmu.type)
 		return thunder_uncore_ocx_lne;
+	else if (event->pmu->type == thunder_ocx_frc_pmu.type)
+		return thunder_uncore_ocx_frc;
 	else
 		return NULL;
 }
@@ -218,6 +220,7 @@ static int __init thunder_uncore_init(void)
 	thunder_uncore_l2c_cbc_setup();
 	thunder_uncore_lmc_setup();
 	thunder_uncore_ocx_lne_setup();
+	thunder_uncore_ocx_frc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
index b9bcb42..07bd4f4 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.h
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -17,6 +17,7 @@ enum uncore_type {
 	L2C_CBC_TYPE,
 	LMC_TYPE,
 	OCX_LNE_TYPE,
+	OCX_FRC_TYPE,
 };
 
 extern int thunder_uncore_version;
@@ -66,10 +67,12 @@ extern struct thunder_uncore *thunder_uncore_l2c_tad;
 extern struct thunder_uncore *thunder_uncore_l2c_cbc;
 extern struct thunder_uncore *thunder_uncore_lmc;
 extern struct thunder_uncore *thunder_uncore_ocx_lne;
+extern struct thunder_uncore *thunder_uncore_ocx_frc;
 extern struct pmu thunder_l2c_tad_pmu;
 extern struct pmu thunder_l2c_cbc_pmu;
 extern struct pmu thunder_lmc_pmu;
 extern struct pmu thunder_ocx_lne_pmu;
+extern struct pmu thunder_ocx_frc_pmu;
 
 /* Prototypes */
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
@@ -87,3 +90,4 @@ int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
 int thunder_uncore_lmc_setup(void);
 int thunder_uncore_ocx_lne_setup(void);
+int thunder_uncore_ocx_frc_setup(void);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c b/arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c
new file mode 100644
index 0000000..7f62019
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium_ocx_frc.c
@@ -0,0 +1,248 @@
+/*
+ * Cavium Thunder uncore PMU support, OCX FRC counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+#ifndef PCI_DEVICE_ID_THUNDER_OCX
+#define PCI_DEVICE_ID_THUNDER_OCX		0xa013
+#endif
+
+#define OCX_FRC_NR_COUNTERS			4
+#define OCX_FRC_NR_UNITS			6
+#define OCX_FRC_UNIT_OFFSET			0x8
+#define OCX_FRC_COUNTER_OFFSET			0xfa00
+#define OCX_FRC_CONTROL_OFFSET			0xff00
+#define OCX_FRC_COUNTER_INC			0x80
+#define OCX_FRC_EVENT_MASK			0x1fffff
+#define OCX_FRC_STAT_CONTROL_BIT		37
+
+/* OCX FRC event list */
+#define OCX_FRC_EVENT_STAT0			0x0
+#define OCX_FRC_EVENT_STAT1			0x1
+#define OCX_FRC_EVENT_STAT2			0x2
+#define OCX_FRC_EVENT_STAT3			0x3
+
+struct thunder_uncore *thunder_uncore_ocx_frc;
+
+static inline void __iomem *map_offset_ocx_frc(unsigned long addr,
+				struct thunder_uncore *uncore, int unit)
+{
+	return (void __iomem *) (addr +
+				 uncore->pdevs[0].map +
+				 unit * OCX_FRC_UNIT_OFFSET);
+}
+
+/*
+ * Sum the counters across all FRCs. This differs from the other uncore
+ * PMUs because all FRCs live on one PCI device.
+ */
+static void thunder_uncore_read_ocx_frc(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new, sum = 0;
+	s64 delta;
+	int i;
+
+	/*
+	 * since we do not enable counter overflow interrupts,
+	 * we do not have to worry about prev_count changing on us
+	 */
+
+	prev = local64_read(&hwc->prev_count);
+
+	/* read counter values from all units */
+	for (i = 0; i < OCX_FRC_NR_UNITS; i++) {
+		new = readq(map_offset_ocx_frc(hwc->event_base, uncore, i));
+		sum += new & OCX_FRC_EVENT_MASK;
+	}
+
+	local64_set(&hwc->prev_count, sum);
+	delta = sum - prev;
+	local64_add(delta, &event->count);
+}
+
+static void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, ctl;
+	int i;
+
+	/* restore counter value divided by units into all counters */
+	if (flags & PERF_EF_RELOAD) {
+		prev = local64_read(&hwc->prev_count);
+		prev = (prev / uncore->nr_units) & OCX_FRC_EVENT_MASK;
+		for (i = 0; i < uncore->nr_units; i++)
+			writeq(prev, map_offset_ocx_frc(hwc->event_base,
+							uncore, i));
+	}
+
+
+	hwc->state = 0;
+
+	/* enable counters */
+	ctl = readq(hwc->config_base + uncore->pdevs[0].map);
+	ctl |= 1ULL << OCX_FRC_STAT_CONTROL_BIT;
+	writeq(ctl, hwc->config_base + uncore->pdevs[0].map);
+
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 ctl;
+
+	/* disable counters */
+	ctl = readq(hwc->config_base + uncore->pdevs[0].map);
+	ctl &= ~(1ULL << OCX_FRC_STAT_CONTROL_BIT);
+	writeq(ctl, hwc->config_base + uncore->pdevs[0].map);
+
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read_ocx_frc(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	WARN_ON_ONCE(!uncore);
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (uncore->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* counters are 1:1 */
+	hwc->idx = -1;
+	if (cmpxchg(&uncore->events[hwc->config], NULL, event) == NULL)
+		hwc->idx = hwc->config;
+
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = OCX_FRC_CONTROL_OFFSET - OCX_FRC_COUNTER_OFFSET;
+	hwc->event_base = hwc->idx * OCX_FRC_COUNTER_INC;
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-1");
+
+static struct attribute *thunder_ocx_frc_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_frc_format_group = {
+	.name = "format",
+	.attrs = thunder_ocx_frc_format_attr,
+};
+
+EVENT_ATTR(stat0,	OCX_FRC_EVENT_STAT0);
+EVENT_ATTR(stat1,	OCX_FRC_EVENT_STAT1);
+EVENT_ATTR(stat2,	OCX_FRC_EVENT_STAT2);
+EVENT_ATTR(stat3,	OCX_FRC_EVENT_STAT3);
+
+static struct attribute *thunder_ocx_frc_events_attr[] = {
+	EVENT_PTR(stat0),
+	EVENT_PTR(stat1),
+	EVENT_PTR(stat2),
+	EVENT_PTR(stat3),
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_frc_events_group = {
+	.name = "events",
+	.attrs = thunder_ocx_frc_events_attr,
+};
+
+static const struct attribute_group *thunder_ocx_frc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_ocx_frc_format_group,
+	&thunder_ocx_frc_events_group,
+	NULL,
+};
+
+struct pmu thunder_ocx_frc_pmu = {
+	.attr_groups	= thunder_ocx_frc_attr_groups,
+	.name		= "thunder_ocx_frc",
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read_ocx_frc,
+};
+
+static int event_valid(u64 config)
+{
+	if (config <= OCX_FRC_EVENT_STAT3)
+		return 1;
+	else
+		return 0;
+}
+
+int __init thunder_uncore_ocx_frc_setup(void)
+{
+	int ret;
+
+	thunder_uncore_ocx_frc = kzalloc(sizeof(struct thunder_uncore),
+					 GFP_KERNEL);
+	if (!thunder_uncore_ocx_frc) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	ret = thunder_uncore_setup(thunder_uncore_ocx_frc,
+			PCI_DEVICE_ID_THUNDER_OCX, OCX_FRC_COUNTER_OFFSET,
+			OCX_FRC_CONTROL_OFFSET - OCX_FRC_COUNTER_OFFSET
+				+ sizeof(unsigned long long),
+			&thunder_ocx_frc_pmu);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_ocx_frc->type = OCX_FRC_TYPE;
+	thunder_uncore_ocx_frc->num_counters = OCX_FRC_NR_COUNTERS;
+	thunder_uncore_ocx_frc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_ocx_frc);
+fail_nomem:
+	return ret;
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH 7/7] arm64/perf: Cavium ThunderX OCX TLK uncore support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
                   ` (5 preceding siblings ...)
  2016-02-12 16:55 ` [RFC PATCH 6/7] arm64/perf: Cavium ThunderX OCX FRC " Jan Glauber
@ 2016-02-12 16:55 ` Jan Glauber
  2016-02-12 17:00 ` [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Mark Rutland
  7 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-12 16:55 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support for the OCX transmit link counters.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 arch/arm64/kernel/uncore/Makefile                |   3 +-
 arch/arm64/kernel/uncore/uncore_cavium.c         |   3 +
 arch/arm64/kernel/uncore/uncore_cavium.h         |   4 +
 arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c | 366 +++++++++++++++++++++++
 4 files changed, 375 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c

diff --git a/arch/arm64/kernel/uncore/Makefile b/arch/arm64/kernel/uncore/Makefile
index 752af39..a267800 100644
--- a/arch/arm64/kernel/uncore/Makefile
+++ b/arch/arm64/kernel/uncore/Makefile
@@ -3,4 +3,5 @@ obj-$(CONFIG_ARCH_THUNDER) += uncore_cavium.o		\
 			      uncore_cavium_l2c_cbc.o	\
 			      uncore_cavium_lmc.o	\
 			      uncore_cavium_ocx_lne.o	\
-			      uncore_cavium_ocx_frc.o
+			      uncore_cavium_ocx_frc.o	\
+			      uncore_cavium_ocx_tlk.o
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.c b/arch/arm64/kernel/uncore/uncore_cavium.c
index 9fed1d6..e2edb7a4 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.c
+++ b/arch/arm64/kernel/uncore/uncore_cavium.c
@@ -32,6 +32,8 @@ struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event)
 		return thunder_uncore_ocx_lne;
 	else if (event->pmu->type == thunder_ocx_frc_pmu.type)
 		return thunder_uncore_ocx_frc;
+	else if (event->pmu->type == thunder_ocx_tlk_pmu.type)
+		return thunder_uncore_ocx_tlk;
 	else
 		return NULL;
 }
@@ -221,6 +223,7 @@ static int __init thunder_uncore_init(void)
 	thunder_uncore_lmc_setup();
 	thunder_uncore_ocx_lne_setup();
 	thunder_uncore_ocx_frc_setup();
+	thunder_uncore_ocx_tlk_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium.h b/arch/arm64/kernel/uncore/uncore_cavium.h
index 07bd4f4..f312d41 100644
--- a/arch/arm64/kernel/uncore/uncore_cavium.h
+++ b/arch/arm64/kernel/uncore/uncore_cavium.h
@@ -18,6 +18,7 @@ enum uncore_type {
 	LMC_TYPE,
 	OCX_LNE_TYPE,
 	OCX_FRC_TYPE,
+	OCX_TLK_TYPE,
 };
 
 extern int thunder_uncore_version;
@@ -68,11 +69,13 @@ extern struct thunder_uncore *thunder_uncore_l2c_cbc;
 extern struct thunder_uncore *thunder_uncore_lmc;
 extern struct thunder_uncore *thunder_uncore_ocx_lne;
 extern struct thunder_uncore *thunder_uncore_ocx_frc;
+extern struct thunder_uncore *thunder_uncore_ocx_tlk;
 extern struct pmu thunder_l2c_tad_pmu;
 extern struct pmu thunder_l2c_cbc_pmu;
 extern struct pmu thunder_lmc_pmu;
 extern struct pmu thunder_ocx_lne_pmu;
 extern struct pmu thunder_ocx_frc_pmu;
+extern struct pmu thunder_ocx_tlk_pmu;
 
 /* Prototypes */
 struct thunder_uncore *event_to_thunder_uncore(struct perf_event *event);
@@ -91,3 +94,4 @@ int thunder_uncore_l2c_cbc_setup(void);
 int thunder_uncore_lmc_setup(void);
 int thunder_uncore_ocx_lne_setup(void);
 int thunder_uncore_ocx_frc_setup(void);
+int thunder_uncore_ocx_tlk_setup(void);
diff --git a/arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c b/arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c
new file mode 100644
index 0000000..71ef3ae
--- /dev/null
+++ b/arch/arm64/kernel/uncore/uncore_cavium_ocx_tlk.c
@@ -0,0 +1,366 @@
+/*
+ * Cavium Thunder uncore PMU support, OCX TLK counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/io.h>
+#include <linux/perf_event.h>
+#include <linux/pci.h>
+
+#include <asm/cpufeature.h>
+#include <asm/cputype.h>
+
+#include "uncore_cavium.h"
+
+#ifndef PCI_DEVICE_ID_THUNDER_OCX
+#define PCI_DEVICE_ID_THUNDER_OCX		0xa013
+#endif
+
+#define OCX_TLK_NR_UNITS			3
+#define OCX_TLK_UNIT_OFFSET			0x2000
+#define OCX_TLK_CONTROL_OFFSET			0x10040
+#define OCX_TLK_COUNTER_OFFSET			0x10400
+
+#define OCX_TLK_STAT_DISABLE			0
+#define OCX_TLK_STAT_ENABLE			1
+
+/* OCX TLK event list */
+#define OCX_TLK_EVENT_STAT_IDLE_CNT		0x00
+#define OCX_TLK_EVENT_STAT_DATA_CNT		0x01
+#define OCX_TLK_EVENT_STAT_SYNC_CNT		0x02
+#define OCX_TLK_EVENT_STAT_RETRY_CNT		0x03
+#define OCX_TLK_EVENT_STAT_ERR_CNT		0x04
+
+#define OCX_TLK_EVENT_STAT_MAT0_CNT		0x08
+#define OCX_TLK_EVENT_STAT_MAT1_CNT		0x09
+#define OCX_TLK_EVENT_STAT_MAT2_CNT		0x0a
+#define OCX_TLK_EVENT_STAT_MAT3_CNT		0x0b
+
+#define OCX_TLK_EVENT_STAT_VC0_CMD		0x10
+#define OCX_TLK_EVENT_STAT_VC1_CMD		0x11
+#define OCX_TLK_EVENT_STAT_VC2_CMD		0x12
+#define OCX_TLK_EVENT_STAT_VC3_CMD		0x13
+#define OCX_TLK_EVENT_STAT_VC4_CMD		0x14
+#define OCX_TLK_EVENT_STAT_VC5_CMD		0x15
+
+#define OCX_TLK_EVENT_STAT_VC0_PKT		0x20
+#define OCX_TLK_EVENT_STAT_VC1_PKT		0x21
+#define OCX_TLK_EVENT_STAT_VC2_PKT		0x22
+#define OCX_TLK_EVENT_STAT_VC3_PKT		0x23
+#define OCX_TLK_EVENT_STAT_VC4_PKT		0x24
+#define OCX_TLK_EVENT_STAT_VC5_PKT		0x25
+#define OCX_TLK_EVENT_STAT_VC6_PKT		0x26
+#define OCX_TLK_EVENT_STAT_VC7_PKT		0x27
+#define OCX_TLK_EVENT_STAT_VC8_PKT		0x28
+#define OCX_TLK_EVENT_STAT_VC9_PKT		0x29
+#define OCX_TLK_EVENT_STAT_VC10_PKT		0x2a
+#define OCX_TLK_EVENT_STAT_VC11_PKT		0x2b
+#define OCX_TLK_EVENT_STAT_VC12_PKT		0x2c
+#define OCX_TLK_EVENT_STAT_VC13_PKT		0x2d
+
+#define OCX_TLK_EVENT_STAT_VC0_CON		0x30
+#define OCX_TLK_EVENT_STAT_VC1_CON		0x31
+#define OCX_TLK_EVENT_STAT_VC2_CON		0x32
+#define OCX_TLK_EVENT_STAT_VC3_CON		0x33
+#define OCX_TLK_EVENT_STAT_VC4_CON		0x34
+#define OCX_TLK_EVENT_STAT_VC5_CON		0x35
+#define OCX_TLK_EVENT_STAT_VC6_CON		0x36
+#define OCX_TLK_EVENT_STAT_VC7_CON		0x37
+#define OCX_TLK_EVENT_STAT_VC8_CON		0x38
+#define OCX_TLK_EVENT_STAT_VC9_CON		0x39
+#define OCX_TLK_EVENT_STAT_VC10_CON		0x3a
+#define OCX_TLK_EVENT_STAT_VC11_CON		0x3b
+#define OCX_TLK_EVENT_STAT_VC12_CON		0x3c
+#define OCX_TLK_EVENT_STAT_VC13_CON		0x3d
+
+#define OCX_TLK_MAX_COUNTER			OCX_TLK_EVENT_STAT_VC13_CON
+#define OCX_TLK_NR_COUNTERS			(OCX_TLK_MAX_COUNTER + 1)
+
+struct thunder_uncore *thunder_uncore_ocx_tlk;
+
+static inline void __iomem *map_offset_ocx_tlk(unsigned long addr,
+				struct thunder_uncore *uncore, int unit)
+{
+	return (void __iomem *) (addr +
+				 uncore->pdevs[0].map +
+				 unit * OCX_TLK_UNIT_OFFSET);
+}
+
+/*
+ * Sum the counters across all TLKs. This differs from the other uncore
+ * PMUs because all TLKs live on one PCI device.
+ */
+static void thunder_uncore_read_ocx_tlk(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev, new = 0;
+	s64 delta;
+	int i;
+
+	/*
+	 * since we do not enable counter overflow interrupts,
+	 * we do not have to worry about prev_count changing on us
+	 */
+
+	prev = local64_read(&hwc->prev_count);
+
+	/* read counter values from all units */
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		new += readq(map_offset_ocx_tlk(hwc->event_base, uncore, i));
+
+	local64_set(&hwc->prev_count, new);
+	delta = new - prev;
+	local64_add(delta, &event->count);
+}
+
+static void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	hwc->state = 0;
+
+	/* enable counters on all units */
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		writeb(OCX_TLK_STAT_ENABLE,
+		       map_offset_ocx_tlk(hwc->config_base, uncore, i));
+
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	/* disable counters on all units */
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		writeb(OCX_TLK_STAT_DISABLE,
+		       map_offset_ocx_tlk(hwc->config_base, uncore, i));
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read_ocx_tlk(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
+	struct hw_perf_event *hwc = &event->hw;
+	int i;
+
+	WARN_ON_ONCE(!uncore);
+
+	/* are we already assigned? */
+	if (hwc->idx != -1 && uncore->events[hwc->idx] == event)
+		goto out;
+
+	for (i = 0; i < uncore->num_counters; i++) {
+		if (uncore->events[i] == event) {
+			hwc->idx = i;
+			goto out;
+		}
+	}
+
+	/* counters are 1:1 */
+	hwc->idx = -1;
+	if (cmpxchg(&uncore->events[hwc->config], NULL, event) == NULL)
+		hwc->idx = hwc->config;
+
+out:
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = 0;
+	hwc->event_base = OCX_TLK_COUNTER_OFFSET - OCX_TLK_CONTROL_OFFSET +
+			hwc->idx * sizeof(unsigned long long);
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-5");
+
+static struct attribute *thunder_ocx_tlk_format_attr[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_tlk_format_group = {
+	.name = "format",
+	.attrs = thunder_ocx_tlk_format_attr,
+};
+
+EVENT_ATTR(idle_cnt,	OCX_TLK_EVENT_STAT_IDLE_CNT);
+EVENT_ATTR(data_cnt,	OCX_TLK_EVENT_STAT_DATA_CNT);
+EVENT_ATTR(sync_cnt,	OCX_TLK_EVENT_STAT_SYNC_CNT);
+EVENT_ATTR(retry_cnt,	OCX_TLK_EVENT_STAT_RETRY_CNT);
+EVENT_ATTR(err_cnt,	OCX_TLK_EVENT_STAT_ERR_CNT);
+EVENT_ATTR(mat0_cnt,	OCX_TLK_EVENT_STAT_MAT0_CNT);
+EVENT_ATTR(mat1_cnt,	OCX_TLK_EVENT_STAT_MAT1_CNT);
+EVENT_ATTR(mat2_cnt,	OCX_TLK_EVENT_STAT_MAT2_CNT);
+EVENT_ATTR(mat3_cnt,	OCX_TLK_EVENT_STAT_MAT3_CNT);
+EVENT_ATTR(vc0_cmd,	OCX_TLK_EVENT_STAT_VC0_CMD);
+EVENT_ATTR(vc1_cmd,	OCX_TLK_EVENT_STAT_VC1_CMD);
+EVENT_ATTR(vc2_cmd,	OCX_TLK_EVENT_STAT_VC2_CMD);
+EVENT_ATTR(vc3_cmd,	OCX_TLK_EVENT_STAT_VC3_CMD);
+EVENT_ATTR(vc4_cmd,	OCX_TLK_EVENT_STAT_VC4_CMD);
+EVENT_ATTR(vc5_cmd,	OCX_TLK_EVENT_STAT_VC5_CMD);
+EVENT_ATTR(vc0_pkt,	OCX_TLK_EVENT_STAT_VC0_PKT);
+EVENT_ATTR(vc1_pkt,	OCX_TLK_EVENT_STAT_VC1_PKT);
+EVENT_ATTR(vc2_pkt,	OCX_TLK_EVENT_STAT_VC2_PKT);
+EVENT_ATTR(vc3_pkt,	OCX_TLK_EVENT_STAT_VC3_PKT);
+EVENT_ATTR(vc4_pkt,	OCX_TLK_EVENT_STAT_VC4_PKT);
+EVENT_ATTR(vc5_pkt,	OCX_TLK_EVENT_STAT_VC5_PKT);
+EVENT_ATTR(vc6_pkt,	OCX_TLK_EVENT_STAT_VC6_PKT);
+EVENT_ATTR(vc7_pkt,	OCX_TLK_EVENT_STAT_VC7_PKT);
+EVENT_ATTR(vc8_pkt,	OCX_TLK_EVENT_STAT_VC8_PKT);
+EVENT_ATTR(vc9_pkt,	OCX_TLK_EVENT_STAT_VC9_PKT);
+EVENT_ATTR(vc10_pkt,	OCX_TLK_EVENT_STAT_VC10_PKT);
+EVENT_ATTR(vc11_pkt,	OCX_TLK_EVENT_STAT_VC11_PKT);
+EVENT_ATTR(vc12_pkt,	OCX_TLK_EVENT_STAT_VC12_PKT);
+EVENT_ATTR(vc13_pkt,	OCX_TLK_EVENT_STAT_VC13_PKT);
+EVENT_ATTR(vc0_con,	OCX_TLK_EVENT_STAT_VC0_CON);
+EVENT_ATTR(vc1_con,	OCX_TLK_EVENT_STAT_VC1_CON);
+EVENT_ATTR(vc2_con,	OCX_TLK_EVENT_STAT_VC2_CON);
+EVENT_ATTR(vc3_con,	OCX_TLK_EVENT_STAT_VC3_CON);
+EVENT_ATTR(vc4_con,	OCX_TLK_EVENT_STAT_VC4_CON);
+EVENT_ATTR(vc5_con,	OCX_TLK_EVENT_STAT_VC5_CON);
+EVENT_ATTR(vc6_con,	OCX_TLK_EVENT_STAT_VC6_CON);
+EVENT_ATTR(vc7_con,	OCX_TLK_EVENT_STAT_VC7_CON);
+EVENT_ATTR(vc8_con,	OCX_TLK_EVENT_STAT_VC8_CON);
+EVENT_ATTR(vc9_con,	OCX_TLK_EVENT_STAT_VC9_CON);
+EVENT_ATTR(vc10_con,	OCX_TLK_EVENT_STAT_VC10_CON);
+EVENT_ATTR(vc11_con,	OCX_TLK_EVENT_STAT_VC11_CON);
+EVENT_ATTR(vc12_con,	OCX_TLK_EVENT_STAT_VC12_CON);
+EVENT_ATTR(vc13_con,	OCX_TLK_EVENT_STAT_VC13_CON);
+
+static struct attribute *thunder_ocx_tlk_events_attr[] = {
+	EVENT_PTR(idle_cnt),
+	EVENT_PTR(data_cnt),
+	EVENT_PTR(sync_cnt),
+	EVENT_PTR(retry_cnt),
+	EVENT_PTR(err_cnt),
+	EVENT_PTR(mat0_cnt),
+	EVENT_PTR(mat1_cnt),
+	EVENT_PTR(mat2_cnt),
+	EVENT_PTR(mat3_cnt),
+	EVENT_PTR(vc0_cmd),
+	EVENT_PTR(vc1_cmd),
+	EVENT_PTR(vc2_cmd),
+	EVENT_PTR(vc3_cmd),
+	EVENT_PTR(vc4_cmd),
+	EVENT_PTR(vc5_cmd),
+	EVENT_PTR(vc0_pkt),
+	EVENT_PTR(vc1_pkt),
+	EVENT_PTR(vc2_pkt),
+	EVENT_PTR(vc3_pkt),
+	EVENT_PTR(vc4_pkt),
+	EVENT_PTR(vc5_pkt),
+	EVENT_PTR(vc6_pkt),
+	EVENT_PTR(vc7_pkt),
+	EVENT_PTR(vc8_pkt),
+	EVENT_PTR(vc9_pkt),
+	EVENT_PTR(vc10_pkt),
+	EVENT_PTR(vc11_pkt),
+	EVENT_PTR(vc12_pkt),
+	EVENT_PTR(vc13_pkt),
+	EVENT_PTR(vc0_con),
+	EVENT_PTR(vc1_con),
+	EVENT_PTR(vc2_con),
+	EVENT_PTR(vc3_con),
+	EVENT_PTR(vc4_con),
+	EVENT_PTR(vc5_con),
+	EVENT_PTR(vc6_con),
+	EVENT_PTR(vc7_con),
+	EVENT_PTR(vc8_con),
+	EVENT_PTR(vc9_con),
+	EVENT_PTR(vc10_con),
+	EVENT_PTR(vc11_con),
+	EVENT_PTR(vc12_con),
+	EVENT_PTR(vc13_con),
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_tlk_events_group = {
+	.name = "events",
+	.attrs = thunder_ocx_tlk_events_attr,
+};
+
+static const struct attribute_group *thunder_ocx_tlk_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_ocx_tlk_format_group,
+	&thunder_ocx_tlk_events_group,
+	NULL,
+};
+
+struct pmu thunder_ocx_tlk_pmu = {
+	.attr_groups	= thunder_ocx_tlk_attr_groups,
+	.name		= "thunder_ocx_tlk",
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read_ocx_tlk,
+};
+
+static int event_valid(u64 config)
+{
+	if (config <= OCX_TLK_EVENT_STAT_ERR_CNT ||
+	    (config >= OCX_TLK_EVENT_STAT_MAT0_CNT &&
+	     config <= OCX_TLK_EVENT_STAT_MAT3_CNT) ||
+	    (config >= OCX_TLK_EVENT_STAT_VC0_CMD &&
+	     config <= OCX_TLK_EVENT_STAT_VC5_CMD) ||
+	    (config >= OCX_TLK_EVENT_STAT_VC0_PKT &&
+	     config <= OCX_TLK_EVENT_STAT_VC13_PKT) ||
+	    (config >= OCX_TLK_EVENT_STAT_VC0_CON &&
+	     config <= OCX_TLK_EVENT_STAT_VC13_CON))
+		return 1;
+	else
+		return 0;
+}
+
+int __init thunder_uncore_ocx_tlk_setup(void)
+{
+	int ret;
+
+	thunder_uncore_ocx_tlk = kzalloc(sizeof(struct thunder_uncore),
+					 GFP_KERNEL);
+	if (!thunder_uncore_ocx_tlk) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	ret = thunder_uncore_setup(thunder_uncore_ocx_tlk,
+				   PCI_DEVICE_ID_THUNDER_OCX,
+				   OCX_TLK_CONTROL_OFFSET,
+				   OCX_TLK_UNIT_OFFSET * OCX_TLK_NR_UNITS,
+				   &thunder_ocx_tlk_pmu);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_ocx_tlk->type = OCX_TLK_TYPE;
+	thunder_uncore_ocx_tlk->num_counters = OCX_TLK_NR_COUNTERS;
+	thunder_uncore_ocx_tlk->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_ocx_tlk);
+fail_nomem:
+	return ret;
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 0/7] Cavium ThunderX uncore PMU support
  2016-02-12 16:55 [RFC PATCH 0/7] Cavium ThunderX uncore PMU support Jan Glauber
                   ` (6 preceding siblings ...)
  2016-02-12 16:55 ` [RFC PATCH 7/7] arm64/perf: Cavium ThunderX OCX TLK " Jan Glauber
@ 2016-02-12 17:00 ` Mark Rutland
  7 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2016-02-12 17:00 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Fri, Feb 12, 2016 at 05:55:05PM +0100, Jan Glauber wrote:
> Hi,
> 
> this patch series provides access to various counters on the ThunderX SOC.
> 
> For details of the implementation see patch #1.
> 
> Patches #2-7 add ther various ThunderX specific PMUs.
> 
> I did not want to put these file into arch/arm64/kernel so I added a
> "uncore" directory. Maybe this should be put somewhere under drivers/
> instead.

This is probably better suited to drivers/perf/, or drivers/bus/ (where
CCN and CCI perf drivers live).

Mark.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-12 16:55 ` [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX Jan Glauber
@ 2016-02-12 17:36   ` Mark Rutland
  2016-02-13  1:47     ` David Daney
                       ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Mark Rutland @ 2016-02-12 17:36 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:
> Provide uncore facilities for non-CPU performance counter units.
> Based on Intel/AMD uncore pmu support.
> 
> The uncore PMUs can be found under /sys/bus/event_source/devices.
> All counters are exported via sysfs in the corresponding events
> files under the PMU directory so the perf tool can list the event names.

It turns out that "uncore" covers quite a lot of things.

Where exactly do these counters live? System, socket, cluster?

Are there potentially multiple instances of a given PMU in the system?
e.g. might each cluster have an instance of an L2 PMU?

If I turn off a set of CPUs, do any "uncore" PMUs lose context or become
inaccessible?

Otherwise, are they associated with some power domain?

> There are 2 points that are special in this implementation:
> 
> 1) The PMU detection solely relies on PCI device detection. If a
>    matching PCI device is found the PMU is created. The code can deal
>    with multiple units of the same type, e.g. more than one memory
>    controller.

I see below that the driver has an initcall that runs regardless of
whether the PCI device exists, and looks at the MIDR. That's clearly not
purely PCI device detection.

Why is this not a true PCI driver that only gets probed if the PCI
device exists? 

> 2) Counters are summarized across the different units of the same type,
>    e.g. L2C TAD 0..7 is presented as a single counter (adding the
>    values from TAD 0 to 7). Although losing the ability to read a
>    single value the merged values are easier to use and yield
>    enough information.

I'm not sure I follow this. What is easier? What are you doing, and what
are you comparing that with to say that your approach is easier?

It sounds like it should be possible to handle multiple counters like
this, so I don't follow why you want to amalgamate them in-kernel.

[...]

> +#include <asm/cpufeature.h>
> +#include <asm/cputype.h>

I don't see why you should need these two if this is truly an uncore
device probed solely from PCI.

> +void thunder_uncore_read(struct perf_event *event)
> +{
> +	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
> +	struct hw_perf_event *hwc = &event->hw;
> +	u64 prev, new = 0;
> +	s64 delta;
> +	int i;
> +
> +	/*
> +	 * since we do not enable counter overflow interrupts,
> +	 * we do not have to worry about prev_count changing on us
> +	 */

Without overflow interrupts, how do you ensure that you account for
overflow in a reasonable time window (i.e. before the counter runs past
its initial value)?

> +
> +	prev = local64_read(&hwc->prev_count);
> +
> +	/* read counter values from all units */
> +	for (i = 0; i < uncore->nr_units; i++)
> +		new += readq(map_offset(hwc->event_base, uncore, i));

There's no bit to determine whether an overflow occurred?

> +
> +	local64_set(&hwc->prev_count, new);
> +	delta = new - prev;
> +	local64_add(delta, &event->count);
> +}
> +
> +void thunder_uncore_del(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
> +	struct hw_perf_event *hwc = &event->hw;
> +	int i;
> +
> +	event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +	for (i = 0; i < uncore->num_counters; i++) {
> +		if (cmpxchg(&uncore->events[i], event, NULL) == event)
> +			break;
> +	}
> +	hwc->idx = -1;
> +}

Why not just place the event at uncore->events[hwc->idx] ?

That way removing the event is trivial.

> +int thunder_uncore_event_init(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore *uncore;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* we do not support sampling */
> +	if (is_sampling_event(event))
> +		return -EINVAL;
> +
> +	/* counters do not have these bits */
> +	if (event->attr.exclude_user	||
> +	    event->attr.exclude_kernel	||
> +	    event->attr.exclude_host	||
> +	    event->attr.exclude_guest	||
> +	    event->attr.exclude_hv	||
> +	    event->attr.exclude_idle)
> +		return -EINVAL;

We should _really_ make these features opt-in at the core level. It's
crazy that each and every PMU driver has to explicitly test and reject
things it doesn't support.

> +
> +	/* and we do not enable counter overflow interrupts */

That statement raises far more questions than it answers.

_why_ do we not use overflow interrupts?

> +
> +	uncore = event_to_thunder_uncore(event);
> +	if (!uncore)
> +		return -ENODEV;
> +	if (!uncore->event_valid(event->attr.config))
> +		return -EINVAL;
> +
> +	hwc->config = event->attr.config;
> +	hwc->idx = -1;
> +
> +	/* and we don't care about CPU */

Actually, you do. You want the perf core to serialize accesses via the
same CPU, so all events _must_ be targeted at the same CPU. Otherwise
there are a tonne of problems you don't even want to think about.

You _must_ ensure this kernel-side, regardless of what the perf tool
happens to do.

See the arm-cci and arm-ccn drivers for an example.

You can also follow the migration approach used there to allow you to
retain counting across a hotplug.

[...]

> +static int __init thunder_uncore_init(void)
> +{
> +	unsigned long implementor = read_cpuid_implementor();
> +	unsigned long part_number = read_cpuid_part_number();
> +	u32 variant;
> +
> +	if (implementor != ARM_CPU_IMP_CAVIUM ||
> +	    part_number != CAVIUM_CPU_PART_THUNDERX)
> +		return -ENODEV;
> +
> +	/* detect pass2 which contains different counters */
> +	variant = MIDR_VARIANT(read_cpuid_id());
> +	if (variant == 1)
> +		thunder_uncore_version = 1;
> +	pr_info("PMU version: %d\n", thunder_uncore_version);
> +
> +	return 0;
> +}

You should call out these differences in the commit message.

Mark.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-12 17:36   ` Mark Rutland
@ 2016-02-13  1:47     ` David Daney
  2016-02-15 11:33       ` Mark Rutland
  2016-02-15 14:07     ` Jan Glauber
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: David Daney @ 2016-02-13  1:47 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Jan Glauber, Will Deacon, linux-kernel, linux-arm-kernel

On 02/12/2016 09:36 AM, Mark Rutland wrote:
> On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:
[...]
>> 2) Counters are summarized across the different units of the same type,
>>     e.g. L2C TAD 0..7 is presented as a single counter (adding the
>>     values from TAD 0 to 7). Although losing the ability to read a
>>     single value the merged values are easier to use and yield
>>     enough information.
>
> I'm not sure I follow this. What is easier? What are you doing, and what
> are you comparing that with to say that your approach is easier?
>
> It sounds like it should be possible to handle multiple counters like
> this, so I don't follow why you want to amalgamate them in-kernel.
>

The values of the individual counters are close to meaningless.  The 
only thing that is meaningful to someone reading the counters is the 
aggregate sum of all the counts.


> [...]
>
>> +#include <asm/cpufeature.h>
>> +#include <asm/cputype.h>
>
> I don't see why you should need these two if this is truly an uncore
> device probed solely from PCI.
>
>> +void thunder_uncore_read(struct perf_event *event)
>> +{
>> +	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	u64 prev, new = 0;
>> +	s64 delta;
>> +	int i;
>> +
>> +	/*
>> +	 * since we do not enable counter overflow interrupts,
>> +	 * we do not have to worry about prev_count changing on us
>> +	 */
>
> Without overflow interrupts, how do you ensure that you account for
> overflow in a reasonable time window (i.e. before the counter runs past
> its initial value)?

Two reasons:

   1) There are no interrupts.

   2) The counters are 64-bit, they never overflow.

>
>> +
>> +	prev = local64_read(&hwc->prev_count);
>> +
>> +	/* read counter values from all units */
>> +	for (i = 0; i < uncore->nr_units; i++)
>> +		new += readq(map_offset(hwc->event_base, uncore, i));
>
> There's no bit to determine whether an overflow occurred?

No.


>
>> +
>> +	local64_set(&hwc->prev_count, new);
>> +	delta = new - prev;
>> +	local64_add(delta, &event->count);
>> +}
>> +
>> +void thunder_uncore_del(struct perf_event *event, int flags)
>> +{
>> +	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	int i;
>> +
>> +	event->pmu->stop(event, PERF_EF_UPDATE);
>> +
>> +	for (i = 0; i < uncore->num_counters; i++) {
>> +		if (cmpxchg(&uncore->events[i], event, NULL) == event)
>> +			break;
>> +	}
>> +	hwc->idx = -1;
>> +}
>
> Why not just place the event at uncore->events[hwc->idx] ?
>
> That way removing the event is trivial.
>
>> +int thunder_uncore_event_init(struct perf_event *event)
>> +{
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	struct thunder_uncore *uncore;
>> +
>> +	if (event->attr.type != event->pmu->type)
>> +		return -ENOENT;
>> +
>> +	/* we do not support sampling */
>> +	if (is_sampling_event(event))
>> +		return -EINVAL;
>> +
>> +	/* counters do not have these bits */
>> +	if (event->attr.exclude_user	||
>> +	    event->attr.exclude_kernel	||
>> +	    event->attr.exclude_host	||
>> +	    event->attr.exclude_guest	||
>> +	    event->attr.exclude_hv	||
>> +	    event->attr.exclude_idle)
>> +		return -EINVAL;
>
> We should _really_ make these features opt-in at the core level. It's
> crazy that each and every PMU driver has to explicitly test and reject
> things it doesn't support.
>
>> +
>> +	/* and we do not enable counter overflow interrupts */
>
> That statement raises far more questions than it answers.
>
> _why_ do we not use overflow interrupts?

As stated above, there are *no* overflow interrupts.

The events we are counting cannot be attributed to any one (or even any) 
CPU, they run asynchronous to the CPU, so even if there were interrupts, 
they would be meaningless.


David Daney


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-13  1:47     ` David Daney
@ 2016-02-15 11:33       ` Mark Rutland
  0 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2016-02-15 11:33 UTC (permalink / raw)
  To: David Daney; +Cc: Jan Glauber, Will Deacon, linux-kernel, linux-arm-kernel

On Fri, Feb 12, 2016 at 05:47:25PM -0800, David Daney wrote:
> On 02/12/2016 09:36 AM, Mark Rutland wrote:
> >On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:
> [...]
> >>2) Counters are summarized across the different units of the same type,
> >>    e.g. L2C TAD 0..7 is presented as a single counter (adding the
> >>    values from TAD 0 to 7). Although losing the ability to read a
> >>    single value the merged values are easier to use and yield
> >>    enough information.
> >
> >I'm not sure I follow this. What is easier? What are you doing, and what
> >are you comparing that with to say that your approach is easier?
> >
> >It sounds like it should be possible to handle multiple counters like
> >this, so I don't follow why you want to amalgamate them in-kernel.
> >
> 
> The values of the individual counters are close to meaningless.  The
> only thing that is meaningful to someone reading the counters is the
> aggregate sum of all the counts.

I obviously know nowhere near enough about your system to say with
certainty, but it may turn out that being able to track counters
individually is useful for some profiling/debugging scenario. How
meaningful the individual counts are really depends on what you're
trying to figure out.

If you believe that aggregate values are sufficient, then I'm happy to
leave that as-is.

> >>+void thunder_uncore_read(struct perf_event *event)
> >>+{
> >>+	struct thunder_uncore *uncore = event_to_thunder_uncore(event);
> >>+	struct hw_perf_event *hwc = &event->hw;
> >>+	u64 prev, new = 0;
> >>+	s64 delta;
> >>+	int i;
> >>+
> >>+	/*
> >>+	 * since we do not enable counter overflow interrupts,
> >>+	 * we do not have to worry about prev_count changing on us
> >>+	 */
> >
> >Without overflow interrupts, how do you ensure that you account for
> >overflow in a reasonable time window (i.e. before the counter runs past
> >its initial value)?
> 
> Two reasons:
> 
>   1) There are no interrupts.
> 
>   2) The counters are 64-bit, they never overflow.

Ok. Please point this out in the comment so that reviewers aren't
misled. Stating that we don't enable an interrupt implies that said
interrupt exists.

> >>+	/* and we do not enable counter overflow interrupts */
> >
> >That statement raises far more questions than it answers.
> >
> >_why_ do we not use overflow interrupts?
> 
> As stated above, there are *no* overflow interrupts.

Ok. As stated above, please fix this comment to not mislead.

> The events we are counting cannot be attributed to any one (or even
> any) CPU, they run asynchronous to the CPU, so even if there were
> interrupts, they would be meaningless.

Yes, they are meaningless w.r.t. the state of an arbitrary CPU.

Were they to exist you could use them to drive other snapshotting of the
state of the uncore PMU, to get an idea of the frequency/stability of
events over time, etc. Userspace might then decide to snapshot other
whole system state based on events fed to it.

That's moot if they don't exist.

Mark.


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-12 17:36   ` Mark Rutland
  2016-02-13  1:47     ` David Daney
@ 2016-02-15 14:07     ` Jan Glauber
  2016-02-15 14:27       ` Mark Rutland
  2016-02-16  8:41     ` Jan Glauber
  2016-03-11 10:54     ` Jan Glauber
  3 siblings, 1 reply; 18+ messages in thread
From: Jan Glauber @ 2016-02-15 14:07 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

Hi Mark,

thanks for reviewing! I'll need several mails to address all questions.

On Fri, Feb 12, 2016 at 05:36:59PM +0000, Mark Rutland wrote:
> On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:
> > Provide uncore facilities for non-CPU performance counter units.
> > Based on Intel/AMD uncore pmu support.
> > 
> > The uncore PMUs can be found under /sys/bus/event_source/devices.
> > All counters are exported via sysfs in the corresponding events
> > files under the PMU directory so the perf tool can list the event names.
> 
> It turns out that "uncore" covers quite a lot of things.
> 
> Where exactly do these counters live? system, socket, cluster?

Neither cluster nor socket, so I would say system. Where a system may
consist of 2 nodes that mostly appear as one system.

> Are there potentially multiple instances of a given PMU in the system?
> e.g. might each cluster have an instance of an L2 PMU?

Yes.

> If I turn off a set of CPUs, do any "uncore" PMUs lose context or become
> inaccessible?

No, they are not related to CPUs.

[...]

> > 
> > 1) The PMU detection solely relies on PCI device detection. If a
> >    matching PCI device is found the PMU is created. The code can deal
> >    with multiple units of the same type, e.g. more than one memory
> >    controller.
> 
> I see below that the driver has an initcall that runs regardless of
> whether the PCI device exists, and looks at the MIDR. That's clearly not
> strict PCI device detection.
> 
> Why is this not a true PCI driver that only gets probed if the PCI
> device exists? 

It is not a PCI driver because there are already drivers like edac that
will access these PCI devices. The uncore driver only accesses the
performance counters, which are not used by the other drivers.

[...]

> > +#include <asm/cpufeature.h>
> > +#include <asm/cputype.h>
> 
> I don't see why you should need these two if this is truly an uncore
> device probed solely from PCI.

There are several passes of the hardware that have the same PCI device
ID. Therefore I need the CPU variant to distinguish them. This could
be done _after_ the PCI device is found but I found it easier to
implement the check once in the common setup function.

[...]

> > +int thunder_uncore_event_init(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore *uncore;
> > +
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	/* we do not support sampling */
> > +	if (is_sampling_event(event))
> > +		return -EINVAL;
> > +
> > +	/* counters do not have these bits */
> > +	if (event->attr.exclude_user	||
> > +	    event->attr.exclude_kernel	||
> > +	    event->attr.exclude_host	||
> > +	    event->attr.exclude_guest	||
> > +	    event->attr.exclude_hv	||
> > +	    event->attr.exclude_idle)
> > +		return -EINVAL;
> 
> We should _really_ make these features opt-in at the core level. It's
> > crazy that each and every PMU driver has to explicitly test and reject
> things it doesn't support.

Completely agreed. Also, all the sample code I looked at checked
for other bits...

[...]

> > +
> > +	uncore = event_to_thunder_uncore(event);
> > +	if (!uncore)
> > +		return -ENODEV;
> > +	if (!uncore->event_valid(event->attr.config))
> > +		return -EINVAL;
> > +
> > +	hwc->config = event->attr.config;
> > +	hwc->idx = -1;
> > +
> > +	/* and we don't care about CPU */
> 
> Actually, you do. You want the perf core to serialize accesses via the
> same CPU, so all events _must_ be targeted at the same CPU. Otherwise
> there are a tonne of problems you don't even want to think about.

I found that perf added the events on every CPU in the system. Because
the uncore events are not CPU related I wanted to avoid this. Setting
cpumask to -1 did not work. Therefore I added a single CPU in the
cpumask, see thunder_uncore_attr_show_cpumask().

> You _must_ ensure this kernel-side, regardless of what the perf tool
> happens to do.
> 
> See the arm-cci and arm-ccn drivers for an example.
> 
> You can also follow the migration approach used there to allow you to
> retain counting across a hotplug.
> 
> [...]
> 
> > +static int __init thunder_uncore_init(void)
> > +{
> > +	unsigned long implementor = read_cpuid_implementor();
> > +	unsigned long part_number = read_cpuid_part_number();
> > +	u32 variant;
> > +
> > +	if (implementor != ARM_CPU_IMP_CAVIUM ||
> > +	    part_number != CAVIUM_CPU_PART_THUNDERX)
> > +		return -ENODEV;
> > +
> > +	/* detect pass2 which contains different counters */
> > +	variant = MIDR_VARIANT(read_cpuid_id());
> > +	if (variant == 1)
> > +		thunder_uncore_version = 1;
> > +	pr_info("PMU version: %d\n", thunder_uncore_version);
> > +
> > +	return 0;
> > +}
> 
> You should call out these differences in the commit message.

OK

> Mark.

Thanks, Jan


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-15 14:07     ` Jan Glauber
@ 2016-02-15 14:27       ` Mark Rutland
  2016-02-15 14:46         ` Mark Rutland
  2016-02-15 15:34         ` Jan Glauber
  0 siblings, 2 replies; 18+ messages in thread
From: Mark Rutland @ 2016-02-15 14:27 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Mon, Feb 15, 2016 at 03:07:20PM +0100, Jan Glauber wrote:
> Hi Mark,
> 
> thanks for reviewing! I'll need several mails to address all questions.
> 
> On Fri, Feb 12, 2016 at 05:36:59PM +0000, Mark Rutland wrote:
> > On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:
> > > Provide uncore facilities for non-CPU performance counter units.
> > > Based on Intel/AMD uncore pmu support.
> > > 
> > > The uncore PMUs can be found under /sys/bus/event_source/devices.
> > > All counters are exported via sysfs in the corresponding events
> > > files under the PMU directory so the perf tool can list the event names.
> > 
> > It turns out that "uncore" covers quite a lot of things.
> > 
> > Where exactly do these counters live? system, socket, cluster?
> 
> Neither cluster nor socket, so I would say system. Where a system may
> consist of 2 nodes that mostly appear as one system.
> 
> > Are there potentially multiple instances of a given PMU in the system?
> > e.g. might each cluster have an instance of an L2 PMU?
> 
> Yes.
> 
> > If I turn off a set of CPUs, do any "uncore" PMUs lose context or become
> > inaccessible?
> 
> No, they are not related to CPUs.

Ok. So I should be able to concurrently hotplug random CPUs on/off while
this driver is running, without issues? No registers might be
clock-gated or similar?

I appreciate that they are not "related" to particular CPUs as such.

> > > 1) The PMU detection solely relies on PCI device detection. If a
> > >    matching PCI device is found the PMU is created. The code can deal
> > >    with multiple units of the same type, e.g. more than one memory
> > >    controller.
> > 
> > I see below that the driver has an initcall that runs regardless of
> > whether the PCI device exists, and looks at the MIDR. That's clearly not
> > strict PCI device detection.
> > 
> > Why is this not a true PCI driver that only gets probed if the PCI
> > device exists? 
> 
> It is not a PCI driver because there are already drivers like edac that
> will access these PCI devices. The uncore driver only accesses the
> performance counters, which are not used by the other drivers.

Several drivers are accessing the same device?

That sounds somewhat scary.

> > > +#include <asm/cpufeature.h>
> > > +#include <asm/cputype.h>
> > 
> > I don't see why you should need these two if this is truly an uncore
> > device probed solely from PCI.
> 
> There are several passes of the hardware that have the same PCI device
> ID. Therefore I need the CPU variant to distinguish them. This could
> be done _after_ the PCI device is found but I found it easier to
> implement the check once in the common setup function.

Ok. Please call that out in the commit message.

> > > +int thunder_uncore_event_init(struct perf_event *event)
> > > +{
> > > +	struct hw_perf_event *hwc = &event->hw;
> > > +	struct thunder_uncore *uncore;
> > > +
> > > +	if (event->attr.type != event->pmu->type)
> > > +		return -ENOENT;
> > > +
> > > +	/* we do not support sampling */
> > > +	if (is_sampling_event(event))
> > > +		return -EINVAL;
> > > +
> > > +	/* counters do not have these bits */
> > > +	if (event->attr.exclude_user	||
> > > +	    event->attr.exclude_kernel	||
> > > +	    event->attr.exclude_host	||
> > > +	    event->attr.exclude_guest	||
> > > +	    event->attr.exclude_hv	||
> > > +	    event->attr.exclude_idle)
> > > +		return -EINVAL;
> > 
> > We should _really_ make these features opt-in at the core level. It's
> > crazy that each and every PMU driver has to explicitly test and reject
> > things it doesn't support.
> 
> Completely agreed. Also, all the sample code I looked at checked
> for other bits...
> 
> [...]
> 
> > > +
> > > +	uncore = event_to_thunder_uncore(event);
> > > +	if (!uncore)
> > > +		return -ENODEV;
> > > +	if (!uncore->event_valid(event->attr.config))
> > > +		return -EINVAL;
> > > +
> > > +	hwc->config = event->attr.config;
> > > +	hwc->idx = -1;
> > > +
> > > +	/* and we don't care about CPU */
> > 
> > Actually, you do. You want the perf core to serialize accesses via the
> > same CPU, so all events _must_ be targeted at the same CPU. Otherwise
> > there are a tonne of problems you don't even want to think about.
> 
> I found that perf added the events on every CPU in the system. Because
> the uncore events are not CPU related I wanted to avoid this. Setting
> cpumask to -1 did not work. Therefore I added a single CPU in the
> cpumask, see thunder_uncore_attr_show_cpumask().

I understand that, which is why I wrote:

> > You _must_ ensure this kernel-side, regardless of what the perf tool
> > happens to do.
> > 
> > See the arm-cci and arm-ccn drivers for an example.

Take a look at drivers/bus/arm-cci.c; specifically, what we do in
cci_pmu_event_init and cci_pmu_cpu_notifier.

This is the same thing that's done for x86 system PMUs. Take a look at
uncore_pmu_event_init in arch/x86/kernel/cpu/perf_event_intel_uncore.c.

Otherwise there are a number of situations where userspace might open
events on different CPUs, and you get some freaky results because the
perf core expects accesses to a PMU and its related data structures to
be strictly serialised through _some_ CPU (even if that CPU is
arbitrarily chosen).

For example, if CPU 0 was offline when one event was opened, then cpu0
was hotplugged, then a second event was opened, there would be an event
in the CPU1 context and another in the CPU0 context. Modification to the
PMU state is done per CPU-context in the core, and these would race
against each other given the lack of serialisation. Even with sufficient
locking to prevent outright corruption, things like event rotation would
race leading to very non-deterministic results.

Thanks,
Mark.


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-15 14:27       ` Mark Rutland
@ 2016-02-15 14:46         ` Mark Rutland
  2016-02-15 15:34         ` Jan Glauber
  1 sibling, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2016-02-15 14:46 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Mon, Feb 15, 2016 at 02:27:27PM +0000, Mark Rutland wrote:
> > > > +	uncore = event_to_thunder_uncore(event);
> > > > +	if (!uncore)
> > > > +		return -ENODEV;
> > > > +	if (!uncore->event_valid(event->attr.config))
> > > > +		return -EINVAL;
> > > > +
> > > > +	hwc->config = event->attr.config;
> > > > +	hwc->idx = -1;
> > > > +
> > > > +	/* and we don't care about CPU */
> > > 
> > > Actually, you do. You want the perf core to serialize accesses via the
> > > same CPU, so all events _must_ be targeted at the same CPU. Otherwise
> > > there are a tonne of problems you don't even want to think about.
> > 
> > I found that perf added the events on every CPU in the system. Because
> > the uncore events are not CPU related I wanted to avoid this. Setting
> > cpumask to -1 did not work. Therefore I added a single CPU in the
> > cpumask, see thunder_uncore_attr_show_cpumask().
> 
> I understand that, which is why I wrote:
> 
> > > You _must_ ensure this kernel-side, regardless of what the perf tool
> > > happens to do.
> > > 
> > > See the arm-cci and arm-ccn drivers for an example.
> 
> Take a look at drivers/bus/arm-cci.c; specifically, what we do in
> cci_pmu_event_init and cci_pmu_cpu_notifier.
> 
> This is the same thing that's done for x86 system PMUs. Take a look at
> uncore_pmu_event_init in arch/x86/kernel/cpu/perf_event_intel_uncore.c.

I note that we still have an open TODO rather than a call to
perf_pmu_migrate_context.

The better example is arm_ccn_pmu_cpu_notifier in drivers/bus/arm-ccn.c.

Mark.


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-15 14:27       ` Mark Rutland
  2016-02-15 14:46         ` Mark Rutland
@ 2016-02-15 15:34         ` Jan Glauber
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-15 15:34 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Mon, Feb 15, 2016 at 02:27:27PM +0000, Mark Rutland wrote:
> > > > 1) The PMU detection solely relies on PCI device detection. If a
> > > >    matching PCI device is found the PMU is created. The code can deal
> > > >    with multiple units of the same type, e.g. more than one memory
> > > >    controller.
> > > 
> > > I see below that the driver has an initcall that runs regardless of
> > > whether the PCI device exists, and looks at the MIDR. That's clearly not
> > > strict PCI device detection.
> > > 
> > > Why is this not a true PCI driver that only gets probed if the PCI
> > > device exists? 
> > 
> > It is not a PCI driver because there are already drivers like edac that
> > will access these PCI devices. The uncore driver only accesses the
> > performance counters, which are not used by the other drivers.
> 
> Several drivers are accessing the same device?
> 
> That sounds somewhat scary.

I've double checked that the edac drivers do not access the performance
counters at all. It felt cleaner to me to keep the performance counter code
separated from edac.

Jan


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-12 17:36   ` Mark Rutland
  2016-02-13  1:47     ` David Daney
  2016-02-15 14:07     ` Jan Glauber
@ 2016-02-16  8:41     ` Jan Glauber
  2016-03-11 10:54     ` Jan Glauber
  3 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-02-16  8:41 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Fri, Feb 12, 2016 at 05:36:59PM +0000, Mark Rutland wrote:
> On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:
> > Provide uncore facilities for non-CPU performance counter units.
> > Based on Intel/AMD uncore pmu support.
> > 
> > The uncore PMUs can be found under /sys/bus/event_source/devices.
> > All counters are exported via sysfs in the corresponding events
> > files under the PMU directory so the perf tool can list the event names.
> 
> It turns out that "uncore" covers quite a lot of things.
> 
> Where exactly do these counters live? system, socket, cluster?
> 
> Are there potentially multiple instances of a given PMU in the system?
> e.g. might each cluster have an instance of an L2 PMU?

Thinking twice about multi-node systems, I would like to change
the implementation to not merge counters across nodes. There might
be value in having the counters per node, and performance-wise it
would also be better to merge counters only on the local node.

I'll introduce a node sysfs attribute that can be combined with
the event and fill the node with the numa id.

[...]

> 
> Otherwise, are they associated with some power domain?

No, power domains are not used for the "uncore" related devices.
These devices are currently always on.

Jan


* Re: [RFC PATCH 1/7] arm64/perf: Basic uncore counter support for Cavium ThunderX
  2016-02-12 17:36   ` Mark Rutland
                       ` (2 preceding siblings ...)
  2016-02-16  8:41     ` Jan Glauber
@ 2016-03-11 10:54     ` Jan Glauber
  3 siblings, 0 replies; 18+ messages in thread
From: Jan Glauber @ 2016-03-11 10:54 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Fri, Feb 12, 2016 at 05:36:59PM +0000, Mark Rutland wrote:
> On Fri, Feb 12, 2016 at 05:55:06PM +0100, Jan Glauber wrote:

[...]

> > +int thunder_uncore_event_init(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore *uncore;
> > +
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	/* we do not support sampling */
> > +	if (is_sampling_event(event))
> > +		return -EINVAL;
> > +
> > +	/* counters do not have these bits */
> > +	if (event->attr.exclude_user	||
> > +	    event->attr.exclude_kernel	||
> > +	    event->attr.exclude_host	||
> > +	    event->attr.exclude_guest	||
> > +	    event->attr.exclude_hv	||
> > +	    event->attr.exclude_idle)
> > +		return -EINVAL;
> 
> We should _really_ make these features opt-in at the core level. It's
> crazy that each and every PMU driver has to explicitly test and reject
> things it doesn't support.
> 

Looking at the exclude_* feature handling in pmu->event_init under
arch/ shows lots of differences. Just as an example, exclude_idle
sometimes returns -EINVAL, sometimes -EPERM, while other archs
ignore it and one silently clears the flag.

So it looks hard to move the exclude handling into the perf
core while keeping the per-arch differences. And if we drop those
differences and a newer kernel starts returning an error from the
perf_event_open syscall for an exclude bit that was previously
ignored, that would be a user-space regression, right?

Looking only at the uncore drivers (plus drivers/bus/arm-cc*),
none of them supports any exclude bits, but the handling differs
here as well. The table shows the current behaviour;
X means the requested exclude flag is refused.

                          user  kernel  host    guest    hv     idle
---------------------------------------------------------------------
x86 uncore intel        |  X      X                      X       X
x86 uncore intel snb    |  X      X      X       X       X       X
x86 uncore intel cqm    |  X      X      X       X       X       X
x86 uncore intel cstate |  X      X      X       X       X       X
x86 uncore intel rapl   |  X      X      X       X       X       X
x86 uncore amd          |  X      X      X       X
x86 uncore amd iommu    |  X      X      X       X
x86 uncore amd ibs      |  X      X      X       X       X       X
arm bus cci             |  X      X      X       X       X       X
arm bus ccn             |  X      X                      X       X
----------------------------------------------------------------------

How about simply adding a helper function to include/linux/perf_event.h
that checks if _any_ of the exclude bits is set? We could then
simplify the check-for-any exclude flag to:

if (any_exclude_set(event))
	return -EINVAL;

What's your opinion?

Jan

---

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f5c5a3f..0eacdba0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -849,6 +849,18 @@ static inline bool is_sampling_event(struct perf_event *event)
 	return event->attr.sample_period != 0;
 }
 
+static inline int any_exclude_set(struct perf_event *event)
+{
+	if (event->attr.exclude_user	||
+	    event->attr.exclude_kernel	||
+	    event->attr.exclude_host	||
+	    event->attr.exclude_guest	||
+	    event->attr.exclude_hv	||
+	    event->attr.exclude_idle)
+		return 1;
+	return 0;
+}
+
 /*
  * Return 1 for a software event, 0 for a hardware event
  */

