All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/5] Cavium ThunderX uncore PMU support
@ 2016-10-29 11:55 ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

As discussed, changed perf_sw_context -> perf_invalid_context.

Not changed:
- Stick to NUMA node ID to detect the socket a device belongs to but made
  uncore depend on CONFIG_NUMA.
- Stick to initcall for uncore framework because it is easier to do the
  scanning for the same type of PCI devices, also I don't know if the PCI layer
  would allow for several drivers to register for the same device ID.

Patches are against 4.9.0-rc2.

Changes to v3:
- use perf_invalid_context

Changes to v2:
- Embedded struct pmu and killed uncore->type
- Simplified add functions
- Unified functions where possible into a common implementation
- Use arrays to translate non-contiguous counter addresses to event_id's
  visible to the user
- Sorted includes
- Got rid of division for previous counter values
- Removed unneeded WARN_ONs
- Use sizeof(*ptr)
- Use bool for event_valid return
- Fixed HES_STOPPED logic
- Added some design notes and improved (hopefully) comments
- Removed pass1 counter support for now
- Merged EVENT_ATTR and EVENT_PTR defines into one (unreadable) thing
- Use pmu_enable|disable to start|stop the OCX TLK counter set
- Moved cpumask into thunder_uncore struct
- Switched to new cpuhp stuff. I still don't care about the CPU location
  used to access an uncore device, it may cross the CCPI and
  we'll pay a performance penalty. We might optimize this later, for now
  I feel it is not worth the time optimizing it.

--------------------------

Jan Glauber (5):
  arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  arm64: perf: Cavium ThunderX L2C TAD uncore support
  arm64: perf: Cavium ThunderX L2C CBC uncore support
  arm64: perf: Cavium ThunderX LMC uncore support
  arm64: perf: Cavium ThunderX OCX TLK uncore support

 drivers/perf/Kconfig                        |  13 +
 drivers/perf/Makefile                       |   1 +
 drivers/perf/uncore/Makefile                |   5 +
 drivers/perf/uncore/uncore_cavium.c         | 355 ++++++++++++++++++++++++++
 drivers/perf/uncore/uncore_cavium.h         |  75 ++++++
 drivers/perf/uncore/uncore_cavium_l2c_cbc.c | 148 +++++++++++
 drivers/perf/uncore/uncore_cavium_l2c_tad.c | 379 ++++++++++++++++++++++++++++
 drivers/perf/uncore/uncore_cavium_lmc.c     | 118 +++++++++
 drivers/perf/uncore/uncore_cavium_ocx_tlk.c | 344 +++++++++++++++++++++++++
 include/linux/cpuhotplug.h                  |   1 +
 10 files changed, 1439 insertions(+)
 create mode 100644 drivers/perf/uncore/Makefile
 create mode 100644 drivers/perf/uncore/uncore_cavium.c
 create mode 100644 drivers/perf/uncore/uncore_cavium.h
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_cbc.c
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_tad.c
 create mode 100644 drivers/perf/uncore/uncore_cavium_lmc.c
 create mode 100644 drivers/perf/uncore/uncore_cavium_ocx_tlk.c

-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 0/5] Cavium ThunderX uncore PMU support
@ 2016-10-29 11:55 ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

As discussed, changed perf_sw_context -> perf_invalid_context.

Not changed:
- Stick to NUMA node ID to detect the socket a device belongs to but made
  uncore depend on CONFIG_NUMA.
- Stick to initcall for uncore framework because it is easier to do the
  scanning for the same type of PCI devices, also I don't know if the PCI layer
  would allow for several drivers to register for the same device ID.

Patches are against 4.9.0-rc2.

Changes to v3:
- use perf_invalid_context

Changes to v2:
- Embedded struct pmu and killed uncore->type
- Simplified add functions
- Unified functions where possible into a common implementation
- Use arrays to translate non-contiguous counter addresses to event_id's
  visible to the user
- Sorted includes
- Got rid of division for previous counter values
- Removed unneeded WARN_ONs
- Use sizeof(*ptr)
- Use bool for event_valid return
- Fixed HES_STOPPED logic
- Added some design notes and improved (hopefully) comments
- Removed pass1 counter support for now
- Merged EVENT_ATTR and EVENT_PTR defines into one (unreadable) thing
- Use pmu_enable|disable to start|stop the OCX TLK counter set
- Moved cpumask into thunder_uncore struct
- Switched to new cpuhp stuff. I still don't care about the CPU location
  used to access an uncore device, it may cross the CCPI and
  we'll pay a performance penalty. We might optimize this later, for now
  I feel it is not worth the time optimizing it.

--------------------------

Jan Glauber (5):
  arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  arm64: perf: Cavium ThunderX L2C TAD uncore support
  arm64: perf: Cavium ThunderX L2C CBC uncore support
  arm64: perf: Cavium ThunderX LMC uncore support
  arm64: perf: Cavium ThunderX OCX TLK uncore support

 drivers/perf/Kconfig                        |  13 +
 drivers/perf/Makefile                       |   1 +
 drivers/perf/uncore/Makefile                |   5 +
 drivers/perf/uncore/uncore_cavium.c         | 355 ++++++++++++++++++++++++++
 drivers/perf/uncore/uncore_cavium.h         |  75 ++++++
 drivers/perf/uncore/uncore_cavium_l2c_cbc.c | 148 +++++++++++
 drivers/perf/uncore/uncore_cavium_l2c_tad.c | 379 ++++++++++++++++++++++++++++
 drivers/perf/uncore/uncore_cavium_lmc.c     | 118 +++++++++
 drivers/perf/uncore/uncore_cavium_ocx_tlk.c | 344 +++++++++++++++++++++++++
 include/linux/cpuhotplug.h                  |   1 +
 10 files changed, 1439 insertions(+)
 create mode 100644 drivers/perf/uncore/Makefile
 create mode 100644 drivers/perf/uncore/uncore_cavium.c
 create mode 100644 drivers/perf/uncore/uncore_cavium.h
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_cbc.c
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_tad.c
 create mode 100644 drivers/perf/uncore/uncore_cavium_lmc.c
 create mode 100644 drivers/perf/uncore/uncore_cavium_ocx_tlk.c

-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-10-29 11:55 ` Jan Glauber
@ 2016-10-29 11:55   ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Provide "uncore" facilities for different non-CPU performance
counter units.

The uncore PMUs can be found under /sys/bus/event_source/devices.
All counters are exported via sysfs in the corresponding events
files under the PMU directory so the perf tool can list the event names.

There are some points that are special in this implementation:

1) The PMU detection relies on PCI device detection. If a
   matching PCI device is found the PMU is created. The code can deal
   with multiple units of the same type, e.g. more than one memory
   controller.

2) Counters are summarized across different units of the same type
   on one NUMA node but not across NUMA nodes.
   For instance L2C TAD 0..7 are presented as a single counter
   (adding the values from TAD 0 to 7). Although losing the ability
   to read a single value the merged values are easier to use.

3) The counters are not CPU related. A random CPU is picked regardless
   of the NUMA node. There is a small performance penalty for accessing
   counters on a remote note but reading a performance counter is a
   slow operation anyway.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/Kconfig                |  13 ++
 drivers/perf/Makefile               |   1 +
 drivers/perf/uncore/Makefile        |   1 +
 drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
 drivers/perf/uncore/uncore_cavium.h |  71 ++++++++
 include/linux/cpuhotplug.h          |   1 +
 6 files changed, 438 insertions(+)
 create mode 100644 drivers/perf/uncore/Makefile
 create mode 100644 drivers/perf/uncore/uncore_cavium.c
 create mode 100644 drivers/perf/uncore/uncore_cavium.h

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 4d5c5f9..3266c87 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -19,4 +19,17 @@ config XGENE_PMU
         help
           Say y if you want to use APM X-Gene SoC performance monitors.
 
+config UNCORE_PMU
+	bool
+
+config UNCORE_PMU_CAVIUM
+	depends on PERF_EVENTS && NUMA && ARM64
+	bool "Cavium uncore PMU support"
+	select UNCORE_PMU
+	default y
+	help
+	  Say y if you want to access performance counters of subsystems
+	  on a Cavium SOC like cache controller, memory controller or
+	  processor interconnect.
+
 endmenu
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index b116e98..d6c02c9 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
+obj-y += uncore/
diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
new file mode 100644
index 0000000..6130e18
--- /dev/null
+++ b/drivers/perf/uncore/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
new file mode 100644
index 0000000..a7b4277
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -0,0 +1,351 @@
+/*
+ * Cavium Thunder uncore PMU support.
+ *
+ * Copyright (C) 2015,2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/cpufeature.h>
+#include <linux/numa.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+/*
+ * Some notes about the various counters supported by this "uncore" PMU
+ * and the design:
+ *
+ * All counters are 64 bit long.
+ * There are no overflow interrupts.
+ * Counters are summarized per node/socket.
+ * Most devices appear as separate PCI devices per socket with the exception
+ * of OCX TLK which appears as one PCI device per socket and contains several
+ * units with counters that are merged.
+ * Some counters are selected via a control register (L2C TAD) and read by
+ * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
+ * one dedicated counter per event.
+ * Some counters are not stoppable (L2C CBC & LMC).
+ * Some counters are read-only (LMC).
+ * All counters belong to PCI devices, the devices may have additional
+ * drivers but we assume we are the only user of the counter registers.
+ * We map the whole PCI BAR so we must be careful to forbid access to
+ * addresses that contain neither counters nor counter control registers.
+ */
+
+void thunder_uncore_read(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+	u64 prev, delta, new = 0;
+
+	node = get_node(hwc->config, uncore);
+
+	/* read counter values from all units on the node */
+	list_for_each_entry(unit, &node->unit_list, entry)
+		new += readq(hwc->event_base + unit->map);
+
+	prev = local64_read(&hwc->prev_count);
+	local64_set(&hwc->prev_count, new);
+	delta = new - prev;
+	local64_add(delta, &event->count);
+}
+
+int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
+		       u64 event_base)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	int id;
+
+	node = get_node(hwc->config, uncore);
+	id = get_id(hwc->config);
+
+	if (!cmpxchg(&node->events[id], NULL, event))
+		hwc->idx = id;
+
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = config_base;
+	hwc->event_base = event_base;
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		uncore->pmu.start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+void thunder_uncore_del(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	int i;
+
+	event->pmu->stop(event, PERF_EF_UPDATE);
+
+	/*
+	 * For programmable counters we need to check where we installed it.
+	 * To keep this function generic always test the more complicated
+	 * case (free running counters won't need the loop).
+	 */
+	node = get_node(hwc->config, uncore);
+	for (i = 0; i < node->num_counters; i++) {
+		if (cmpxchg(&node->events[i], event, NULL) == event)
+			break;
+	}
+	hwc->idx = -1;
+}
+
+void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+	u64 new = 0;
+
+	/* read counter values from all units on the node */
+	node = get_node(hwc->config, uncore);
+	list_for_each_entry(unit, &node->unit_list, entry)
+		new += readq(hwc->event_base + unit->map);
+	local64_set(&hwc->prev_count, new);
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+int thunder_uncore_event_init(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore *uncore;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* we do not support sampling */
+	if (is_sampling_event(event))
+		return -EINVAL;
+
+	/* counters do not have these bits */
+	if (event->attr.exclude_user	||
+	    event->attr.exclude_kernel	||
+	    event->attr.exclude_host	||
+	    event->attr.exclude_guest	||
+	    event->attr.exclude_hv	||
+	    event->attr.exclude_idle)
+		return -EINVAL;
+
+	uncore = to_uncore(event->pmu);
+	if (!uncore)
+		return -ENODEV;
+	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
+		return -EINVAL;
+
+	/* check NUMA node */
+	node = get_node(event->attr.config, uncore);
+	if (!node) {
+		pr_debug("Invalid NUMA node selected\n");
+		return -EINVAL;
+	}
+
+	hwc->config = event->attr.config;
+	hwc->idx = -1;
+	return 0;
+}
+
+static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
+						struct device_attribute *attr,
+						char *buf)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct thunder_uncore *uncore =
+		container_of(pmu, struct thunder_uncore, pmu);
+
+	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
+}
+static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
+
+static struct attribute *thunder_uncore_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+struct attribute_group thunder_uncore_attr_group = {
+	.attrs = thunder_uncore_attrs,
+};
+
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr =
+		container_of(attr, struct perf_pmu_events_attr, attr);
+
+	if (pmu_attr->event_str)
+		return sprintf(page, "%s", pmu_attr->event_str);
+
+	return 0;
+}
+
+/* node attribute depending on number of NUMA nodes */
+static ssize_t node_show(struct device *dev, struct device_attribute *attr,
+			 char *page)
+{
+	if (NODES_SHIFT)
+		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
+	else
+		return sprintf(page, "config:16\n");
+}
+
+struct device_attribute format_attr_node = __ATTR_RO(node);
+
+/*
+ * Thunder uncore events are independent from CPUs. Provide a cpumask
+ * nevertheless to prevent perf from adding the event per-cpu and just
+ * set the mask to one online CPU. Use the same cpumask for all uncore
+ * devices.
+ *
+ * There is a performance penalty for accessing a device from a CPU on
+ * another socket, but we do not care (yet).
+ */
+static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
+{
+	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);
+	int new_cpu;
+
+	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
+		return 0;
+	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
+	if (new_cpu >= nr_cpu_ids)
+		return 0;
+	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
+	cpumask_set_cpu(new_cpu, &uncore->active_mask);
+	return 0;
+}
+
+static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
+						      int node_id, int counters)
+{
+	struct thunder_uncore_node *node;
+
+	node = kzalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return NULL;
+	node->num_counters = counters;
+	INIT_LIST_HEAD(&node->unit_list);
+	return node;
+}
+
+int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
+				struct pmu *pmu, int counters)
+{
+	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
+	struct thunder_uncore_unit  *unit, *tmp;
+	struct thunder_uncore_node *node;
+	struct pci_dev *pdev = NULL;
+	int ret, node_id, found = 0;
+
+	/* detect PCI devices */
+	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {
+		if (!pdev)
+			break;
+
+		node_id = dev_to_node(&pdev->dev);
+
+		/* allocate node if necessary */
+		if (!uncore->nodes[node_id])
+			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
+
+		node = uncore->nodes[node_id];
+		if (!node) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+
+		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
+		if (!unit) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+
+		unit->pdev = pdev;
+		unit->map = ioremap(pci_resource_start(pdev, 0),
+				    pci_resource_len(pdev, 0));
+		list_add(&unit->entry, &node->unit_list);
+		node->nr_units++;
+		found++;
+	}
+
+	if (!found)
+		return -ENODEV;
+
+	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
+                                         &uncore->node);
+
+	/*
+	 * perf PMU is CPU dependent in difference to our uncore devices.
+	 * Just pick a CPU and migrate away if it goes offline.
+	 */
+	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
+
+	uncore->pmu = *pmu;
+	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
+	if (ret)
+		goto fail;
+
+	return 0;
+
+fail:
+	node_id = 0;
+	while (uncore->nodes[node_id]) {
+		node = uncore->nodes[node_id];
+
+		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
+			if (unit->pdev) {
+				if (unit->map)
+					iounmap(unit->map);
+				pci_dev_put(unit->pdev);
+			}
+			kfree(unit);
+		}
+		kfree(uncore->nodes[node_id]);
+		node_id++;
+	}
+	return ret;
+}
+
+static int __init thunder_uncore_init(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	int ret;
+
+	if (implementor != ARM_CPU_IMP_CAVIUM)
+		return -ENODEV;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
+				      "AP_PERF_UNCORE_CAVIUM_ONLINE", NULL,
+				      thunder_uncore_offline_cpu);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
new file mode 100644
index 0000000..b5d64b5
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -0,0 +1,71 @@
+#include <linux/io.h>
+#include <linux/list.h>
+#include <linux/pci.h>
+#include <linux/perf_event.h>
+
+#undef pr_fmt
+#define pr_fmt(fmt)     "thunderx_uncore: " fmt
+
+#define to_uncore(x) container_of((x), struct thunder_uncore, pmu)
+
+#define UNCORE_EVENT_ID_MASK		0xffff
+#define UNCORE_EVENT_ID_SHIFT		16
+
+/* maximum number of parallel hardware counters for all uncore parts */
+#define MAX_COUNTERS			64
+
+struct thunder_uncore_unit {
+	struct list_head entry;
+	void __iomem *map;
+	struct pci_dev *pdev;
+};
+
+struct thunder_uncore_node {
+	int nr_units;
+	int num_counters;
+	struct list_head unit_list;
+	struct perf_event *events[MAX_COUNTERS];
+};
+
+/* generic uncore struct for different pmu types */
+struct thunder_uncore {
+	struct pmu pmu;
+	bool (*event_valid)(u64);
+	struct hlist_node node;
+	struct thunder_uncore_node *nodes[MAX_NUMNODES];
+	cpumask_t active_mask;
+};
+
+#define UC_EVENT_ENTRY(_name, _id)							\
+	&((struct perf_pmu_events_attr[]) {						\
+		{									\
+			__ATTR(_name, S_IRUGO, thunder_events_sysfs_show, NULL),	\
+			0,								\
+			"event=" __stringify(_id),					\
+		}									\
+	})[0].attr.attr
+
+static inline struct thunder_uncore_node *get_node(u64 config,
+				   struct thunder_uncore *uncore)
+{
+	return uncore->nodes[config >> UNCORE_EVENT_ID_SHIFT];
+}
+
+#define get_id(config) (config & UNCORE_EVENT_ID_MASK)
+
+extern struct attribute_group thunder_uncore_attr_group;
+extern struct device_attribute format_attr_node;
+
+/* Prototypes */
+void thunder_uncore_read(struct perf_event *event);
+int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
+		       u64 event_base);
+void thunder_uncore_del(struct perf_event *event, int flags);
+void thunder_uncore_start(struct perf_event *event, int flags);
+void thunder_uncore_stop(struct perf_event *event, int flags);
+int thunder_uncore_event_init(struct perf_event *event);
+int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
+			 struct pmu *pmu, int counters);
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index afe641c..973f2bb 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -118,6 +118,7 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_ARM_CCI_ONLINE,
 	CPUHP_AP_PERF_ARM_CCN_ONLINE,
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
+	CPUHP_AP_UNCORE_CAVIUM_ONLINE,
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
 	CPUHP_AP_NOTIFY_ONLINE,
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-10-29 11:55   ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

Provide "uncore" facilities for different non-CPU performance
counter units.

The uncore PMUs can be found under /sys/bus/event_source/devices.
All counters are exported via sysfs in the corresponding events
files under the PMU directory so the perf tool can list the event names.

There are some points that are special in this implementation:

1) The PMU detection relies on PCI device detection. If a
   matching PCI device is found the PMU is created. The code can deal
   with multiple units of the same type, e.g. more than one memory
   controller.

2) Counters are summarized across different units of the same type
   on one NUMA node but not across NUMA nodes.
   For instance L2C TAD 0..7 are presented as a single counter
   (adding the values from TAD 0 to 7). Although losing the ability
   to read a single value the merged values are easier to use.

3) The counters are not CPU related. A random CPU is picked regardless
   of the NUMA node. There is a small performance penalty for accessing
   counters on a remote note but reading a performance counter is a
   slow operation anyway.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/Kconfig                |  13 ++
 drivers/perf/Makefile               |   1 +
 drivers/perf/uncore/Makefile        |   1 +
 drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
 drivers/perf/uncore/uncore_cavium.h |  71 ++++++++
 include/linux/cpuhotplug.h          |   1 +
 6 files changed, 438 insertions(+)
 create mode 100644 drivers/perf/uncore/Makefile
 create mode 100644 drivers/perf/uncore/uncore_cavium.c
 create mode 100644 drivers/perf/uncore/uncore_cavium.h

diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 4d5c5f9..3266c87 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -19,4 +19,17 @@ config XGENE_PMU
         help
           Say y if you want to use APM X-Gene SoC performance monitors.
 
+config UNCORE_PMU
+	bool
+
+config UNCORE_PMU_CAVIUM
+	depends on PERF_EVENTS && NUMA && ARM64
+	bool "Cavium uncore PMU support"
+	select UNCORE_PMU
+	default y
+	help
+	  Say y if you want to access performance counters of subsystems
+	  on a Cavium SOC like cache controller, memory controller or
+	  processor interconnect.
+
 endmenu
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index b116e98..d6c02c9 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
+obj-y += uncore/
diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
new file mode 100644
index 0000000..6130e18
--- /dev/null
+++ b/drivers/perf/uncore/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
new file mode 100644
index 0000000..a7b4277
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -0,0 +1,351 @@
+/*
+ * Cavium Thunder uncore PMU support.
+ *
+ * Copyright (C) 2015,2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/cpufeature.h>
+#include <linux/numa.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+/*
+ * Some notes about the various counters supported by this "uncore" PMU
+ * and the design:
+ *
+ * All counters are 64 bit long.
+ * There are no overflow interrupts.
+ * Counters are summarized per node/socket.
+ * Most devices appear as separate PCI devices per socket with the exception
+ * of OCX TLK which appears as one PCI device per socket and contains several
+ * units with counters that are merged.
+ * Some counters are selected via a control register (L2C TAD) and read by
+ * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
+ * one dedicated counter per event.
+ * Some counters are not stoppable (L2C CBC & LMC).
+ * Some counters are read-only (LMC).
+ * All counters belong to PCI devices, the devices may have additional
+ * drivers but we assume we are the only user of the counter registers.
+ * We map the whole PCI BAR so we must be careful to forbid access to
+ * addresses that contain neither counters nor counter control registers.
+ */
+
+void thunder_uncore_read(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+	u64 prev, delta, new = 0;
+
+	node = get_node(hwc->config, uncore);
+
+	/* read counter values from all units on the node */
+	list_for_each_entry(unit, &node->unit_list, entry)
+		new += readq(hwc->event_base + unit->map);
+
+	prev = local64_read(&hwc->prev_count);
+	local64_set(&hwc->prev_count, new);
+	delta = new - prev;
+	local64_add(delta, &event->count);
+}
+
+int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
+		       u64 event_base)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	int id;
+
+	node = get_node(hwc->config, uncore);
+	id = get_id(hwc->config);
+
+	if (!cmpxchg(&node->events[id], NULL, event))
+		hwc->idx = id;
+
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	hwc->config_base = config_base;
+	hwc->event_base = event_base;
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		uncore->pmu.start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+void thunder_uncore_del(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	int i;
+
+	event->pmu->stop(event, PERF_EF_UPDATE);
+
+	/*
+	 * For programmable counters we need to check where we installed it.
+	 * To keep this function generic always test the more complicated
+	 * case (free running counters won't need the loop).
+	 */
+	node = get_node(hwc->config, uncore);
+	for (i = 0; i < node->num_counters; i++) {
+		if (cmpxchg(&node->events[i], event, NULL) == event)
+			break;
+	}
+	hwc->idx = -1;
+}
+
+void thunder_uncore_start(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+	u64 new = 0;
+
+	/* read counter values from all units on the node */
+	node = get_node(hwc->config, uncore);
+	list_for_each_entry(unit, &node->unit_list, entry)
+		new += readq(hwc->event_base + unit->map);
+	local64_set(&hwc->prev_count, new);
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+void thunder_uncore_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+int thunder_uncore_event_init(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore *uncore;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* we do not support sampling */
+	if (is_sampling_event(event))
+		return -EINVAL;
+
+	/* counters do not have these bits */
+	if (event->attr.exclude_user	||
+	    event->attr.exclude_kernel	||
+	    event->attr.exclude_host	||
+	    event->attr.exclude_guest	||
+	    event->attr.exclude_hv	||
+	    event->attr.exclude_idle)
+		return -EINVAL;
+
+	uncore = to_uncore(event->pmu);
+	if (!uncore)
+		return -ENODEV;
+	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
+		return -EINVAL;
+
+	/* check NUMA node */
+	node = get_node(event->attr.config, uncore);
+	if (!node) {
+		pr_debug("Invalid NUMA node selected\n");
+		return -EINVAL;
+	}
+
+	hwc->config = event->attr.config;
+	hwc->idx = -1;
+	return 0;
+}
+
+static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
+						struct device_attribute *attr,
+						char *buf)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct thunder_uncore *uncore =
+		container_of(pmu, struct thunder_uncore, pmu);
+
+	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
+}
+static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
+
+static struct attribute *thunder_uncore_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+struct attribute_group thunder_uncore_attr_group = {
+	.attrs = thunder_uncore_attrs,
+};
+
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page)
+{
+	struct perf_pmu_events_attr *pmu_attr =
+		container_of(attr, struct perf_pmu_events_attr, attr);
+
+	if (pmu_attr->event_str)
+		return sprintf(page, "%s", pmu_attr->event_str);
+
+	return 0;
+}
+
+/* node attribute depending on number of NUMA nodes */
+static ssize_t node_show(struct device *dev, struct device_attribute *attr,
+			 char *page)
+{
+	if (NODES_SHIFT)
+		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
+	else
+		return sprintf(page, "config:16\n");
+}
+
+struct device_attribute format_attr_node = __ATTR_RO(node);
+
+/*
+ * Thunder uncore events are independent from CPUs. Provide a cpumask
+ * nevertheless to prevent perf from adding the event per-cpu and just
+ * set the mask to one online CPU. Use the same cpumask for all uncore
+ * devices.
+ *
+ * There is a performance penalty for accessing a device from a CPU on
+ * another socket, but we do not care (yet).
+ */
+static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
+{
+	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);
+	int new_cpu;
+
+	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
+		return 0;
+	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
+	if (new_cpu >= nr_cpu_ids)
+		return 0;
+	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
+	cpumask_set_cpu(new_cpu, &uncore->active_mask);
+	return 0;
+}
+
+static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
+						      int node_id, int counters)
+{
+	struct thunder_uncore_node *node;
+
+	node = kzalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return NULL;
+	node->num_counters = counters;
+	INIT_LIST_HEAD(&node->unit_list);
+	return node;
+}
+
+int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
+				struct pmu *pmu, int counters)
+{
+	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
+	struct thunder_uncore_unit  *unit, *tmp;
+	struct thunder_uncore_node *node;
+	struct pci_dev *pdev = NULL;
+	int ret, node_id, found = 0;
+
+	/* detect PCI devices */
+	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {
+		if (!pdev)
+			break;
+
+		node_id = dev_to_node(&pdev->dev);
+
+		/* allocate node if necessary */
+		if (!uncore->nodes[node_id])
+			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
+
+		node = uncore->nodes[node_id];
+		if (!node) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+
+		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
+		if (!unit) {
+			ret = -ENOMEM;
+			goto fail;
+		}
+
+		unit->pdev = pdev;
+		unit->map = ioremap(pci_resource_start(pdev, 0),
+				    pci_resource_len(pdev, 0));
+		list_add(&unit->entry, &node->unit_list);
+		node->nr_units++;
+		found++;
+	}
+
+	if (!found)
+		return -ENODEV;
+
+	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
+                                         &uncore->node);
+
+	/*
+	 * perf PMU is CPU dependent in difference to our uncore devices.
+	 * Just pick a CPU and migrate away if it goes offline.
+	 */
+	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
+
+	uncore->pmu = *pmu;
+	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
+	if (ret)
+		goto fail;
+
+	return 0;
+
+fail:
+	node_id = 0;
+	while (uncore->nodes[node_id]) {
+		node = uncore->nodes[node_id];
+
+		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
+			if (unit->pdev) {
+				if (unit->map)
+					iounmap(unit->map);
+				pci_dev_put(unit->pdev);
+			}
+			kfree(unit);
+		}
+		kfree(uncore->nodes[node_id]);
+		node_id++;
+	}
+	return ret;
+}
+
+static int __init thunder_uncore_init(void)
+{
+	unsigned long implementor = read_cpuid_implementor();
+	int ret;
+
+	if (implementor != ARM_CPU_IMP_CAVIUM)
+		return -ENODEV;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
+				      "AP_PERF_UNCORE_CAVIUM_ONLINE", NULL,
+				      thunder_uncore_offline_cpu);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
new file mode 100644
index 0000000..b5d64b5
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -0,0 +1,71 @@
+#include <linux/io.h>
+#include <linux/list.h>
+#include <linux/pci.h>
+#include <linux/perf_event.h>
+
+#undef pr_fmt
+#define pr_fmt(fmt)     "thunderx_uncore: " fmt
+
+#define to_uncore(x) container_of((x), struct thunder_uncore, pmu)
+
+#define UNCORE_EVENT_ID_MASK		0xffff
+#define UNCORE_EVENT_ID_SHIFT		16
+
+/* maximum number of parallel hardware counters for all uncore parts */
+#define MAX_COUNTERS			64
+
+struct thunder_uncore_unit {
+	struct list_head entry;
+	void __iomem *map;
+	struct pci_dev *pdev;
+};
+
+struct thunder_uncore_node {
+	int nr_units;
+	int num_counters;
+	struct list_head unit_list;
+	struct perf_event *events[MAX_COUNTERS];
+};
+
+/* generic uncore struct for different pmu types */
+struct thunder_uncore {
+	struct pmu pmu;
+	bool (*event_valid)(u64);
+	struct hlist_node node;
+	struct thunder_uncore_node *nodes[MAX_NUMNODES];
+	cpumask_t active_mask;
+};
+
+#define UC_EVENT_ENTRY(_name, _id)							\
+	&((struct perf_pmu_events_attr[]) {						\
+		{									\
+			__ATTR(_name, S_IRUGO, thunder_events_sysfs_show, NULL),	\
+			0,								\
+			"event=" __stringify(_id),					\
+		}									\
+	})[0].attr.attr
+
+static inline struct thunder_uncore_node *get_node(u64 config,
+				   struct thunder_uncore *uncore)
+{
+	return uncore->nodes[config >> UNCORE_EVENT_ID_SHIFT];
+}
+
+#define get_id(config) (config & UNCORE_EVENT_ID_MASK)
+
+extern struct attribute_group thunder_uncore_attr_group;
+extern struct device_attribute format_attr_node;
+
+/* Prototypes */
+void thunder_uncore_read(struct perf_event *event);
+int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
+		       u64 event_base);
+void thunder_uncore_del(struct perf_event *event, int flags);
+void thunder_uncore_start(struct perf_event *event, int flags);
+void thunder_uncore_stop(struct perf_event *event, int flags);
+int thunder_uncore_event_init(struct perf_event *event);
+int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
+			 struct pmu *pmu, int counters);
+ssize_t thunder_events_sysfs_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *page);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index afe641c..973f2bb 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -118,6 +118,7 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_ARM_CCI_ONLINE,
 	CPUHP_AP_PERF_ARM_CCN_ONLINE,
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
+	CPUHP_AP_UNCORE_CAVIUM_ONLINE,
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
 	CPUHP_AP_NOTIFY_ONLINE,
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 2/5] arm64: perf: Cavium ThunderX L2C TAD uncore support
  2016-10-29 11:55 ` Jan Glauber
@ 2016-10-29 11:55   ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support counters of the L2 Cache tag and data units.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile                |   3 +-
 drivers/perf/uncore/uncore_cavium.c         |   1 +
 drivers/perf/uncore/uncore_cavium.h         |   1 +
 drivers/perf/uncore/uncore_cavium_l2c_tad.c | 379 ++++++++++++++++++++++++++++
 4 files changed, 383 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_tad.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index 6130e18..90850a2 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1 +1,2 @@
-obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
+obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
+				   uncore_cavium_l2c_tad.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index a7b4277..15e1aec 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -346,6 +346,7 @@ static int __init thunder_uncore_init(void)
 	if (ret)
 		return ret;
 
+	thunder_uncore_l2c_tad_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index b5d64b5..70a8214 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -69,3 +69,4 @@ int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
 ssize_t thunder_events_sysfs_show(struct device *dev,
 				  struct device_attribute *attr,
 				  char *page);
+int thunder_uncore_l2c_tad_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_l2c_tad.c b/drivers/perf/uncore/uncore_cavium_l2c_tad.c
new file mode 100644
index 0000000..b97ba33
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_l2c_tad.c
@@ -0,0 +1,379 @@
+/*
+ * Cavium Thunder uncore PMU support,
+ * L2 Cache tag-and-data-units (L2C TAD) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_l2c_tad;
+
+#define L2C_TAD_NR_COUNTERS             4
+#define L2C_TAD_PRF_OFFSET		0x10000
+#define L2C_TAD_PFC_OFFSET		0x10100
+
+/*
+ * Counters are selected via L2C_TAD(x)_PRF:
+ *
+ *   63					    32
+ *   +---------------------------------------+
+ *   |  Reserved			     |
+ *   +---------------------------------------+
+ *   | CNT3SEL | CNT2SEL | CNT1SEL | CNT0SEL |
+ *   +---------------------------------------+
+ *   31       24	16	  8	     0
+ *
+ * config_base contains the offset of the selected CNTxSEL in the mapped BAR.
+ *
+ * Counters are read via L2C_TAD(x)_PFC(0..3).
+ * event_base contains the associated address to read the counter.
+ */
+
+/* L2C TAD event list */
+#define L2C_TAD_EVENTS_DISABLED			0x00
+#define L2C_TAD_EVENT_L2T_HIT			0x01
+#define L2C_TAD_EVENT_L2T_MISS			0x02
+#define L2C_TAD_EVENT_L2T_NOALLOC		0x03
+#define L2C_TAD_EVENT_L2_VIC			0x04
+#define L2C_TAD_EVENT_SC_FAIL			0x05
+#define L2C_TAD_EVENT_SC_PASS			0x06
+#define L2C_TAD_EVENT_LFB_OCC			0x07
+#define L2C_TAD_EVENT_WAIT_LFB			0x08
+#define L2C_TAD_EVENT_WAIT_VAB			0x09
+#define L2C_TAD_EVENT_OPEN_CCPI			0x0a
+#define L2C_TAD_EVENT_LOOKUP			0x40
+#define L2C_TAD_EVENT_LOOKUP_XMC_LCL		0x41
+#define L2C_TAD_EVENT_LOOKUP_XMC_RMT		0x42
+#define L2C_TAD_EVENT_LOOKUP_MIB		0x43
+#define L2C_TAD_EVENT_LOOKUP_ALL		0x44
+#define L2C_TAD_EVENT_TAG_ALC_HIT		0x48
+#define L2C_TAD_EVENT_TAG_ALC_MISS		0x49
+#define L2C_TAD_EVENT_TAG_ALC_NALC		0x4a
+#define L2C_TAD_EVENT_TAG_NALC_HIT		0x4b
+#define L2C_TAD_EVENT_TAG_NALC_MISS		0x4c
+#define L2C_TAD_EVENT_LMC_WR			0x4e
+#define L2C_TAD_EVENT_LMC_SBLKDTY		0x4f
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HIT		0x50
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HITE		0x51
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HITS		0x52
+#define L2C_TAD_EVENT_TAG_ALC_RTG_MISS		0x53
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HIT		0x54
+#define L2C_TAD_EVENT_TAG_NALC_RTG_MISS		0x55
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HITE		0x56
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HITS		0x57
+#define L2C_TAD_EVENT_TAG_ALC_LCL_EVICT		0x58
+#define L2C_TAD_EVENT_TAG_ALC_LCL_CLNVIC	0x59
+#define L2C_TAD_EVENT_TAG_ALC_LCL_DTYVIC	0x5a
+#define L2C_TAD_EVENT_TAG_ALC_RMT_EVICT		0x5b
+#define L2C_TAD_EVENT_TAG_ALC_RMT_VIC		0x5c
+#define L2C_TAD_EVENT_RTG_ALC			0x5d
+#define L2C_TAD_EVENT_RTG_ALC_HIT		0x5e
+#define L2C_TAD_EVENT_RTG_ALC_HITWB		0x5f
+#define L2C_TAD_EVENT_STC_TOTAL			0x60
+#define L2C_TAD_EVENT_STC_TOTAL_FAIL		0x61
+#define L2C_TAD_EVENT_STC_RMT			0x62
+#define L2C_TAD_EVENT_STC_RMT_FAIL		0x63
+#define L2C_TAD_EVENT_STC_LCL			0x64
+#define L2C_TAD_EVENT_STC_LCL_FAIL		0x65
+#define L2C_TAD_EVENT_OCI_RTG_WAIT		0x68
+#define L2C_TAD_EVENT_OCI_FWD_CYC_HIT		0x69
+#define L2C_TAD_EVENT_OCI_FWD_RACE		0x6a
+#define L2C_TAD_EVENT_OCI_HAKS			0x6b
+#define L2C_TAD_EVENT_OCI_FLDX_TAG_E_NODAT	0x6c
+#define L2C_TAD_EVENT_OCI_FLDX_TAG_E_DAT	0x6d
+#define L2C_TAD_EVENT_OCI_RLDD			0x6e
+#define L2C_TAD_EVENT_OCI_RLDD_PEMD		0x6f
+#define L2C_TAD_EVENT_OCI_RRQ_DAT_CNT		0x70
+#define L2C_TAD_EVENT_OCI_RRQ_DAT_DMASK		0x71
+#define L2C_TAD_EVENT_OCI_RSP_DAT_CNT		0x72
+#define L2C_TAD_EVENT_OCI_RSP_DAT_DMASK		0x73
+#define L2C_TAD_EVENT_OCI_RSP_DAT_VICD_CNT	0x74
+#define L2C_TAD_EVENT_OCI_RSP_DAT_VICD_DMASK	0x75
+#define L2C_TAD_EVENT_OCI_RTG_ALC_EVICT		0x76
+#define L2C_TAD_EVENT_OCI_RTG_ALC_VIC		0x77
+#define L2C_TAD_EVENT_QD0_IDX			0x80
+#define L2C_TAD_EVENT_QD0_RDAT			0x81
+#define L2C_TAD_EVENT_QD0_BNKS			0x82
+#define L2C_TAD_EVENT_QD0_WDAT			0x83
+#define L2C_TAD_EVENT_QD1_IDX			0x90
+#define L2C_TAD_EVENT_QD1_RDAT			0x91
+#define L2C_TAD_EVENT_QD1_BNKS			0x92
+#define L2C_TAD_EVENT_QD1_WDAT			0x93
+#define L2C_TAD_EVENT_QD2_IDX			0xa0
+#define L2C_TAD_EVENT_QD2_RDAT			0xa1
+#define L2C_TAD_EVENT_QD2_BNKS			0xa2
+#define L2C_TAD_EVENT_QD2_WDAT			0xa3
+#define L2C_TAD_EVENT_QD3_IDX			0xb0
+#define L2C_TAD_EVENT_QD3_RDAT			0xb1
+#define L2C_TAD_EVENT_QD3_BNKS			0xb2
+#define L2C_TAD_EVENT_QD3_WDAT			0xb3
+#define L2C_TAD_EVENT_QD4_IDX			0xc0
+#define L2C_TAD_EVENT_QD4_RDAT			0xc1
+#define L2C_TAD_EVENT_QD4_BNKS			0xc2
+#define L2C_TAD_EVENT_QD4_WDAT			0xc3
+#define L2C_TAD_EVENT_QD5_IDX			0xd0
+#define L2C_TAD_EVENT_QD5_RDAT			0xd1
+#define L2C_TAD_EVENT_QD5_BNKS			0xd2
+#define L2C_TAD_EVENT_QD5_WDAT			0xd3
+#define L2C_TAD_EVENT_QD6_IDX			0xe0
+#define L2C_TAD_EVENT_QD6_RDAT			0xe1
+#define L2C_TAD_EVENT_QD6_BNKS			0xe2
+#define L2C_TAD_EVENT_QD6_WDAT			0xe3
+#define L2C_TAD_EVENT_QD7_IDX			0xf0
+#define L2C_TAD_EVENT_QD7_RDAT			0xf1
+#define L2C_TAD_EVENT_QD7_BNKS			0xf2
+#define L2C_TAD_EVENT_QD7_WDAT			0xf3
+
+static void thunder_uncore_start_l2c_tad(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+	int id;
+
+	node = get_node(hwc->config, uncore);
+	id = get_id(hwc->config);
+
+	/* reset counter values to zero */
+	if (flags & PERF_EF_RELOAD)
+		list_for_each_entry(unit, &node->unit_list, entry)
+			writeq(0, hwc->event_base + unit->map);
+
+	/* start counters on all units on the node */
+	list_for_each_entry(unit, &node->unit_list, entry)
+		writeb(id, hwc->config_base + unit->map);
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop_l2c_tad(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+
+	node = get_node(hwc->config, uncore);
+
+	/* disable counters for all units on the node */
+	list_for_each_entry(unit, &node->unit_list, entry)
+		writeb(L2C_TAD_EVENTS_DISABLED, hwc->config_base + unit->map);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add_l2c_tad(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	int i;
+
+	node = get_node(hwc->config, uncore);
+
+	/* take the first available counter */
+	for (i = 0; i < node->num_counters; i++) {
+		if (!cmpxchg(&node->events[i], NULL, event)) {
+			hwc->idx = i;
+			break;
+		}
+	}
+
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	/* see comment at beginning of file */
+	hwc->config_base = L2C_TAD_PRF_OFFSET + hwc->idx;
+	hwc->event_base = L2C_TAD_PFC_OFFSET + hwc->idx * sizeof(u64);
+
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, PERF_EF_RELOAD);
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-7");
+
+static struct attribute *thunder_l2c_tad_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_tad_format_group = {
+	.name = "format",
+	.attrs = thunder_l2c_tad_format_attr,
+};
+
+static struct attribute *thunder_l2c_tad_events_attr[] = {
+	UC_EVENT_ENTRY(l2t_hit,			L2C_TAD_EVENT_L2T_HIT),
+	UC_EVENT_ENTRY(l2t_miss,		L2C_TAD_EVENT_L2T_MISS),
+	UC_EVENT_ENTRY(l2t_noalloc,		L2C_TAD_EVENT_L2T_NOALLOC),
+	UC_EVENT_ENTRY(l2_vic,			L2C_TAD_EVENT_L2_VIC),
+	UC_EVENT_ENTRY(sc_fail,			L2C_TAD_EVENT_SC_FAIL),
+	UC_EVENT_ENTRY(sc_pass,			L2C_TAD_EVENT_SC_PASS),
+	UC_EVENT_ENTRY(lfb_occ,			L2C_TAD_EVENT_LFB_OCC),
+	UC_EVENT_ENTRY(wait_lfb,		L2C_TAD_EVENT_WAIT_LFB),
+	UC_EVENT_ENTRY(wait_vab,		L2C_TAD_EVENT_WAIT_VAB),
+	UC_EVENT_ENTRY(open_ccpi,		L2C_TAD_EVENT_OPEN_CCPI),
+	UC_EVENT_ENTRY(lookup,			L2C_TAD_EVENT_LOOKUP),
+	UC_EVENT_ENTRY(lookup_xmc_lcl,		L2C_TAD_EVENT_LOOKUP_XMC_LCL),
+	UC_EVENT_ENTRY(lookup_xmc_rmt,		L2C_TAD_EVENT_LOOKUP_XMC_RMT),
+	UC_EVENT_ENTRY(lookup_mib,		L2C_TAD_EVENT_LOOKUP_MIB),
+	UC_EVENT_ENTRY(lookup_all,		L2C_TAD_EVENT_LOOKUP_ALL),
+	UC_EVENT_ENTRY(tag_alc_hit,		L2C_TAD_EVENT_TAG_ALC_HIT),
+	UC_EVENT_ENTRY(tag_alc_miss,		L2C_TAD_EVENT_TAG_ALC_MISS),
+	UC_EVENT_ENTRY(tag_alc_nalc,		L2C_TAD_EVENT_TAG_ALC_NALC),
+	UC_EVENT_ENTRY(tag_nalc_hit,		L2C_TAD_EVENT_TAG_NALC_HIT),
+	UC_EVENT_ENTRY(tag_nalc_miss,		L2C_TAD_EVENT_TAG_NALC_MISS),
+	UC_EVENT_ENTRY(lmc_wr,			L2C_TAD_EVENT_LMC_WR),
+	UC_EVENT_ENTRY(lmc_sblkdty,		L2C_TAD_EVENT_LMC_SBLKDTY),
+	UC_EVENT_ENTRY(tag_alc_rtg_hit,		L2C_TAD_EVENT_TAG_ALC_RTG_HIT),
+	UC_EVENT_ENTRY(tag_alc_rtg_hite,	L2C_TAD_EVENT_TAG_ALC_RTG_HITE),
+	UC_EVENT_ENTRY(tag_alc_rtg_hits,	L2C_TAD_EVENT_TAG_ALC_RTG_HITS),
+	UC_EVENT_ENTRY(tag_alc_rtg_miss,	L2C_TAD_EVENT_TAG_ALC_RTG_MISS),
+	UC_EVENT_ENTRY(tag_alc_nalc_rtg_hit,	L2C_TAD_EVENT_TAG_NALC_RTG_HIT),
+	UC_EVENT_ENTRY(tag_nalc_rtg_miss,	L2C_TAD_EVENT_TAG_NALC_RTG_MISS),
+	UC_EVENT_ENTRY(tag_nalc_rtg_hite,	L2C_TAD_EVENT_TAG_NALC_RTG_HITE),
+	UC_EVENT_ENTRY(tag_nalc_rtg_hits,	L2C_TAD_EVENT_TAG_NALC_RTG_HITS),
+	UC_EVENT_ENTRY(tag_alc_lcl_evict,	L2C_TAD_EVENT_TAG_ALC_LCL_EVICT),
+	UC_EVENT_ENTRY(tag_alc_lcl_clnvic,	L2C_TAD_EVENT_TAG_ALC_LCL_CLNVIC),
+	UC_EVENT_ENTRY(tag_alc_lcl_dtyvic,	L2C_TAD_EVENT_TAG_ALC_LCL_DTYVIC),
+	UC_EVENT_ENTRY(tag_alc_rmt_evict,	L2C_TAD_EVENT_TAG_ALC_RMT_EVICT),
+	UC_EVENT_ENTRY(tag_alc_rmt_vic,		L2C_TAD_EVENT_TAG_ALC_RMT_VIC),
+	UC_EVENT_ENTRY(rtg_alc,			L2C_TAD_EVENT_RTG_ALC),
+	UC_EVENT_ENTRY(rtg_alc_hit,		L2C_TAD_EVENT_RTG_ALC_HIT),
+	UC_EVENT_ENTRY(rtg_alc_hitwb,		L2C_TAD_EVENT_RTG_ALC_HITWB),
+	UC_EVENT_ENTRY(stc_total,		L2C_TAD_EVENT_STC_TOTAL),
+	UC_EVENT_ENTRY(stc_total_fail,		L2C_TAD_EVENT_STC_TOTAL_FAIL),
+	UC_EVENT_ENTRY(stc_rmt,			L2C_TAD_EVENT_STC_RMT),
+	UC_EVENT_ENTRY(stc_rmt_fail,		L2C_TAD_EVENT_STC_RMT_FAIL),
+	UC_EVENT_ENTRY(stc_lcl,			L2C_TAD_EVENT_STC_LCL),
+	UC_EVENT_ENTRY(stc_lcl_fail,		L2C_TAD_EVENT_STC_LCL_FAIL),
+	UC_EVENT_ENTRY(oci_rtg_wait,		L2C_TAD_EVENT_OCI_RTG_WAIT),
+	UC_EVENT_ENTRY(oci_fwd_cyc_hit,		L2C_TAD_EVENT_OCI_FWD_CYC_HIT),
+	UC_EVENT_ENTRY(oci_fwd_race,		L2C_TAD_EVENT_OCI_FWD_RACE),
+	UC_EVENT_ENTRY(oci_haks,		L2C_TAD_EVENT_OCI_HAKS),
+	UC_EVENT_ENTRY(oci_fldx_tag_e_nodat,	L2C_TAD_EVENT_OCI_FLDX_TAG_E_NODAT),
+	UC_EVENT_ENTRY(oci_fldx_tag_e_dat,	L2C_TAD_EVENT_OCI_FLDX_TAG_E_DAT),
+	UC_EVENT_ENTRY(oci_rldd,		L2C_TAD_EVENT_OCI_RLDD),
+	UC_EVENT_ENTRY(oci_rldd_pemd,		L2C_TAD_EVENT_OCI_RLDD_PEMD),
+	UC_EVENT_ENTRY(oci_rrq_dat_cnt,		L2C_TAD_EVENT_OCI_RRQ_DAT_CNT),
+	UC_EVENT_ENTRY(oci_rrq_dat_dmask,	L2C_TAD_EVENT_OCI_RRQ_DAT_DMASK),
+	UC_EVENT_ENTRY(oci_rsp_dat_cnt,		L2C_TAD_EVENT_OCI_RSP_DAT_CNT),
+	UC_EVENT_ENTRY(oci_rsp_dat_dmaks,	L2C_TAD_EVENT_OCI_RSP_DAT_DMASK),
+	UC_EVENT_ENTRY(oci_rsp_dat_vicd_cnt,	L2C_TAD_EVENT_OCI_RSP_DAT_VICD_CNT),
+	UC_EVENT_ENTRY(oci_rsp_dat_vicd_dmask,	L2C_TAD_EVENT_OCI_RSP_DAT_VICD_DMASK),
+	UC_EVENT_ENTRY(oci_rtg_alc_evict,	L2C_TAD_EVENT_OCI_RTG_ALC_EVICT),
+	UC_EVENT_ENTRY(oci_rtg_alc_vic,		L2C_TAD_EVENT_OCI_RTG_ALC_VIC),
+	UC_EVENT_ENTRY(qd0_idx,			L2C_TAD_EVENT_QD0_IDX),
+	UC_EVENT_ENTRY(qd0_rdat,		L2C_TAD_EVENT_QD0_RDAT),
+	UC_EVENT_ENTRY(qd0_bnks,		L2C_TAD_EVENT_QD0_BNKS),
+	UC_EVENT_ENTRY(qd0_wdat,		L2C_TAD_EVENT_QD0_WDAT),
+	UC_EVENT_ENTRY(qd1_idx,			L2C_TAD_EVENT_QD1_IDX),
+	UC_EVENT_ENTRY(qd1_rdat,		L2C_TAD_EVENT_QD1_RDAT),
+	UC_EVENT_ENTRY(qd1_bnks,		L2C_TAD_EVENT_QD1_BNKS),
+	UC_EVENT_ENTRY(qd1_wdat,		L2C_TAD_EVENT_QD1_WDAT),
+	UC_EVENT_ENTRY(qd2_idx,			L2C_TAD_EVENT_QD2_IDX),
+	UC_EVENT_ENTRY(qd2_rdat,		L2C_TAD_EVENT_QD2_RDAT),
+	UC_EVENT_ENTRY(qd2_bnks,		L2C_TAD_EVENT_QD2_BNKS),
+	UC_EVENT_ENTRY(qd2_wdat,		L2C_TAD_EVENT_QD2_WDAT),
+	UC_EVENT_ENTRY(qd3_idx,			L2C_TAD_EVENT_QD3_IDX),
+	UC_EVENT_ENTRY(qd3_rdat,		L2C_TAD_EVENT_QD3_RDAT),
+	UC_EVENT_ENTRY(qd3_bnks,		L2C_TAD_EVENT_QD3_BNKS),
+	UC_EVENT_ENTRY(qd3_wdat,		L2C_TAD_EVENT_QD3_WDAT),
+	UC_EVENT_ENTRY(qd4_idx,			L2C_TAD_EVENT_QD4_IDX),
+	UC_EVENT_ENTRY(qd4_rdat,		L2C_TAD_EVENT_QD4_RDAT),
+	UC_EVENT_ENTRY(qd4_bnks,		L2C_TAD_EVENT_QD4_BNKS),
+	UC_EVENT_ENTRY(qd4_wdat,		L2C_TAD_EVENT_QD4_WDAT),
+	UC_EVENT_ENTRY(qd5_idx,			L2C_TAD_EVENT_QD5_IDX),
+	UC_EVENT_ENTRY(qd5_rdat,		L2C_TAD_EVENT_QD5_RDAT),
+	UC_EVENT_ENTRY(qd5_bnks,		L2C_TAD_EVENT_QD5_BNKS),
+	UC_EVENT_ENTRY(qd5_wdat,		L2C_TAD_EVENT_QD5_WDAT),
+	UC_EVENT_ENTRY(qd6_idx,			L2C_TAD_EVENT_QD6_IDX),
+	UC_EVENT_ENTRY(qd6_rdat,		L2C_TAD_EVENT_QD6_RDAT),
+	UC_EVENT_ENTRY(qd6_bnks,		L2C_TAD_EVENT_QD6_BNKS),
+	UC_EVENT_ENTRY(qd6_wdat,		L2C_TAD_EVENT_QD6_WDAT),
+	UC_EVENT_ENTRY(qd7_idx,			L2C_TAD_EVENT_QD7_IDX),
+	UC_EVENT_ENTRY(qd7_rdat,		L2C_TAD_EVENT_QD7_RDAT),
+	UC_EVENT_ENTRY(qd7_bnks,		L2C_TAD_EVENT_QD7_BNKS),
+	UC_EVENT_ENTRY(qd7_wdat,		L2C_TAD_EVENT_QD7_WDAT),
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_tad_events_group = {
+	.name = "events",
+	.attrs = thunder_l2c_tad_events_attr,
+};
+
+static const struct attribute_group *thunder_l2c_tad_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_l2c_tad_format_group,
+	&thunder_l2c_tad_events_group,
+	NULL,
+};
+
+struct pmu thunder_l2c_tad_pmu = {
+	.name		= "thunder_l2c_tad",
+	.task_ctx_nr    = perf_invalid_context,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_l2c_tad,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start_l2c_tad,
+	.stop		= thunder_uncore_stop_l2c_tad,
+	.read		= thunder_uncore_read,
+	.attr_groups	= thunder_l2c_tad_attr_groups,
+};
+
+static bool event_valid(u64 c)
+{
+	if ((c > 0 &&
+	     c <= L2C_TAD_EVENT_OPEN_CCPI) ||
+	    (c >= L2C_TAD_EVENT_LOOKUP &&
+	     c <= L2C_TAD_EVENT_LOOKUP_ALL) ||
+	    (c >= L2C_TAD_EVENT_TAG_ALC_HIT &&
+	     c <= L2C_TAD_EVENT_TAG_NALC_MISS) ||
+	    (c >= L2C_TAD_EVENT_LMC_WR &&
+	     c <= L2C_TAD_EVENT_STC_LCL_FAIL) ||
+	    (c >= L2C_TAD_EVENT_OCI_RTG_WAIT &&
+	     c <= L2C_TAD_EVENT_OCI_RTG_ALC_VIC) ||
+	    /* L2C_TAD_EVENT_QD[0..7] IDX,RDAT,BNKS,WDAT => 0x80 .. 0xf3 */
+	    ((c & 0x80) && ((c & 0xf) <= 3)))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_l2c_tad_setup(void)
+{
+	int ret = -ENOMEM;
+
+	thunder_uncore_l2c_tad = kzalloc(sizeof(*thunder_uncore_l2c_tad),
+					 GFP_KERNEL);
+	if (!thunder_uncore_l2c_tad)
+		goto fail_nomem;
+
+	ret = thunder_uncore_setup(thunder_uncore_l2c_tad, 0xa02e,
+				   &thunder_l2c_tad_pmu, L2C_TAD_NR_COUNTERS);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_l2c_tad->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_l2c_tad);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 2/5] arm64: perf: Cavium ThunderX L2C TAD uncore support
@ 2016-10-29 11:55   ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

Support counters of the L2 Cache tag and data units.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile                |   3 +-
 drivers/perf/uncore/uncore_cavium.c         |   1 +
 drivers/perf/uncore/uncore_cavium.h         |   1 +
 drivers/perf/uncore/uncore_cavium_l2c_tad.c | 379 ++++++++++++++++++++++++++++
 4 files changed, 383 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_tad.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index 6130e18..90850a2 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1 +1,2 @@
-obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
+obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
+				   uncore_cavium_l2c_tad.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index a7b4277..15e1aec 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -346,6 +346,7 @@ static int __init thunder_uncore_init(void)
 	if (ret)
 		return ret;
 
+	thunder_uncore_l2c_tad_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index b5d64b5..70a8214 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -69,3 +69,4 @@ int thunder_uncore_setup(struct thunder_uncore *uncore, int id,
 ssize_t thunder_events_sysfs_show(struct device *dev,
 				  struct device_attribute *attr,
 				  char *page);
+int thunder_uncore_l2c_tad_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_l2c_tad.c b/drivers/perf/uncore/uncore_cavium_l2c_tad.c
new file mode 100644
index 0000000..b97ba33
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_l2c_tad.c
@@ -0,0 +1,379 @@
+/*
+ * Cavium Thunder uncore PMU support,
+ * L2 Cache tag-and-data-units (L2C TAD) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_l2c_tad;
+
+#define L2C_TAD_NR_COUNTERS             4
+#define L2C_TAD_PRF_OFFSET		0x10000
+#define L2C_TAD_PFC_OFFSET		0x10100
+
+/*
+ * Counters are selected via L2C_TAD(x)_PRF:
+ *
+ *   63					    32
+ *   +---------------------------------------+
+ *   |  Reserved			     |
+ *   +---------------------------------------+
+ *   | CNT3SEL | CNT2SEL | CNT1SEL | CNT0SEL |
+ *   +---------------------------------------+
+ *   31       24	16	  8	     0
+ *
+ * config_base contains the offset of the selected CNTxSEL in the mapped BAR.
+ *
+ * Counters are read via L2C_TAD(x)_PFC(0..3).
+ * event_base contains the associated address to read the counter.
+ */
+
+/* L2C TAD event list */
+#define L2C_TAD_EVENTS_DISABLED			0x00
+#define L2C_TAD_EVENT_L2T_HIT			0x01
+#define L2C_TAD_EVENT_L2T_MISS			0x02
+#define L2C_TAD_EVENT_L2T_NOALLOC		0x03
+#define L2C_TAD_EVENT_L2_VIC			0x04
+#define L2C_TAD_EVENT_SC_FAIL			0x05
+#define L2C_TAD_EVENT_SC_PASS			0x06
+#define L2C_TAD_EVENT_LFB_OCC			0x07
+#define L2C_TAD_EVENT_WAIT_LFB			0x08
+#define L2C_TAD_EVENT_WAIT_VAB			0x09
+#define L2C_TAD_EVENT_OPEN_CCPI			0x0a
+#define L2C_TAD_EVENT_LOOKUP			0x40
+#define L2C_TAD_EVENT_LOOKUP_XMC_LCL		0x41
+#define L2C_TAD_EVENT_LOOKUP_XMC_RMT		0x42
+#define L2C_TAD_EVENT_LOOKUP_MIB		0x43
+#define L2C_TAD_EVENT_LOOKUP_ALL		0x44
+#define L2C_TAD_EVENT_TAG_ALC_HIT		0x48
+#define L2C_TAD_EVENT_TAG_ALC_MISS		0x49
+#define L2C_TAD_EVENT_TAG_ALC_NALC		0x4a
+#define L2C_TAD_EVENT_TAG_NALC_HIT		0x4b
+#define L2C_TAD_EVENT_TAG_NALC_MISS		0x4c
+#define L2C_TAD_EVENT_LMC_WR			0x4e
+#define L2C_TAD_EVENT_LMC_SBLKDTY		0x4f
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HIT		0x50
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HITE		0x51
+#define L2C_TAD_EVENT_TAG_ALC_RTG_HITS		0x52
+#define L2C_TAD_EVENT_TAG_ALC_RTG_MISS		0x53
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HIT		0x54
+#define L2C_TAD_EVENT_TAG_NALC_RTG_MISS		0x55
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HITE		0x56
+#define L2C_TAD_EVENT_TAG_NALC_RTG_HITS		0x57
+#define L2C_TAD_EVENT_TAG_ALC_LCL_EVICT		0x58
+#define L2C_TAD_EVENT_TAG_ALC_LCL_CLNVIC	0x59
+#define L2C_TAD_EVENT_TAG_ALC_LCL_DTYVIC	0x5a
+#define L2C_TAD_EVENT_TAG_ALC_RMT_EVICT		0x5b
+#define L2C_TAD_EVENT_TAG_ALC_RMT_VIC		0x5c
+#define L2C_TAD_EVENT_RTG_ALC			0x5d
+#define L2C_TAD_EVENT_RTG_ALC_HIT		0x5e
+#define L2C_TAD_EVENT_RTG_ALC_HITWB		0x5f
+#define L2C_TAD_EVENT_STC_TOTAL			0x60
+#define L2C_TAD_EVENT_STC_TOTAL_FAIL		0x61
+#define L2C_TAD_EVENT_STC_RMT			0x62
+#define L2C_TAD_EVENT_STC_RMT_FAIL		0x63
+#define L2C_TAD_EVENT_STC_LCL			0x64
+#define L2C_TAD_EVENT_STC_LCL_FAIL		0x65
+#define L2C_TAD_EVENT_OCI_RTG_WAIT		0x68
+#define L2C_TAD_EVENT_OCI_FWD_CYC_HIT		0x69
+#define L2C_TAD_EVENT_OCI_FWD_RACE		0x6a
+#define L2C_TAD_EVENT_OCI_HAKS			0x6b
+#define L2C_TAD_EVENT_OCI_FLDX_TAG_E_NODAT	0x6c
+#define L2C_TAD_EVENT_OCI_FLDX_TAG_E_DAT	0x6d
+#define L2C_TAD_EVENT_OCI_RLDD			0x6e
+#define L2C_TAD_EVENT_OCI_RLDD_PEMD		0x6f
+#define L2C_TAD_EVENT_OCI_RRQ_DAT_CNT		0x70
+#define L2C_TAD_EVENT_OCI_RRQ_DAT_DMASK		0x71
+#define L2C_TAD_EVENT_OCI_RSP_DAT_CNT		0x72
+#define L2C_TAD_EVENT_OCI_RSP_DAT_DMASK		0x73
+#define L2C_TAD_EVENT_OCI_RSP_DAT_VICD_CNT	0x74
+#define L2C_TAD_EVENT_OCI_RSP_DAT_VICD_DMASK	0x75
+#define L2C_TAD_EVENT_OCI_RTG_ALC_EVICT		0x76
+#define L2C_TAD_EVENT_OCI_RTG_ALC_VIC		0x77
+#define L2C_TAD_EVENT_QD0_IDX			0x80
+#define L2C_TAD_EVENT_QD0_RDAT			0x81
+#define L2C_TAD_EVENT_QD0_BNKS			0x82
+#define L2C_TAD_EVENT_QD0_WDAT			0x83
+#define L2C_TAD_EVENT_QD1_IDX			0x90
+#define L2C_TAD_EVENT_QD1_RDAT			0x91
+#define L2C_TAD_EVENT_QD1_BNKS			0x92
+#define L2C_TAD_EVENT_QD1_WDAT			0x93
+#define L2C_TAD_EVENT_QD2_IDX			0xa0
+#define L2C_TAD_EVENT_QD2_RDAT			0xa1
+#define L2C_TAD_EVENT_QD2_BNKS			0xa2
+#define L2C_TAD_EVENT_QD2_WDAT			0xa3
+#define L2C_TAD_EVENT_QD3_IDX			0xb0
+#define L2C_TAD_EVENT_QD3_RDAT			0xb1
+#define L2C_TAD_EVENT_QD3_BNKS			0xb2
+#define L2C_TAD_EVENT_QD3_WDAT			0xb3
+#define L2C_TAD_EVENT_QD4_IDX			0xc0
+#define L2C_TAD_EVENT_QD4_RDAT			0xc1
+#define L2C_TAD_EVENT_QD4_BNKS			0xc2
+#define L2C_TAD_EVENT_QD4_WDAT			0xc3
+#define L2C_TAD_EVENT_QD5_IDX			0xd0
+#define L2C_TAD_EVENT_QD5_RDAT			0xd1
+#define L2C_TAD_EVENT_QD5_BNKS			0xd2
+#define L2C_TAD_EVENT_QD5_WDAT			0xd3
+#define L2C_TAD_EVENT_QD6_IDX			0xe0
+#define L2C_TAD_EVENT_QD6_RDAT			0xe1
+#define L2C_TAD_EVENT_QD6_BNKS			0xe2
+#define L2C_TAD_EVENT_QD6_WDAT			0xe3
+#define L2C_TAD_EVENT_QD7_IDX			0xf0
+#define L2C_TAD_EVENT_QD7_RDAT			0xf1
+#define L2C_TAD_EVENT_QD7_BNKS			0xf2
+#define L2C_TAD_EVENT_QD7_WDAT			0xf3
+
+static void thunder_uncore_start_l2c_tad(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+	int id;
+
+	node = get_node(hwc->config, uncore);
+	id = get_id(hwc->config);
+
+	/* reset counter values to zero */
+	if (flags & PERF_EF_RELOAD)
+		list_for_each_entry(unit, &node->unit_list, entry)
+			writeq(0, hwc->event_base + unit->map);
+
+	/* start counters on all units on the node */
+	list_for_each_entry(unit, &node->unit_list, entry)
+		writeb(id, hwc->config_base + unit->map);
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+static void thunder_uncore_stop_l2c_tad(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	struct thunder_uncore_unit *unit;
+
+	node = get_node(hwc->config, uncore);
+
+	/* disable counters for all units on the node */
+	list_for_each_entry(unit, &node->unit_list, entry)
+		writeb(L2C_TAD_EVENTS_DISABLED, hwc->config_base + unit->map);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		thunder_uncore_read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+static int thunder_uncore_add_l2c_tad(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	int i;
+
+	node = get_node(hwc->config, uncore);
+
+	/* take the first available counter */
+	for (i = 0; i < node->num_counters; i++) {
+		if (!cmpxchg(&node->events[i], NULL, event)) {
+			hwc->idx = i;
+			break;
+		}
+	}
+
+	if (hwc->idx == -1)
+		return -EBUSY;
+
+	/* see comment at beginning of file */
+	hwc->config_base = L2C_TAD_PRF_OFFSET + hwc->idx;
+	hwc->event_base = L2C_TAD_PFC_OFFSET + hwc->idx * sizeof(u64);
+
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+	if (flags & PERF_EF_START)
+		thunder_uncore_start(event, PERF_EF_RELOAD);
+	return 0;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-7");
+
+static struct attribute *thunder_l2c_tad_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_tad_format_group = {
+	.name = "format",
+	.attrs = thunder_l2c_tad_format_attr,
+};
+
+static struct attribute *thunder_l2c_tad_events_attr[] = {
+	UC_EVENT_ENTRY(l2t_hit,			L2C_TAD_EVENT_L2T_HIT),
+	UC_EVENT_ENTRY(l2t_miss,		L2C_TAD_EVENT_L2T_MISS),
+	UC_EVENT_ENTRY(l2t_noalloc,		L2C_TAD_EVENT_L2T_NOALLOC),
+	UC_EVENT_ENTRY(l2_vic,			L2C_TAD_EVENT_L2_VIC),
+	UC_EVENT_ENTRY(sc_fail,			L2C_TAD_EVENT_SC_FAIL),
+	UC_EVENT_ENTRY(sc_pass,			L2C_TAD_EVENT_SC_PASS),
+	UC_EVENT_ENTRY(lfb_occ,			L2C_TAD_EVENT_LFB_OCC),
+	UC_EVENT_ENTRY(wait_lfb,		L2C_TAD_EVENT_WAIT_LFB),
+	UC_EVENT_ENTRY(wait_vab,		L2C_TAD_EVENT_WAIT_VAB),
+	UC_EVENT_ENTRY(open_ccpi,		L2C_TAD_EVENT_OPEN_CCPI),
+	UC_EVENT_ENTRY(lookup,			L2C_TAD_EVENT_LOOKUP),
+	UC_EVENT_ENTRY(lookup_xmc_lcl,		L2C_TAD_EVENT_LOOKUP_XMC_LCL),
+	UC_EVENT_ENTRY(lookup_xmc_rmt,		L2C_TAD_EVENT_LOOKUP_XMC_RMT),
+	UC_EVENT_ENTRY(lookup_mib,		L2C_TAD_EVENT_LOOKUP_MIB),
+	UC_EVENT_ENTRY(lookup_all,		L2C_TAD_EVENT_LOOKUP_ALL),
+	UC_EVENT_ENTRY(tag_alc_hit,		L2C_TAD_EVENT_TAG_ALC_HIT),
+	UC_EVENT_ENTRY(tag_alc_miss,		L2C_TAD_EVENT_TAG_ALC_MISS),
+	UC_EVENT_ENTRY(tag_alc_nalc,		L2C_TAD_EVENT_TAG_ALC_NALC),
+	UC_EVENT_ENTRY(tag_nalc_hit,		L2C_TAD_EVENT_TAG_NALC_HIT),
+	UC_EVENT_ENTRY(tag_nalc_miss,		L2C_TAD_EVENT_TAG_NALC_MISS),
+	UC_EVENT_ENTRY(lmc_wr,			L2C_TAD_EVENT_LMC_WR),
+	UC_EVENT_ENTRY(lmc_sblkdty,		L2C_TAD_EVENT_LMC_SBLKDTY),
+	UC_EVENT_ENTRY(tag_alc_rtg_hit,		L2C_TAD_EVENT_TAG_ALC_RTG_HIT),
+	UC_EVENT_ENTRY(tag_alc_rtg_hite,	L2C_TAD_EVENT_TAG_ALC_RTG_HITE),
+	UC_EVENT_ENTRY(tag_alc_rtg_hits,	L2C_TAD_EVENT_TAG_ALC_RTG_HITS),
+	UC_EVENT_ENTRY(tag_alc_rtg_miss,	L2C_TAD_EVENT_TAG_ALC_RTG_MISS),
+	UC_EVENT_ENTRY(tag_alc_nalc_rtg_hit,	L2C_TAD_EVENT_TAG_NALC_RTG_HIT),
+	UC_EVENT_ENTRY(tag_nalc_rtg_miss,	L2C_TAD_EVENT_TAG_NALC_RTG_MISS),
+	UC_EVENT_ENTRY(tag_nalc_rtg_hite,	L2C_TAD_EVENT_TAG_NALC_RTG_HITE),
+	UC_EVENT_ENTRY(tag_nalc_rtg_hits,	L2C_TAD_EVENT_TAG_NALC_RTG_HITS),
+	UC_EVENT_ENTRY(tag_alc_lcl_evict,	L2C_TAD_EVENT_TAG_ALC_LCL_EVICT),
+	UC_EVENT_ENTRY(tag_alc_lcl_clnvic,	L2C_TAD_EVENT_TAG_ALC_LCL_CLNVIC),
+	UC_EVENT_ENTRY(tag_alc_lcl_dtyvic,	L2C_TAD_EVENT_TAG_ALC_LCL_DTYVIC),
+	UC_EVENT_ENTRY(tag_alc_rmt_evict,	L2C_TAD_EVENT_TAG_ALC_RMT_EVICT),
+	UC_EVENT_ENTRY(tag_alc_rmt_vic,		L2C_TAD_EVENT_TAG_ALC_RMT_VIC),
+	UC_EVENT_ENTRY(rtg_alc,			L2C_TAD_EVENT_RTG_ALC),
+	UC_EVENT_ENTRY(rtg_alc_hit,		L2C_TAD_EVENT_RTG_ALC_HIT),
+	UC_EVENT_ENTRY(rtg_alc_hitwb,		L2C_TAD_EVENT_RTG_ALC_HITWB),
+	UC_EVENT_ENTRY(stc_total,		L2C_TAD_EVENT_STC_TOTAL),
+	UC_EVENT_ENTRY(stc_total_fail,		L2C_TAD_EVENT_STC_TOTAL_FAIL),
+	UC_EVENT_ENTRY(stc_rmt,			L2C_TAD_EVENT_STC_RMT),
+	UC_EVENT_ENTRY(stc_rmt_fail,		L2C_TAD_EVENT_STC_RMT_FAIL),
+	UC_EVENT_ENTRY(stc_lcl,			L2C_TAD_EVENT_STC_LCL),
+	UC_EVENT_ENTRY(stc_lcl_fail,		L2C_TAD_EVENT_STC_LCL_FAIL),
+	UC_EVENT_ENTRY(oci_rtg_wait,		L2C_TAD_EVENT_OCI_RTG_WAIT),
+	UC_EVENT_ENTRY(oci_fwd_cyc_hit,		L2C_TAD_EVENT_OCI_FWD_CYC_HIT),
+	UC_EVENT_ENTRY(oci_fwd_race,		L2C_TAD_EVENT_OCI_FWD_RACE),
+	UC_EVENT_ENTRY(oci_haks,		L2C_TAD_EVENT_OCI_HAKS),
+	UC_EVENT_ENTRY(oci_fldx_tag_e_nodat,	L2C_TAD_EVENT_OCI_FLDX_TAG_E_NODAT),
+	UC_EVENT_ENTRY(oci_fldx_tag_e_dat,	L2C_TAD_EVENT_OCI_FLDX_TAG_E_DAT),
+	UC_EVENT_ENTRY(oci_rldd,		L2C_TAD_EVENT_OCI_RLDD),
+	UC_EVENT_ENTRY(oci_rldd_pemd,		L2C_TAD_EVENT_OCI_RLDD_PEMD),
+	UC_EVENT_ENTRY(oci_rrq_dat_cnt,		L2C_TAD_EVENT_OCI_RRQ_DAT_CNT),
+	UC_EVENT_ENTRY(oci_rrq_dat_dmask,	L2C_TAD_EVENT_OCI_RRQ_DAT_DMASK),
+	UC_EVENT_ENTRY(oci_rsp_dat_cnt,		L2C_TAD_EVENT_OCI_RSP_DAT_CNT),
+	UC_EVENT_ENTRY(oci_rsp_dat_dmaks,	L2C_TAD_EVENT_OCI_RSP_DAT_DMASK),
+	UC_EVENT_ENTRY(oci_rsp_dat_vicd_cnt,	L2C_TAD_EVENT_OCI_RSP_DAT_VICD_CNT),
+	UC_EVENT_ENTRY(oci_rsp_dat_vicd_dmask,	L2C_TAD_EVENT_OCI_RSP_DAT_VICD_DMASK),
+	UC_EVENT_ENTRY(oci_rtg_alc_evict,	L2C_TAD_EVENT_OCI_RTG_ALC_EVICT),
+	UC_EVENT_ENTRY(oci_rtg_alc_vic,		L2C_TAD_EVENT_OCI_RTG_ALC_VIC),
+	UC_EVENT_ENTRY(qd0_idx,			L2C_TAD_EVENT_QD0_IDX),
+	UC_EVENT_ENTRY(qd0_rdat,		L2C_TAD_EVENT_QD0_RDAT),
+	UC_EVENT_ENTRY(qd0_bnks,		L2C_TAD_EVENT_QD0_BNKS),
+	UC_EVENT_ENTRY(qd0_wdat,		L2C_TAD_EVENT_QD0_WDAT),
+	UC_EVENT_ENTRY(qd1_idx,			L2C_TAD_EVENT_QD1_IDX),
+	UC_EVENT_ENTRY(qd1_rdat,		L2C_TAD_EVENT_QD1_RDAT),
+	UC_EVENT_ENTRY(qd1_bnks,		L2C_TAD_EVENT_QD1_BNKS),
+	UC_EVENT_ENTRY(qd1_wdat,		L2C_TAD_EVENT_QD1_WDAT),
+	UC_EVENT_ENTRY(qd2_idx,			L2C_TAD_EVENT_QD2_IDX),
+	UC_EVENT_ENTRY(qd2_rdat,		L2C_TAD_EVENT_QD2_RDAT),
+	UC_EVENT_ENTRY(qd2_bnks,		L2C_TAD_EVENT_QD2_BNKS),
+	UC_EVENT_ENTRY(qd2_wdat,		L2C_TAD_EVENT_QD2_WDAT),
+	UC_EVENT_ENTRY(qd3_idx,			L2C_TAD_EVENT_QD3_IDX),
+	UC_EVENT_ENTRY(qd3_rdat,		L2C_TAD_EVENT_QD3_RDAT),
+	UC_EVENT_ENTRY(qd3_bnks,		L2C_TAD_EVENT_QD3_BNKS),
+	UC_EVENT_ENTRY(qd3_wdat,		L2C_TAD_EVENT_QD3_WDAT),
+	UC_EVENT_ENTRY(qd4_idx,			L2C_TAD_EVENT_QD4_IDX),
+	UC_EVENT_ENTRY(qd4_rdat,		L2C_TAD_EVENT_QD4_RDAT),
+	UC_EVENT_ENTRY(qd4_bnks,		L2C_TAD_EVENT_QD4_BNKS),
+	UC_EVENT_ENTRY(qd4_wdat,		L2C_TAD_EVENT_QD4_WDAT),
+	UC_EVENT_ENTRY(qd5_idx,			L2C_TAD_EVENT_QD5_IDX),
+	UC_EVENT_ENTRY(qd5_rdat,		L2C_TAD_EVENT_QD5_RDAT),
+	UC_EVENT_ENTRY(qd5_bnks,		L2C_TAD_EVENT_QD5_BNKS),
+	UC_EVENT_ENTRY(qd5_wdat,		L2C_TAD_EVENT_QD5_WDAT),
+	UC_EVENT_ENTRY(qd6_idx,			L2C_TAD_EVENT_QD6_IDX),
+	UC_EVENT_ENTRY(qd6_rdat,		L2C_TAD_EVENT_QD6_RDAT),
+	UC_EVENT_ENTRY(qd6_bnks,		L2C_TAD_EVENT_QD6_BNKS),
+	UC_EVENT_ENTRY(qd6_wdat,		L2C_TAD_EVENT_QD6_WDAT),
+	UC_EVENT_ENTRY(qd7_idx,			L2C_TAD_EVENT_QD7_IDX),
+	UC_EVENT_ENTRY(qd7_rdat,		L2C_TAD_EVENT_QD7_RDAT),
+	UC_EVENT_ENTRY(qd7_bnks,		L2C_TAD_EVENT_QD7_BNKS),
+	UC_EVENT_ENTRY(qd7_wdat,		L2C_TAD_EVENT_QD7_WDAT),
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_tad_events_group = {
+	.name = "events",
+	.attrs = thunder_l2c_tad_events_attr,
+};
+
+static const struct attribute_group *thunder_l2c_tad_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_l2c_tad_format_group,
+	&thunder_l2c_tad_events_group,
+	NULL,
+};
+
+struct pmu thunder_l2c_tad_pmu = {
+	.name		= "thunder_l2c_tad",
+	.task_ctx_nr    = perf_invalid_context,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_l2c_tad,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start_l2c_tad,
+	.stop		= thunder_uncore_stop_l2c_tad,
+	.read		= thunder_uncore_read,
+	.attr_groups	= thunder_l2c_tad_attr_groups,
+};
+
+static bool event_valid(u64 c)
+{
+	if ((c > 0 &&
+	     c <= L2C_TAD_EVENT_OPEN_CCPI) ||
+	    (c >= L2C_TAD_EVENT_LOOKUP &&
+	     c <= L2C_TAD_EVENT_LOOKUP_ALL) ||
+	    (c >= L2C_TAD_EVENT_TAG_ALC_HIT &&
+	     c <= L2C_TAD_EVENT_TAG_NALC_MISS) ||
+	    (c >= L2C_TAD_EVENT_LMC_WR &&
+	     c <= L2C_TAD_EVENT_STC_LCL_FAIL) ||
+	    (c >= L2C_TAD_EVENT_OCI_RTG_WAIT &&
+	     c <= L2C_TAD_EVENT_OCI_RTG_ALC_VIC) ||
+	    /* L2C_TAD_EVENT_QD[0..7] IDX,RDAT,BNKS,WDAT => 0x80 .. 0xf3 */
+	    ((c & 0x80) && ((c & 0xf) <= 3)))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_l2c_tad_setup(void)
+{
+	int ret = -ENOMEM;
+
+	thunder_uncore_l2c_tad = kzalloc(sizeof(*thunder_uncore_l2c_tad),
+					 GFP_KERNEL);
+	if (!thunder_uncore_l2c_tad)
+		goto fail_nomem;
+
+	ret = thunder_uncore_setup(thunder_uncore_l2c_tad, 0xa02e,
+				   &thunder_l2c_tad_pmu, L2C_TAD_NR_COUNTERS);
+	if (ret)
+		goto fail;
+
+	thunder_uncore_l2c_tad->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_l2c_tad);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 3/5] arm64: perf: Cavium ThunderX L2C CBC uncore support
  2016-10-29 11:55 ` Jan Glauber
@ 2016-10-29 11:55   ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support counters of the L2 cache crossbar connect.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile                |   3 +-
 drivers/perf/uncore/uncore_cavium.c         |   1 +
 drivers/perf/uncore/uncore_cavium.h         |   1 +
 drivers/perf/uncore/uncore_cavium_l2c_cbc.c | 148 ++++++++++++++++++++++++++++
 4 files changed, 152 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_cbc.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index 90850a2..d5ef3db 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
-				   uncore_cavium_l2c_tad.o
+				   uncore_cavium_l2c_tad.o	\
+				   uncore_cavium_l2c_cbc.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index 15e1aec..051f0fa 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -347,6 +347,7 @@ static int __init thunder_uncore_init(void)
 		return ret;
 
 	thunder_uncore_l2c_tad_setup();
+	thunder_uncore_l2c_cbc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index 70a8214..91d674a 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -70,3 +70,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 				  struct device_attribute *attr,
 				  char *page);
 int thunder_uncore_l2c_tad_setup(void);
+int thunder_uncore_l2c_cbc_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_l2c_cbc.c b/drivers/perf/uncore/uncore_cavium_l2c_cbc.c
new file mode 100644
index 0000000..95b6147
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_l2c_cbc.c
@@ -0,0 +1,148 @@
+/*
+ * Cavium Thunder uncore PMU support, L2 Cache,
+ * Crossbar connect (CBC) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_l2c_cbc;
+
+/* L2C CBC event list */
+#define L2C_CBC_EVENT_XMC0		0x00
+#define L2C_CBC_EVENT_XMD0		0x08
+#define L2C_CBC_EVENT_RSC0		0x10
+#define L2C_CBC_EVENT_RSD0		0x18
+#define L2C_CBC_EVENT_INV0		0x20
+#define L2C_CBC_EVENT_IOC0		0x28
+#define L2C_CBC_EVENT_IOR0		0x30
+#define L2C_CBC_EVENT_XMC1		0x40
+#define L2C_CBC_EVENT_XMD1		0x48
+#define L2C_CBC_EVENT_RSC1		0x50
+#define L2C_CBC_EVENT_RSD1		0x58
+#define L2C_CBC_EVENT_INV1		0x60
+#define L2C_CBC_EVENT_XMC2		0x80
+#define L2C_CBC_EVENT_XMD2		0x88
+#define L2C_CBC_EVENT_RSC2		0x90
+#define L2C_CBC_EVENT_RSD2		0x98
+
+static int l2c_cbc_events[] = {
+	L2C_CBC_EVENT_XMC0,
+	L2C_CBC_EVENT_XMD0,
+	L2C_CBC_EVENT_RSC0,
+	L2C_CBC_EVENT_RSD0,
+	L2C_CBC_EVENT_INV0,
+	L2C_CBC_EVENT_IOC0,
+	L2C_CBC_EVENT_IOR0,
+	L2C_CBC_EVENT_XMC1,
+	L2C_CBC_EVENT_XMD1,
+	L2C_CBC_EVENT_RSC1,
+	L2C_CBC_EVENT_RSD1,
+	L2C_CBC_EVENT_INV1,
+	L2C_CBC_EVENT_XMC2,
+	L2C_CBC_EVENT_XMD2,
+	L2C_CBC_EVENT_RSC2,
+	L2C_CBC_EVENT_RSD2,
+};
+
+static int thunder_uncore_add_l2c_cbc(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	return thunder_uncore_add(event, flags, 0,
+				  l2c_cbc_events[get_id(hwc->config)]);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-4");
+
+static struct attribute *thunder_l2c_cbc_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_cbc_format_group = {
+	.name = "format",
+	.attrs = thunder_l2c_cbc_format_attr,
+};
+
+static struct attribute *thunder_l2c_cbc_events_attr[] = {
+	UC_EVENT_ENTRY(xmc0, 0),
+	UC_EVENT_ENTRY(xmd0, 1),
+	UC_EVENT_ENTRY(rsc0, 2),
+	UC_EVENT_ENTRY(rsd0, 3),
+	UC_EVENT_ENTRY(inv0, 4),
+	UC_EVENT_ENTRY(ioc0, 5),
+	UC_EVENT_ENTRY(ior0, 6),
+	UC_EVENT_ENTRY(xmc1, 7),
+	UC_EVENT_ENTRY(xmd1, 8),
+	UC_EVENT_ENTRY(rsc1, 9),
+	UC_EVENT_ENTRY(rsd1, 10),
+	UC_EVENT_ENTRY(inv1, 11),
+	UC_EVENT_ENTRY(xmc2, 12),
+	UC_EVENT_ENTRY(xmd2, 13),
+	UC_EVENT_ENTRY(rsc2, 14),
+	UC_EVENT_ENTRY(rsd2, 15),
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_cbc_events_group = {
+	.name = "events",
+	.attrs = thunder_l2c_cbc_events_attr,
+};
+
+static const struct attribute_group *thunder_l2c_cbc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_l2c_cbc_format_group,
+	&thunder_l2c_cbc_events_group,
+	NULL,
+};
+
+struct pmu thunder_l2c_cbc_pmu = {
+	.name		= "thunder_l2c_cbc",
+	.task_ctx_nr    = perf_invalid_context,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_l2c_cbc,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+	.attr_groups	= thunder_l2c_cbc_attr_groups,
+};
+
+static bool event_valid(u64 config)
+{
+	if (config < ARRAY_SIZE(l2c_cbc_events))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_l2c_cbc_setup(void)
+{
+	int ret = -ENOMEM;
+
+	thunder_uncore_l2c_cbc = kzalloc(sizeof(*thunder_uncore_l2c_cbc),
+					 GFP_KERNEL);
+	if (!thunder_uncore_l2c_cbc)
+		goto fail_nomem;
+
+	ret = thunder_uncore_setup(thunder_uncore_l2c_cbc, 0xa02f,
+				   &thunder_l2c_cbc_pmu,
+				   ARRAY_SIZE(l2c_cbc_events));
+	if (ret)
+		goto fail;
+
+	thunder_uncore_l2c_cbc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_l2c_cbc);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 3/5] arm64: perf: Cavium ThunderX L2C CBC uncore support
@ 2016-10-29 11:55   ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

Support counters of the L2 cache crossbar connect.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile                |   3 +-
 drivers/perf/uncore/uncore_cavium.c         |   1 +
 drivers/perf/uncore/uncore_cavium.h         |   1 +
 drivers/perf/uncore/uncore_cavium_l2c_cbc.c | 148 ++++++++++++++++++++++++++++
 4 files changed, 152 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_l2c_cbc.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index 90850a2..d5ef3db 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
-				   uncore_cavium_l2c_tad.o
+				   uncore_cavium_l2c_tad.o	\
+				   uncore_cavium_l2c_cbc.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index 15e1aec..051f0fa 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -347,6 +347,7 @@ static int __init thunder_uncore_init(void)
 		return ret;
 
 	thunder_uncore_l2c_tad_setup();
+	thunder_uncore_l2c_cbc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index 70a8214..91d674a 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -70,3 +70,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 				  struct device_attribute *attr,
 				  char *page);
 int thunder_uncore_l2c_tad_setup(void);
+int thunder_uncore_l2c_cbc_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_l2c_cbc.c b/drivers/perf/uncore/uncore_cavium_l2c_cbc.c
new file mode 100644
index 0000000..95b6147
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_l2c_cbc.c
@@ -0,0 +1,148 @@
+/*
+ * Cavium Thunder uncore PMU support, L2 Cache,
+ * Crossbar connect (CBC) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_l2c_cbc;
+
+/* L2C CBC event list */
+#define L2C_CBC_EVENT_XMC0		0x00
+#define L2C_CBC_EVENT_XMD0		0x08
+#define L2C_CBC_EVENT_RSC0		0x10
+#define L2C_CBC_EVENT_RSD0		0x18
+#define L2C_CBC_EVENT_INV0		0x20
+#define L2C_CBC_EVENT_IOC0		0x28
+#define L2C_CBC_EVENT_IOR0		0x30
+#define L2C_CBC_EVENT_XMC1		0x40
+#define L2C_CBC_EVENT_XMD1		0x48
+#define L2C_CBC_EVENT_RSC1		0x50
+#define L2C_CBC_EVENT_RSD1		0x58
+#define L2C_CBC_EVENT_INV1		0x60
+#define L2C_CBC_EVENT_XMC2		0x80
+#define L2C_CBC_EVENT_XMD2		0x88
+#define L2C_CBC_EVENT_RSC2		0x90
+#define L2C_CBC_EVENT_RSD2		0x98
+
+static int l2c_cbc_events[] = {
+	L2C_CBC_EVENT_XMC0,
+	L2C_CBC_EVENT_XMD0,
+	L2C_CBC_EVENT_RSC0,
+	L2C_CBC_EVENT_RSD0,
+	L2C_CBC_EVENT_INV0,
+	L2C_CBC_EVENT_IOC0,
+	L2C_CBC_EVENT_IOR0,
+	L2C_CBC_EVENT_XMC1,
+	L2C_CBC_EVENT_XMD1,
+	L2C_CBC_EVENT_RSC1,
+	L2C_CBC_EVENT_RSD1,
+	L2C_CBC_EVENT_INV1,
+	L2C_CBC_EVENT_XMC2,
+	L2C_CBC_EVENT_XMD2,
+	L2C_CBC_EVENT_RSC2,
+	L2C_CBC_EVENT_RSD2,
+};
+
+static int thunder_uncore_add_l2c_cbc(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	return thunder_uncore_add(event, flags, 0,
+				  l2c_cbc_events[get_id(hwc->config)]);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-4");
+
+static struct attribute *thunder_l2c_cbc_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_cbc_format_group = {
+	.name = "format",
+	.attrs = thunder_l2c_cbc_format_attr,
+};
+
+static struct attribute *thunder_l2c_cbc_events_attr[] = {
+	UC_EVENT_ENTRY(xmc0, 0),
+	UC_EVENT_ENTRY(xmd0, 1),
+	UC_EVENT_ENTRY(rsc0, 2),
+	UC_EVENT_ENTRY(rsd0, 3),
+	UC_EVENT_ENTRY(inv0, 4),
+	UC_EVENT_ENTRY(ioc0, 5),
+	UC_EVENT_ENTRY(ior0, 6),
+	UC_EVENT_ENTRY(xmc1, 7),
+	UC_EVENT_ENTRY(xmd1, 8),
+	UC_EVENT_ENTRY(rsc1, 9),
+	UC_EVENT_ENTRY(rsd1, 10),
+	UC_EVENT_ENTRY(inv1, 11),
+	UC_EVENT_ENTRY(xmc2, 12),
+	UC_EVENT_ENTRY(xmd2, 13),
+	UC_EVENT_ENTRY(rsc2, 14),
+	UC_EVENT_ENTRY(rsd2, 15),
+	NULL,
+};
+
+static struct attribute_group thunder_l2c_cbc_events_group = {
+	.name = "events",
+	.attrs = thunder_l2c_cbc_events_attr,
+};
+
+static const struct attribute_group *thunder_l2c_cbc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_l2c_cbc_format_group,
+	&thunder_l2c_cbc_events_group,
+	NULL,
+};
+
+struct pmu thunder_l2c_cbc_pmu = {
+	.name		= "thunder_l2c_cbc",
+	.task_ctx_nr    = perf_invalid_context,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_l2c_cbc,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+	.attr_groups	= thunder_l2c_cbc_attr_groups,
+};
+
+static bool event_valid(u64 config)
+{
+	if (config < ARRAY_SIZE(l2c_cbc_events))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_l2c_cbc_setup(void)
+{
+	int ret = -ENOMEM;
+
+	thunder_uncore_l2c_cbc = kzalloc(sizeof(*thunder_uncore_l2c_cbc),
+					 GFP_KERNEL);
+	if (!thunder_uncore_l2c_cbc)
+		goto fail_nomem;
+
+	ret = thunder_uncore_setup(thunder_uncore_l2c_cbc, 0xa02f,
+				   &thunder_l2c_cbc_pmu,
+				   ARRAY_SIZE(l2c_cbc_events));
+	if (ret)
+		goto fail;
+
+	thunder_uncore_l2c_cbc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_l2c_cbc);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 4/5] arm64: perf: Cavium ThunderX LMC uncore support
  2016-10-29 11:55 ` Jan Glauber
@ 2016-10-29 11:55   ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support counters on the DRAM controllers.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile            |   3 +-
 drivers/perf/uncore/uncore_cavium.c     |   1 +
 drivers/perf/uncore/uncore_cavium.h     |   1 +
 drivers/perf/uncore/uncore_cavium_lmc.c | 118 ++++++++++++++++++++++++++++++++
 4 files changed, 122 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_lmc.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index d5ef3db..ef04a2b9 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
 				   uncore_cavium_l2c_tad.o	\
-				   uncore_cavium_l2c_cbc.o
+				   uncore_cavium_l2c_cbc.o	\
+				   uncore_cavium_lmc.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index 051f0fa..fd9e49e 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -348,6 +348,7 @@ static int __init thunder_uncore_init(void)
 
 	thunder_uncore_l2c_tad_setup();
 	thunder_uncore_l2c_cbc_setup();
+	thunder_uncore_lmc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index 91d674a..3897586 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -71,3 +71,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 				  char *page);
 int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
+int thunder_uncore_lmc_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_lmc.c b/drivers/perf/uncore/uncore_cavium_lmc.c
new file mode 100644
index 0000000..9668197
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_lmc.c
@@ -0,0 +1,118 @@
+/*
+ * Cavium Thunder uncore PMU support, Local memory controller (LMC) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_lmc;
+
+#define LMC_CONFIG_OFFSET		0x188
+#define LMC_CONFIG_RESET_BIT		BIT_ULL(17)
+
+/* LMC event list */
+#define LMC_EVENT_IFB_CNT		0x1d0
+#define LMC_EVENT_OPS_CNT		0x1d8
+#define LMC_EVENT_DCLK_CNT		0x1e0
+#define LMC_EVENT_BANK_CONFLICT1	0x360
+#define LMC_EVENT_BANK_CONFLICT2	0x368
+
+/* map counter numbers to register offsets */
+static int lmc_events[] = {
+	LMC_EVENT_IFB_CNT,
+	LMC_EVENT_OPS_CNT,
+	LMC_EVENT_DCLK_CNT,
+	LMC_EVENT_BANK_CONFLICT1,
+	LMC_EVENT_BANK_CONFLICT2,
+};
+
+static int thunder_uncore_add_lmc(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	return thunder_uncore_add(event, flags,
+				  LMC_CONFIG_OFFSET,
+				  lmc_events[get_id(hwc->config)]);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-2");
+
+static struct attribute *thunder_lmc_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_lmc_format_group = {
+	.name = "format",
+	.attrs = thunder_lmc_format_attr,
+};
+
+static struct attribute *thunder_lmc_events_attr[] = {
+	UC_EVENT_ENTRY(ifb_cnt, 0),
+	UC_EVENT_ENTRY(ops_cnt, 1),
+	UC_EVENT_ENTRY(dclk_cnt, 2),
+	UC_EVENT_ENTRY(bank_conflict1, 3),
+	UC_EVENT_ENTRY(bank_conflict2, 4),
+	NULL,
+};
+
+static struct attribute_group thunder_lmc_events_group = {
+	.name = "events",
+	.attrs = thunder_lmc_events_attr,
+};
+
+static const struct attribute_group *thunder_lmc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_lmc_format_group,
+	&thunder_lmc_events_group,
+	NULL,
+};
+
+struct pmu thunder_lmc_pmu = {
+	.name		= "thunder_lmc",
+	.task_ctx_nr    = perf_invalid_context,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_lmc,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+	.attr_groups	= thunder_lmc_attr_groups,
+};
+
+static bool event_valid(u64 config)
+{
+	if (config < ARRAY_SIZE(lmc_events))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_lmc_setup(void)
+{
+	int ret = -ENOMEM;
+
+	thunder_uncore_lmc = kzalloc(sizeof(*thunder_uncore_lmc), GFP_KERNEL);
+	if (!thunder_uncore_lmc)
+		goto fail_nomem;
+
+	ret = thunder_uncore_setup(thunder_uncore_lmc, 0xa022,
+				   &thunder_lmc_pmu,
+				   ARRAY_SIZE(lmc_events));
+	if (ret)
+		goto fail;
+
+	thunder_uncore_lmc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_lmc);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 4/5] arm64: perf: Cavium ThunderX LMC uncore support
@ 2016-10-29 11:55   ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

Support counters on the DRAM controllers.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile            |   3 +-
 drivers/perf/uncore/uncore_cavium.c     |   1 +
 drivers/perf/uncore/uncore_cavium.h     |   1 +
 drivers/perf/uncore/uncore_cavium_lmc.c | 118 ++++++++++++++++++++++++++++++++
 4 files changed, 122 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_lmc.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index d5ef3db..ef04a2b9 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
 				   uncore_cavium_l2c_tad.o	\
-				   uncore_cavium_l2c_cbc.o
+				   uncore_cavium_l2c_cbc.o	\
+				   uncore_cavium_lmc.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index 051f0fa..fd9e49e 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -348,6 +348,7 @@ static int __init thunder_uncore_init(void)
 
 	thunder_uncore_l2c_tad_setup();
 	thunder_uncore_l2c_cbc_setup();
+	thunder_uncore_lmc_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index 91d674a..3897586 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -71,3 +71,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 				  char *page);
 int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
+int thunder_uncore_lmc_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_lmc.c b/drivers/perf/uncore/uncore_cavium_lmc.c
new file mode 100644
index 0000000..9668197
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_lmc.c
@@ -0,0 +1,118 @@
+/*
+ * Cavium Thunder uncore PMU support, Local memory controller (LMC) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_lmc;
+
+#define LMC_CONFIG_OFFSET		0x188
+#define LMC_CONFIG_RESET_BIT		BIT_ULL(17)
+
+/* LMC event list */
+#define LMC_EVENT_IFB_CNT		0x1d0
+#define LMC_EVENT_OPS_CNT		0x1d8
+#define LMC_EVENT_DCLK_CNT		0x1e0
+#define LMC_EVENT_BANK_CONFLICT1	0x360
+#define LMC_EVENT_BANK_CONFLICT2	0x368
+
+/* map counter numbers to register offsets */
+static int lmc_events[] = {
+	LMC_EVENT_IFB_CNT,
+	LMC_EVENT_OPS_CNT,
+	LMC_EVENT_DCLK_CNT,
+	LMC_EVENT_BANK_CONFLICT1,
+	LMC_EVENT_BANK_CONFLICT2,
+};
+
+static int thunder_uncore_add_lmc(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	return thunder_uncore_add(event, flags,
+				  LMC_CONFIG_OFFSET,
+				  lmc_events[get_id(hwc->config)]);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-2");
+
+static struct attribute *thunder_lmc_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_lmc_format_group = {
+	.name = "format",
+	.attrs = thunder_lmc_format_attr,
+};
+
+static struct attribute *thunder_lmc_events_attr[] = {
+	UC_EVENT_ENTRY(ifb_cnt, 0),
+	UC_EVENT_ENTRY(ops_cnt, 1),
+	UC_EVENT_ENTRY(dclk_cnt, 2),
+	UC_EVENT_ENTRY(bank_conflict1, 3),
+	UC_EVENT_ENTRY(bank_conflict2, 4),
+	NULL,
+};
+
+static struct attribute_group thunder_lmc_events_group = {
+	.name = "events",
+	.attrs = thunder_lmc_events_attr,
+};
+
+static const struct attribute_group *thunder_lmc_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_lmc_format_group,
+	&thunder_lmc_events_group,
+	NULL,
+};
+
+struct pmu thunder_lmc_pmu = {
+	.name		= "thunder_lmc",
+	.task_ctx_nr    = perf_invalid_context,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_lmc,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read,
+	.attr_groups	= thunder_lmc_attr_groups,
+};
+
+static bool event_valid(u64 config)
+{
+	if (config < ARRAY_SIZE(lmc_events))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_lmc_setup(void)
+{
+	int ret = -ENOMEM;
+
+	thunder_uncore_lmc = kzalloc(sizeof(*thunder_uncore_lmc), GFP_KERNEL);
+	if (!thunder_uncore_lmc)
+		goto fail_nomem;
+
+	ret = thunder_uncore_setup(thunder_uncore_lmc, 0xa022,
+				   &thunder_lmc_pmu,
+				   ARRAY_SIZE(lmc_events));
+	if (ret)
+		goto fail;
+
+	thunder_uncore_lmc->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_lmc);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 5/5] arm64: perf: Cavium ThunderX OCX TLK uncore support
  2016-10-29 11:55 ` Jan Glauber
@ 2016-10-29 11:55   ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: Mark Rutland, Will Deacon; +Cc: linux-kernel, linux-arm-kernel, Jan Glauber

Support for the OCX transmit link counters.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile                |   3 +-
 drivers/perf/uncore/uncore_cavium.c         |   1 +
 drivers/perf/uncore/uncore_cavium.h         |   1 +
 drivers/perf/uncore/uncore_cavium_ocx_tlk.c | 344 ++++++++++++++++++++++++++++
 4 files changed, 348 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_ocx_tlk.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index ef04a2b9..7e2e8e5 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
 				   uncore_cavium_l2c_tad.o	\
 				   uncore_cavium_l2c_cbc.o	\
-				   uncore_cavium_lmc.o
+				   uncore_cavium_lmc.o		\
+				   uncore_cavium_ocx_tlk.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index fd9e49e..46ced45 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -349,6 +349,7 @@ static int __init thunder_uncore_init(void)
 	thunder_uncore_l2c_tad_setup();
 	thunder_uncore_l2c_cbc_setup();
 	thunder_uncore_lmc_setup();
+	thunder_uncore_ocx_tlk_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index 3897586..43ab426 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -72,3 +72,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
 int thunder_uncore_lmc_setup(void);
+int thunder_uncore_ocx_tlk_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_ocx_tlk.c b/drivers/perf/uncore/uncore_cavium_ocx_tlk.c
new file mode 100644
index 0000000..b4fc32b
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_ocx_tlk.c
@@ -0,0 +1,344 @@
+/*
+ * Cavium Thunder uncore PMU support,
+ * CCPI interface controller (OCX) Transmit link (TLK) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_ocx_tlk;
+
+#define OCX_TLK_NR_UNITS			3
+#define OCX_TLK_UNIT_OFFSET			0x2000
+#define OCX_TLK_STAT_CTL			0x10040
+#define OCX_TLK_STAT_OFFSET			0x10400
+
+#define OCX_TLK_STAT_ENABLE_BIT			BIT_ULL(0)
+#define OCX_TLK_STAT_RESET_BIT			BIT_ULL(1)
+
+/* OCX TLK event list */
+#define OCX_TLK_EVENT_STAT_IDLE_CNT		0x00
+#define OCX_TLK_EVENT_STAT_DATA_CNT		0x08
+#define OCX_TLK_EVENT_STAT_SYNC_CNT		0x10
+#define OCX_TLK_EVENT_STAT_RETRY_CNT		0x18
+#define OCX_TLK_EVENT_STAT_ERR_CNT		0x20
+#define OCX_TLK_EVENT_STAT_MAT0_CNT		0x40
+#define OCX_TLK_EVENT_STAT_MAT1_CNT		0x48
+#define OCX_TLK_EVENT_STAT_MAT2_CNT		0x50
+#define OCX_TLK_EVENT_STAT_MAT3_CNT		0x58
+#define OCX_TLK_EVENT_STAT_VC0_CMD		0x80
+#define OCX_TLK_EVENT_STAT_VC1_CMD		0x88
+#define OCX_TLK_EVENT_STAT_VC2_CMD		0x90
+#define OCX_TLK_EVENT_STAT_VC3_CMD		0x98
+#define OCX_TLK_EVENT_STAT_VC4_CMD		0xa0
+#define OCX_TLK_EVENT_STAT_VC5_CMD		0xa8
+#define OCX_TLK_EVENT_STAT_VC0_PKT		0x100
+#define OCX_TLK_EVENT_STAT_VC1_PKT		0x108
+#define OCX_TLK_EVENT_STAT_VC2_PKT		0x110
+#define OCX_TLK_EVENT_STAT_VC3_PKT		0x118
+#define OCX_TLK_EVENT_STAT_VC4_PKT		0x120
+#define OCX_TLK_EVENT_STAT_VC5_PKT		0x128
+#define OCX_TLK_EVENT_STAT_VC6_PKT		0x130
+#define OCX_TLK_EVENT_STAT_VC7_PKT		0x138
+#define OCX_TLK_EVENT_STAT_VC8_PKT		0x140
+#define OCX_TLK_EVENT_STAT_VC9_PKT		0x148
+#define OCX_TLK_EVENT_STAT_VC10_PKT		0x150
+#define OCX_TLK_EVENT_STAT_VC11_PKT		0x158
+#define OCX_TLK_EVENT_STAT_VC12_PKT		0x160
+#define OCX_TLK_EVENT_STAT_VC13_PKT		0x168
+#define OCX_TLK_EVENT_STAT_VC0_CON		0x180
+#define OCX_TLK_EVENT_STAT_VC1_CON		0x188
+#define OCX_TLK_EVENT_STAT_VC2_CON		0x190
+#define OCX_TLK_EVENT_STAT_VC3_CON		0x198
+#define OCX_TLK_EVENT_STAT_VC4_CON		0x1a0
+#define OCX_TLK_EVENT_STAT_VC5_CON		0x1a8
+#define OCX_TLK_EVENT_STAT_VC6_CON		0x1b0
+#define OCX_TLK_EVENT_STAT_VC7_CON		0x1b8
+#define OCX_TLK_EVENT_STAT_VC8_CON		0x1c0
+#define OCX_TLK_EVENT_STAT_VC9_CON		0x1c8
+#define OCX_TLK_EVENT_STAT_VC10_CON		0x1d0
+#define OCX_TLK_EVENT_STAT_VC11_CON		0x1d8
+#define OCX_TLK_EVENT_STAT_VC12_CON		0x1e0
+#define OCX_TLK_EVENT_STAT_VC13_CON		0x1e8
+
+static int ocx_tlk_events[] = {
+	OCX_TLK_EVENT_STAT_IDLE_CNT,
+	OCX_TLK_EVENT_STAT_DATA_CNT,
+	OCX_TLK_EVENT_STAT_SYNC_CNT,
+	OCX_TLK_EVENT_STAT_RETRY_CNT,
+	OCX_TLK_EVENT_STAT_ERR_CNT,
+	OCX_TLK_EVENT_STAT_MAT0_CNT,
+	OCX_TLK_EVENT_STAT_MAT1_CNT,
+	OCX_TLK_EVENT_STAT_MAT2_CNT,
+	OCX_TLK_EVENT_STAT_MAT3_CNT,
+	OCX_TLK_EVENT_STAT_VC0_CMD,
+	OCX_TLK_EVENT_STAT_VC1_CMD,
+	OCX_TLK_EVENT_STAT_VC2_CMD,
+	OCX_TLK_EVENT_STAT_VC3_CMD,
+	OCX_TLK_EVENT_STAT_VC4_CMD,
+	OCX_TLK_EVENT_STAT_VC5_CMD,
+	OCX_TLK_EVENT_STAT_VC0_PKT,
+	OCX_TLK_EVENT_STAT_VC1_PKT,
+	OCX_TLK_EVENT_STAT_VC2_PKT,
+	OCX_TLK_EVENT_STAT_VC3_PKT,
+	OCX_TLK_EVENT_STAT_VC4_PKT,
+	OCX_TLK_EVENT_STAT_VC5_PKT,
+	OCX_TLK_EVENT_STAT_VC6_PKT,
+	OCX_TLK_EVENT_STAT_VC7_PKT,
+	OCX_TLK_EVENT_STAT_VC8_PKT,
+	OCX_TLK_EVENT_STAT_VC9_PKT,
+	OCX_TLK_EVENT_STAT_VC10_PKT,
+	OCX_TLK_EVENT_STAT_VC11_PKT,
+	OCX_TLK_EVENT_STAT_VC12_PKT,
+	OCX_TLK_EVENT_STAT_VC13_PKT,
+	OCX_TLK_EVENT_STAT_VC0_CON,
+	OCX_TLK_EVENT_STAT_VC1_CON,
+	OCX_TLK_EVENT_STAT_VC2_CON,
+	OCX_TLK_EVENT_STAT_VC3_CON,
+	OCX_TLK_EVENT_STAT_VC4_CON,
+	OCX_TLK_EVENT_STAT_VC5_CON,
+	OCX_TLK_EVENT_STAT_VC6_CON,
+	OCX_TLK_EVENT_STAT_VC7_CON,
+	OCX_TLK_EVENT_STAT_VC8_CON,
+	OCX_TLK_EVENT_STAT_VC9_CON,
+	OCX_TLK_EVENT_STAT_VC10_CON,
+	OCX_TLK_EVENT_STAT_VC11_CON,
+	OCX_TLK_EVENT_STAT_VC12_CON,
+	OCX_TLK_EVENT_STAT_VC13_CON,
+};
+
+/*
+ * The OCX devices have a single device per node, therefore picking the
+ * first device from the list is correct.
+ */
+static inline void __iomem *map_offset(struct thunder_uncore_node *node,
+				       unsigned long addr, int offset, int nr)
+{
+	struct thunder_uncore_unit *unit;
+
+	unit = list_first_entry(&node->unit_list, struct thunder_uncore_unit,
+				entry);
+	return (void __iomem *)(addr + unit->map + nr * offset);
+}
+
+static void __iomem *map_offset_ocx_tlk(struct thunder_uncore_node *node,
+					unsigned long addr, int nr)
+{
+	return (void __iomem *)map_offset(node, addr, nr, OCX_TLK_UNIT_OFFSET);
+}
+
+/*
+ * The OCX TLK counters can only be enabled/disabled as a set so we do
+ * this in pmu_enable/disable instead of start/stop.
+ */
+static void thunder_uncore_pmu_enable_ocx_tlk(struct pmu *pmu)
+{
+	struct thunder_uncore *uncore =
+		container_of(pmu, struct thunder_uncore, pmu);
+	int node = 0, i;
+
+	while (uncore->nodes[node++]) {
+		for (i = 0; i < OCX_TLK_NR_UNITS; i++) {
+			/* reset all TLK counters to zero */
+			writeb(OCX_TLK_STAT_RESET_BIT,
+			       map_offset_ocx_tlk(uncore->nodes[node],
+						  OCX_TLK_STAT_CTL, i));
+			/* enable all TLK counters */
+			writeb(OCX_TLK_STAT_ENABLE_BIT,
+			       map_offset_ocx_tlk(uncore->nodes[node],
+						  OCX_TLK_STAT_CTL, i));
+		}
+	}
+}
+
+/*
+ * The OCX TLK counters can only be enabled/disabled as a set so we do
+ * this in pmu_enable/disable instead of start/stop.
+ */
+static void thunder_uncore_pmu_disable_ocx_tlk(struct pmu *pmu)
+{
+	struct thunder_uncore *uncore =
+		container_of(pmu, struct thunder_uncore, pmu);
+	int node = 0, i;
+
+	while (uncore->nodes[node++]) {
+		for (i = 0; i < OCX_TLK_NR_UNITS; i++) {
+			/* disable all TLK counters */
+			writeb(0, map_offset_ocx_tlk(uncore->nodes[node],
+						     OCX_TLK_STAT_CTL, i));
+		}
+	}
+}
+
+/*
+ * Summarize counters across all TLK's. Different from the other uncore
+ * PMUs because all TLK's are on one PCI device.
+ */
+static void thunder_uncore_read_ocx_tlk(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	u64 new = 0;
+	int i;
+
+	/* read counter values from all units */
+	node = get_node(hwc->config, uncore);
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		new += readq(map_offset_ocx_tlk(node, hwc->event_base, i));
+
+	local64_add(new, &hwc->prev_count);
+	local64_add(new, &event->count);
+}
+
+static void thunder_uncore_start_ocx_tlk(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	u64 new = 0;
+	int i;
+
+	/* read counter values from all units on the node */
+	node = get_node(hwc->config, uncore);
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		new += readq(map_offset_ocx_tlk(node, hwc->event_base, i));
+	local64_set(&hwc->prev_count, new);
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+static int thunder_uncore_add_ocx_tlk(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	return thunder_uncore_add(event, flags,
+				  OCX_TLK_STAT_CTL,
+				  OCX_TLK_STAT_OFFSET + ocx_tlk_events[get_id(hwc->config)]);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-5");
+
+static struct attribute *thunder_ocx_tlk_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_tlk_format_group = {
+	.name = "format",
+	.attrs = thunder_ocx_tlk_format_attr,
+};
+
+static struct attribute *thunder_ocx_tlk_events_attr[] = {
+	UC_EVENT_ENTRY(idle_cnt,	0),
+	UC_EVENT_ENTRY(data_cnt,	1),
+	UC_EVENT_ENTRY(sync_cnt,	2),
+	UC_EVENT_ENTRY(retry_cnt,	3),
+	UC_EVENT_ENTRY(err_cnt,		4),
+	UC_EVENT_ENTRY(mat0_cnt,	5),
+	UC_EVENT_ENTRY(mat1_cnt,	6),
+	UC_EVENT_ENTRY(mat2_cnt,	7),
+	UC_EVENT_ENTRY(mat3_cnt,	8),
+	UC_EVENT_ENTRY(vc0_cmd,		9),
+	UC_EVENT_ENTRY(vc1_cmd,		10),
+	UC_EVENT_ENTRY(vc2_cmd,		11),
+	UC_EVENT_ENTRY(vc3_cmd,		12),
+	UC_EVENT_ENTRY(vc4_cmd,		13),
+	UC_EVENT_ENTRY(vc5_cmd,		14),
+	UC_EVENT_ENTRY(vc0_pkt,		15),
+	UC_EVENT_ENTRY(vc1_pkt,		16),
+	UC_EVENT_ENTRY(vc2_pkt,		17),
+	UC_EVENT_ENTRY(vc3_pkt,		18),
+	UC_EVENT_ENTRY(vc4_pkt,		19),
+	UC_EVENT_ENTRY(vc5_pkt,		20),
+	UC_EVENT_ENTRY(vc6_pkt,		21),
+	UC_EVENT_ENTRY(vc7_pkt,		22),
+	UC_EVENT_ENTRY(vc8_pkt,		23),
+	UC_EVENT_ENTRY(vc9_pkt,		24),
+	UC_EVENT_ENTRY(vc10_pkt,	25),
+	UC_EVENT_ENTRY(vc11_pkt,	26),
+	UC_EVENT_ENTRY(vc12_pkt,	27),
+	UC_EVENT_ENTRY(vc13_pkt,	28),
+	UC_EVENT_ENTRY(vc0_con,		29),
+	UC_EVENT_ENTRY(vc1_con,		30),
+	UC_EVENT_ENTRY(vc2_con,		31),
+	UC_EVENT_ENTRY(vc3_con,		32),
+	UC_EVENT_ENTRY(vc4_con,		33),
+	UC_EVENT_ENTRY(vc5_con,		34),
+	UC_EVENT_ENTRY(vc6_con,		35),
+	UC_EVENT_ENTRY(vc7_con,		36),
+	UC_EVENT_ENTRY(vc8_con,		37),
+	UC_EVENT_ENTRY(vc9_con,		38),
+	UC_EVENT_ENTRY(vc10_con,	39),
+	UC_EVENT_ENTRY(vc11_con,	40),
+	UC_EVENT_ENTRY(vc12_con,	41),
+	UC_EVENT_ENTRY(vc13_con,	42),
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_tlk_events_group = {
+	.name = "events",
+	.attrs = thunder_ocx_tlk_events_attr,
+};
+
+static const struct attribute_group *thunder_ocx_tlk_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_ocx_tlk_format_group,
+	&thunder_ocx_tlk_events_group,
+	NULL,
+};
+
+struct pmu thunder_ocx_tlk_pmu = {
+	.name		= "thunder_ocx_tlk",
+	.task_ctx_nr    = perf_invalid_context,
+	.pmu_enable	= thunder_uncore_pmu_enable_ocx_tlk,
+	.pmu_disable	= thunder_uncore_pmu_disable_ocx_tlk,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_ocx_tlk,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start_ocx_tlk,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read_ocx_tlk,
+	.attr_groups	= thunder_ocx_tlk_attr_groups,
+};
+
+static bool event_valid(u64 config)
+{
+	if (config < ARRAY_SIZE(ocx_tlk_events))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_ocx_tlk_setup(void)
+{
+	int ret;
+
+	thunder_uncore_ocx_tlk = kzalloc(sizeof(*thunder_uncore_ocx_tlk),
+					 GFP_KERNEL);
+	if (!thunder_uncore_ocx_tlk) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	ret = thunder_uncore_setup(thunder_uncore_ocx_tlk, 0xa013,
+				   &thunder_ocx_tlk_pmu,
+				   ARRAY_SIZE(ocx_tlk_events));
+	if (ret)
+		goto fail;
+
+	thunder_uncore_ocx_tlk->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_ocx_tlk);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 5/5] arm64: perf: Cavium ThunderX OCX TLK uncore support
@ 2016-10-29 11:55   ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-10-29 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

Support for the OCX transmit link counters.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
---
 drivers/perf/uncore/Makefile                |   3 +-
 drivers/perf/uncore/uncore_cavium.c         |   1 +
 drivers/perf/uncore/uncore_cavium.h         |   1 +
 drivers/perf/uncore/uncore_cavium_ocx_tlk.c | 344 ++++++++++++++++++++++++++++
 4 files changed, 348 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/uncore/uncore_cavium_ocx_tlk.c

diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
index ef04a2b9..7e2e8e5 100644
--- a/drivers/perf/uncore/Makefile
+++ b/drivers/perf/uncore/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o		\
 				   uncore_cavium_l2c_tad.o	\
 				   uncore_cavium_l2c_cbc.o	\
-				   uncore_cavium_lmc.o
+				   uncore_cavium_lmc.o		\
+				   uncore_cavium_ocx_tlk.o
diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
index fd9e49e..46ced45 100644
--- a/drivers/perf/uncore/uncore_cavium.c
+++ b/drivers/perf/uncore/uncore_cavium.c
@@ -349,6 +349,7 @@ static int __init thunder_uncore_init(void)
 	thunder_uncore_l2c_tad_setup();
 	thunder_uncore_l2c_cbc_setup();
 	thunder_uncore_lmc_setup();
+	thunder_uncore_ocx_tlk_setup();
 	return 0;
 }
 late_initcall(thunder_uncore_init);
diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
index 3897586..43ab426 100644
--- a/drivers/perf/uncore/uncore_cavium.h
+++ b/drivers/perf/uncore/uncore_cavium.h
@@ -72,3 +72,4 @@ ssize_t thunder_events_sysfs_show(struct device *dev,
 int thunder_uncore_l2c_tad_setup(void);
 int thunder_uncore_l2c_cbc_setup(void);
 int thunder_uncore_lmc_setup(void);
+int thunder_uncore_ocx_tlk_setup(void);
diff --git a/drivers/perf/uncore/uncore_cavium_ocx_tlk.c b/drivers/perf/uncore/uncore_cavium_ocx_tlk.c
new file mode 100644
index 0000000..b4fc32b
--- /dev/null
+++ b/drivers/perf/uncore/uncore_cavium_ocx_tlk.c
@@ -0,0 +1,344 @@
+/*
+ * Cavium Thunder uncore PMU support,
+ * CCPI interface controller (OCX) Transmit link (TLK) counters.
+ *
+ * Copyright 2016 Cavium Inc.
+ * Author: Jan Glauber <jan.glauber@cavium.com>
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+
+#include "uncore_cavium.h"
+
+struct thunder_uncore *thunder_uncore_ocx_tlk;
+
+#define OCX_TLK_NR_UNITS			3
+#define OCX_TLK_UNIT_OFFSET			0x2000
+#define OCX_TLK_STAT_CTL			0x10040
+#define OCX_TLK_STAT_OFFSET			0x10400
+
+#define OCX_TLK_STAT_ENABLE_BIT			BIT_ULL(0)
+#define OCX_TLK_STAT_RESET_BIT			BIT_ULL(1)
+
+/* OCX TLK event list */
+#define OCX_TLK_EVENT_STAT_IDLE_CNT		0x00
+#define OCX_TLK_EVENT_STAT_DATA_CNT		0x08
+#define OCX_TLK_EVENT_STAT_SYNC_CNT		0x10
+#define OCX_TLK_EVENT_STAT_RETRY_CNT		0x18
+#define OCX_TLK_EVENT_STAT_ERR_CNT		0x20
+#define OCX_TLK_EVENT_STAT_MAT0_CNT		0x40
+#define OCX_TLK_EVENT_STAT_MAT1_CNT		0x48
+#define OCX_TLK_EVENT_STAT_MAT2_CNT		0x50
+#define OCX_TLK_EVENT_STAT_MAT3_CNT		0x58
+#define OCX_TLK_EVENT_STAT_VC0_CMD		0x80
+#define OCX_TLK_EVENT_STAT_VC1_CMD		0x88
+#define OCX_TLK_EVENT_STAT_VC2_CMD		0x90
+#define OCX_TLK_EVENT_STAT_VC3_CMD		0x98
+#define OCX_TLK_EVENT_STAT_VC4_CMD		0xa0
+#define OCX_TLK_EVENT_STAT_VC5_CMD		0xa8
+#define OCX_TLK_EVENT_STAT_VC0_PKT		0x100
+#define OCX_TLK_EVENT_STAT_VC1_PKT		0x108
+#define OCX_TLK_EVENT_STAT_VC2_PKT		0x110
+#define OCX_TLK_EVENT_STAT_VC3_PKT		0x118
+#define OCX_TLK_EVENT_STAT_VC4_PKT		0x120
+#define OCX_TLK_EVENT_STAT_VC5_PKT		0x128
+#define OCX_TLK_EVENT_STAT_VC6_PKT		0x130
+#define OCX_TLK_EVENT_STAT_VC7_PKT		0x138
+#define OCX_TLK_EVENT_STAT_VC8_PKT		0x140
+#define OCX_TLK_EVENT_STAT_VC9_PKT		0x148
+#define OCX_TLK_EVENT_STAT_VC10_PKT		0x150
+#define OCX_TLK_EVENT_STAT_VC11_PKT		0x158
+#define OCX_TLK_EVENT_STAT_VC12_PKT		0x160
+#define OCX_TLK_EVENT_STAT_VC13_PKT		0x168
+#define OCX_TLK_EVENT_STAT_VC0_CON		0x180
+#define OCX_TLK_EVENT_STAT_VC1_CON		0x188
+#define OCX_TLK_EVENT_STAT_VC2_CON		0x190
+#define OCX_TLK_EVENT_STAT_VC3_CON		0x198
+#define OCX_TLK_EVENT_STAT_VC4_CON		0x1a0
+#define OCX_TLK_EVENT_STAT_VC5_CON		0x1a8
+#define OCX_TLK_EVENT_STAT_VC6_CON		0x1b0
+#define OCX_TLK_EVENT_STAT_VC7_CON		0x1b8
+#define OCX_TLK_EVENT_STAT_VC8_CON		0x1c0
+#define OCX_TLK_EVENT_STAT_VC9_CON		0x1c8
+#define OCX_TLK_EVENT_STAT_VC10_CON		0x1d0
+#define OCX_TLK_EVENT_STAT_VC11_CON		0x1d8
+#define OCX_TLK_EVENT_STAT_VC12_CON		0x1e0
+#define OCX_TLK_EVENT_STAT_VC13_CON		0x1e8
+
+static int ocx_tlk_events[] = {
+	OCX_TLK_EVENT_STAT_IDLE_CNT,
+	OCX_TLK_EVENT_STAT_DATA_CNT,
+	OCX_TLK_EVENT_STAT_SYNC_CNT,
+	OCX_TLK_EVENT_STAT_RETRY_CNT,
+	OCX_TLK_EVENT_STAT_ERR_CNT,
+	OCX_TLK_EVENT_STAT_MAT0_CNT,
+	OCX_TLK_EVENT_STAT_MAT1_CNT,
+	OCX_TLK_EVENT_STAT_MAT2_CNT,
+	OCX_TLK_EVENT_STAT_MAT3_CNT,
+	OCX_TLK_EVENT_STAT_VC0_CMD,
+	OCX_TLK_EVENT_STAT_VC1_CMD,
+	OCX_TLK_EVENT_STAT_VC2_CMD,
+	OCX_TLK_EVENT_STAT_VC3_CMD,
+	OCX_TLK_EVENT_STAT_VC4_CMD,
+	OCX_TLK_EVENT_STAT_VC5_CMD,
+	OCX_TLK_EVENT_STAT_VC0_PKT,
+	OCX_TLK_EVENT_STAT_VC1_PKT,
+	OCX_TLK_EVENT_STAT_VC2_PKT,
+	OCX_TLK_EVENT_STAT_VC3_PKT,
+	OCX_TLK_EVENT_STAT_VC4_PKT,
+	OCX_TLK_EVENT_STAT_VC5_PKT,
+	OCX_TLK_EVENT_STAT_VC6_PKT,
+	OCX_TLK_EVENT_STAT_VC7_PKT,
+	OCX_TLK_EVENT_STAT_VC8_PKT,
+	OCX_TLK_EVENT_STAT_VC9_PKT,
+	OCX_TLK_EVENT_STAT_VC10_PKT,
+	OCX_TLK_EVENT_STAT_VC11_PKT,
+	OCX_TLK_EVENT_STAT_VC12_PKT,
+	OCX_TLK_EVENT_STAT_VC13_PKT,
+	OCX_TLK_EVENT_STAT_VC0_CON,
+	OCX_TLK_EVENT_STAT_VC1_CON,
+	OCX_TLK_EVENT_STAT_VC2_CON,
+	OCX_TLK_EVENT_STAT_VC3_CON,
+	OCX_TLK_EVENT_STAT_VC4_CON,
+	OCX_TLK_EVENT_STAT_VC5_CON,
+	OCX_TLK_EVENT_STAT_VC6_CON,
+	OCX_TLK_EVENT_STAT_VC7_CON,
+	OCX_TLK_EVENT_STAT_VC8_CON,
+	OCX_TLK_EVENT_STAT_VC9_CON,
+	OCX_TLK_EVENT_STAT_VC10_CON,
+	OCX_TLK_EVENT_STAT_VC11_CON,
+	OCX_TLK_EVENT_STAT_VC12_CON,
+	OCX_TLK_EVENT_STAT_VC13_CON,
+};
+
+/*
+ * The OCX devices have a single device per node, therefore picking the
+ * first device from the list is correct.
+ */
+static inline void __iomem *map_offset(struct thunder_uncore_node *node,
+				       unsigned long addr, int offset, int nr)
+{
+	struct thunder_uncore_unit *unit;
+
+	unit = list_first_entry(&node->unit_list, struct thunder_uncore_unit,
+				entry);
+	return (void __iomem *)(addr + unit->map + nr * offset);
+}
+
+static void __iomem *map_offset_ocx_tlk(struct thunder_uncore_node *node,
+					unsigned long addr, int nr)
+{
+	return (void __iomem *)map_offset(node, addr, nr, OCX_TLK_UNIT_OFFSET);
+}
+
+/*
+ * The OCX TLK counters can only be enabled/disabled as a set so we do
+ * this in pmu_enable/disable instead of start/stop.
+ */
+static void thunder_uncore_pmu_enable_ocx_tlk(struct pmu *pmu)
+{
+	struct thunder_uncore *uncore =
+		container_of(pmu, struct thunder_uncore, pmu);
+	int node = 0, i;
+
+	while (uncore->nodes[node++]) {
+		for (i = 0; i < OCX_TLK_NR_UNITS; i++) {
+			/* reset all TLK counters to zero */
+			writeb(OCX_TLK_STAT_RESET_BIT,
+			       map_offset_ocx_tlk(uncore->nodes[node],
+						  OCX_TLK_STAT_CTL, i));
+			/* enable all TLK counters */
+			writeb(OCX_TLK_STAT_ENABLE_BIT,
+			       map_offset_ocx_tlk(uncore->nodes[node],
+						  OCX_TLK_STAT_CTL, i));
+		}
+	}
+}
+
+/*
+ * The OCX TLK counters can only be enabled/disabled as a set so we do
+ * this in pmu_enable/disable instead of start/stop.
+ */
+static void thunder_uncore_pmu_disable_ocx_tlk(struct pmu *pmu)
+{
+	struct thunder_uncore *uncore =
+		container_of(pmu, struct thunder_uncore, pmu);
+	int node = 0, i;
+
+	while (uncore->nodes[node++]) {
+		for (i = 0; i < OCX_TLK_NR_UNITS; i++) {
+			/* disable all TLK counters */
+			writeb(0, map_offset_ocx_tlk(uncore->nodes[node],
+						     OCX_TLK_STAT_CTL, i));
+		}
+	}
+}
+
+/*
+ * Summarize counters across all TLK's. Different from the other uncore
+ * PMUs because all TLK's are on one PCI device.
+ */
+static void thunder_uncore_read_ocx_tlk(struct perf_event *event)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	u64 new = 0;
+	int i;
+
+	/* read counter values from all units */
+	node = get_node(hwc->config, uncore);
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		new += readq(map_offset_ocx_tlk(node, hwc->event_base, i));
+
+	local64_add(new, &hwc->prev_count);
+	local64_add(new, &event->count);
+}
+
+static void thunder_uncore_start_ocx_tlk(struct perf_event *event, int flags)
+{
+	struct thunder_uncore *uncore = to_uncore(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	struct thunder_uncore_node *node;
+	u64 new = 0;
+	int i;
+
+	/* read counter values from all units on the node */
+	node = get_node(hwc->config, uncore);
+	for (i = 0; i < OCX_TLK_NR_UNITS; i++)
+		new += readq(map_offset_ocx_tlk(node, hwc->event_base, i));
+	local64_set(&hwc->prev_count, new);
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+}
+
+static int thunder_uncore_add_ocx_tlk(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	return thunder_uncore_add(event, flags,
+				  OCX_TLK_STAT_CTL,
+				  OCX_TLK_STAT_OFFSET + ocx_tlk_events[get_id(hwc->config)]);
+}
+
+PMU_FORMAT_ATTR(event, "config:0-5");
+
+static struct attribute *thunder_ocx_tlk_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_node.attr,
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_tlk_format_group = {
+	.name = "format",
+	.attrs = thunder_ocx_tlk_format_attr,
+};
+
+static struct attribute *thunder_ocx_tlk_events_attr[] = {
+	UC_EVENT_ENTRY(idle_cnt,	0),
+	UC_EVENT_ENTRY(data_cnt,	1),
+	UC_EVENT_ENTRY(sync_cnt,	2),
+	UC_EVENT_ENTRY(retry_cnt,	3),
+	UC_EVENT_ENTRY(err_cnt,		4),
+	UC_EVENT_ENTRY(mat0_cnt,	5),
+	UC_EVENT_ENTRY(mat1_cnt,	6),
+	UC_EVENT_ENTRY(mat2_cnt,	7),
+	UC_EVENT_ENTRY(mat3_cnt,	8),
+	UC_EVENT_ENTRY(vc0_cmd,		9),
+	UC_EVENT_ENTRY(vc1_cmd,		10),
+	UC_EVENT_ENTRY(vc2_cmd,		11),
+	UC_EVENT_ENTRY(vc3_cmd,		12),
+	UC_EVENT_ENTRY(vc4_cmd,		13),
+	UC_EVENT_ENTRY(vc5_cmd,		14),
+	UC_EVENT_ENTRY(vc0_pkt,		15),
+	UC_EVENT_ENTRY(vc1_pkt,		16),
+	UC_EVENT_ENTRY(vc2_pkt,		17),
+	UC_EVENT_ENTRY(vc3_pkt,		18),
+	UC_EVENT_ENTRY(vc4_pkt,		19),
+	UC_EVENT_ENTRY(vc5_pkt,		20),
+	UC_EVENT_ENTRY(vc6_pkt,		21),
+	UC_EVENT_ENTRY(vc7_pkt,		22),
+	UC_EVENT_ENTRY(vc8_pkt,		23),
+	UC_EVENT_ENTRY(vc9_pkt,		24),
+	UC_EVENT_ENTRY(vc10_pkt,	25),
+	UC_EVENT_ENTRY(vc11_pkt,	26),
+	UC_EVENT_ENTRY(vc12_pkt,	27),
+	UC_EVENT_ENTRY(vc13_pkt,	28),
+	UC_EVENT_ENTRY(vc0_con,		29),
+	UC_EVENT_ENTRY(vc1_con,		30),
+	UC_EVENT_ENTRY(vc2_con,		31),
+	UC_EVENT_ENTRY(vc3_con,		32),
+	UC_EVENT_ENTRY(vc4_con,		33),
+	UC_EVENT_ENTRY(vc5_con,		34),
+	UC_EVENT_ENTRY(vc6_con,		35),
+	UC_EVENT_ENTRY(vc7_con,		36),
+	UC_EVENT_ENTRY(vc8_con,		37),
+	UC_EVENT_ENTRY(vc9_con,		38),
+	UC_EVENT_ENTRY(vc10_con,	39),
+	UC_EVENT_ENTRY(vc11_con,	40),
+	UC_EVENT_ENTRY(vc12_con,	41),
+	UC_EVENT_ENTRY(vc13_con,	42),
+	NULL,
+};
+
+static struct attribute_group thunder_ocx_tlk_events_group = {
+	.name = "events",
+	.attrs = thunder_ocx_tlk_events_attr,
+};
+
+static const struct attribute_group *thunder_ocx_tlk_attr_groups[] = {
+	&thunder_uncore_attr_group,
+	&thunder_ocx_tlk_format_group,
+	&thunder_ocx_tlk_events_group,
+	NULL,
+};
+
+struct pmu thunder_ocx_tlk_pmu = {
+	.name		= "thunder_ocx_tlk",
+	.task_ctx_nr    = perf_invalid_context,
+	.pmu_enable	= thunder_uncore_pmu_enable_ocx_tlk,
+	.pmu_disable	= thunder_uncore_pmu_disable_ocx_tlk,
+	.event_init	= thunder_uncore_event_init,
+	.add		= thunder_uncore_add_ocx_tlk,
+	.del		= thunder_uncore_del,
+	.start		= thunder_uncore_start_ocx_tlk,
+	.stop		= thunder_uncore_stop,
+	.read		= thunder_uncore_read_ocx_tlk,
+	.attr_groups	= thunder_ocx_tlk_attr_groups,
+};
+
+static bool event_valid(u64 config)
+{
+	if (config < ARRAY_SIZE(ocx_tlk_events))
+		return true;
+
+	return false;
+}
+
+int __init thunder_uncore_ocx_tlk_setup(void)
+{
+	int ret;
+
+	thunder_uncore_ocx_tlk = kzalloc(sizeof(*thunder_uncore_ocx_tlk),
+					 GFP_KERNEL);
+	if (!thunder_uncore_ocx_tlk) {
+		ret = -ENOMEM;
+		goto fail_nomem;
+	}
+
+	ret = thunder_uncore_setup(thunder_uncore_ocx_tlk, 0xa013,
+				   &thunder_ocx_tlk_pmu,
+				   ARRAY_SIZE(ocx_tlk_events));
+	if (ret)
+		goto fail;
+
+	thunder_uncore_ocx_tlk->event_valid = event_valid;
+	return 0;
+
+fail:
+	kfree(thunder_uncore_ocx_tlk);
+fail_nomem:
+	return ret;
+}
-- 
2.9.0.rc0.21.g7777322

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-10-29 11:55   ` Jan Glauber
@ 2016-11-08 23:50     ` Will Deacon
  -1 siblings, 0 replies; 28+ messages in thread
From: Will Deacon @ 2016-11-08 23:50 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Mark Rutland, linux-kernel, linux-arm-kernel

Hi Jan,

Thanks for posting an updated series. I have a few minor comments, which
we can hopefully address in time for 4.10.

Also, have you run the perf fuzzer with this series applied?

https://github.com/deater/perf_event_tests

(build the tests and look under the fuzzer/ directory for the tool)

On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> Provide "uncore" facilities for different non-CPU performance
> counter units.
> 
> The uncore PMUs can be found under /sys/bus/event_source/devices.
> All counters are exported via sysfs in the corresponding events
> files under the PMU directory so the perf tool can list the event names.
> 
> There are some points that are special in this implementation:
> 
> 1) The PMU detection relies on PCI device detection. If a
>    matching PCI device is found the PMU is created. The code can deal
>    with multiple units of the same type, e.g. more than one memory
>    controller.
> 
> 2) Counters are summarized across different units of the same type
>    on one NUMA node but not across NUMA nodes.
>    For instance L2C TAD 0..7 are presented as a single counter
>    (adding the values from TAD 0 to 7). Although losing the ability
>    to read a single value the merged values are easier to use.
> 
> 3) The counters are not CPU related. A random CPU is picked regardless
>    of the NUMA node. There is a small performance penalty for accessing
>    counters on a remote note but reading a performance counter is a
>    slow operation anyway.
> 
> Signed-off-by: Jan Glauber <jglauber@cavium.com>
> ---
>  drivers/perf/Kconfig                |  13 ++
>  drivers/perf/Makefile               |   1 +
>  drivers/perf/uncore/Makefile        |   1 +
>  drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
>  drivers/perf/uncore/uncore_cavium.h |  71 ++++++++

We already have "uncore" PMUs under drivers/perf, so I'd prefer that we
renamed this a bit to reflect better what's going on. How about:

  drivers/perf/cavium/

and then

  drivers/perf/cavium/uncore_thunder.[ch]

?

>  include/linux/cpuhotplug.h          |   1 +
>  6 files changed, 438 insertions(+)
>  create mode 100644 drivers/perf/uncore/Makefile
>  create mode 100644 drivers/perf/uncore/uncore_cavium.c
>  create mode 100644 drivers/perf/uncore/uncore_cavium.h
> 
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index 4d5c5f9..3266c87 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -19,4 +19,17 @@ config XGENE_PMU
>          help
>            Say y if you want to use APM X-Gene SoC performance monitors.
>  
> +config UNCORE_PMU
> +	bool

This isn't needed.

> +
> +config UNCORE_PMU_CAVIUM
> +	depends on PERF_EVENTS && NUMA && ARM64
> +	bool "Cavium uncore PMU support"

Please mentioned Thunder somewhere, since that's the SoC being supported.

> +	select UNCORE_PMU
> +	default y
> +	help
> +	  Say y if you want to access performance counters of subsystems
> +	  on a Cavium SOC like cache controller, memory controller or
> +	  processor interconnect.
> +
>  endmenu
> diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> index b116e98..d6c02c9 100644
> --- a/drivers/perf/Makefile
> +++ b/drivers/perf/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_ARM_PMU) += arm_pmu.o
>  obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
> +obj-y += uncore/
> diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
> new file mode 100644
> index 0000000..6130e18
> --- /dev/null
> +++ b/drivers/perf/uncore/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
> diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> new file mode 100644
> index 0000000..a7b4277
> --- /dev/null
> +++ b/drivers/perf/uncore/uncore_cavium.c
> @@ -0,0 +1,351 @@
> +/*
> + * Cavium Thunder uncore PMU support.
> + *
> + * Copyright (C) 2015,2016 Cavium Inc.
> + * Author: Jan Glauber <jan.glauber@cavium.com>
> + */
> +
> +#include <linux/cpufeature.h>
> +#include <linux/numa.h>
> +#include <linux/slab.h>
> +
> +#include "uncore_cavium.h"
> +
> +/*
> + * Some notes about the various counters supported by this "uncore" PMU
> + * and the design:
> + *
> + * All counters are 64 bit long.
> + * There are no overflow interrupts.
> + * Counters are summarized per node/socket.
> + * Most devices appear as separate PCI devices per socket with the exception
> + * of OCX TLK which appears as one PCI device per socket and contains several
> + * units with counters that are merged.
> + * Some counters are selected via a control register (L2C TAD) and read by
> + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> + * one dedicated counter per event.
> + * Some counters are not stoppable (L2C CBC & LMC).
> + * Some counters are read-only (LMC).
> + * All counters belong to PCI devices, the devices may have additional
> + * drivers but we assume we are the only user of the counter registers.
> + * We map the whole PCI BAR so we must be careful to forbid access to
> + * addresses that contain neither counters nor counter control registers.
> + */
> +
> +void thunder_uncore_read(struct perf_event *event)
> +{

Rather than have a bunch of global symbols that are called from the
individual drivers, why don't you pass a struct of function pointers to
their respective init functions and keep the internals private?

> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore_unit *unit;
> +	u64 prev, delta, new = 0;
> +
> +	node = get_node(hwc->config, uncore);
> +
> +	/* read counter values from all units on the node */
> +	list_for_each_entry(unit, &node->unit_list, entry)
> +		new += readq(hwc->event_base + unit->map);
> +
> +	prev = local64_read(&hwc->prev_count);
> +	local64_set(&hwc->prev_count, new);
> +	delta = new - prev;
> +	local64_add(delta, &event->count);
> +}
> +
> +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> +		       u64 event_base)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int id;
> +
> +	node = get_node(hwc->config, uncore);
> +	id = get_id(hwc->config);
> +
> +	if (!cmpxchg(&node->events[id], NULL, event))
> +		hwc->idx = id;

Does this need to be a full-fat cmpxchg? Who are you racing with?

> +
> +	if (hwc->idx == -1)
> +		return -EBUSY;

This would be much clearer as an else statement after the cmpxchg.

> +
> +	hwc->config_base = config_base;
> +	hwc->event_base = event_base;
> +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> +
> +	if (flags & PERF_EF_START)
> +		uncore->pmu.start(event, PERF_EF_RELOAD);
> +
> +	return 0;
> +}
> +
> +void thunder_uncore_del(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int i;
> +
> +	event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +	/*
> +	 * For programmable counters we need to check where we installed it.
> +	 * To keep this function generic always test the more complicated
> +	 * case (free running counters won't need the loop).
> +	 */
> +	node = get_node(hwc->config, uncore);
> +	for (i = 0; i < node->num_counters; i++) {
> +		if (cmpxchg(&node->events[i], event, NULL) == event)
> +			break;
> +	}
> +	hwc->idx = -1;
> +}
> +
> +void thunder_uncore_start(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore_unit *unit;
> +	u64 new = 0;
> +
> +	/* read counter values from all units on the node */
> +	node = get_node(hwc->config, uncore);
> +	list_for_each_entry(unit, &node->unit_list, entry)
> +		new += readq(hwc->event_base + unit->map);
> +	local64_set(&hwc->prev_count, new);
> +
> +	hwc->state = 0;
> +	perf_event_update_userpage(event);
> +}
> +
> +void thunder_uncore_stop(struct perf_event *event, int flags)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	hwc->state |= PERF_HES_STOPPED;
> +
> +	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
> +		thunder_uncore_read(event);
> +		hwc->state |= PERF_HES_UPTODATE;
> +	}
> +}
> +
> +int thunder_uncore_event_init(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore *uncore;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* we do not support sampling */
> +	if (is_sampling_event(event))
> +		return -EINVAL;
> +
> +	/* counters do not have these bits */
> +	if (event->attr.exclude_user	||
> +	    event->attr.exclude_kernel	||
> +	    event->attr.exclude_host	||
> +	    event->attr.exclude_guest	||
> +	    event->attr.exclude_hv	||
> +	    event->attr.exclude_idle)
> +		return -EINVAL;
> +
> +	uncore = to_uncore(event->pmu);
> +	if (!uncore)
> +		return -ENODEV;
> +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> +		return -EINVAL;
> +
> +	/* check NUMA node */
> +	node = get_node(event->attr.config, uncore);
> +	if (!node) {
> +		pr_debug("Invalid NUMA node selected\n");
> +		return -EINVAL;
> +	}
> +
> +	hwc->config = event->attr.config;
> +	hwc->idx = -1;
> +	return 0;
> +}
> +
> +static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
> +						struct device_attribute *attr,
> +						char *buf)
> +{
> +	struct pmu *pmu = dev_get_drvdata(dev);
> +	struct thunder_uncore *uncore =
> +		container_of(pmu, struct thunder_uncore, pmu);
> +
> +	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
> +}
> +static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
> +
> +static struct attribute *thunder_uncore_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +struct attribute_group thunder_uncore_attr_group = {
> +	.attrs = thunder_uncore_attrs,
> +};
> +
> +ssize_t thunder_events_sysfs_show(struct device *dev,
> +				  struct device_attribute *attr,
> +				  char *page)
> +{
> +	struct perf_pmu_events_attr *pmu_attr =
> +		container_of(attr, struct perf_pmu_events_attr, attr);
> +
> +	if (pmu_attr->event_str)
> +		return sprintf(page, "%s", pmu_attr->event_str);
> +
> +	return 0;
> +}
> +
> +/* node attribute depending on number of NUMA nodes */
> +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> +			 char *page)
> +{
> +	if (NODES_SHIFT)
> +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);

If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
userspace.

> +	else
> +		return sprintf(page, "config:16\n");
> +}
> +
> +struct device_attribute format_attr_node = __ATTR_RO(node);
> +
> +/*
> + * Thunder uncore events are independent from CPUs. Provide a cpumask
> + * nevertheless to prevent perf from adding the event per-cpu and just
> + * set the mask to one online CPU. Use the same cpumask for all uncore
> + * devices.
> + *
> + * There is a performance penalty for accessing a device from a CPU on
> + * another socket, but we do not care (yet).
> + */
> +static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
> +{
> +	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);

Why _safe?

> +	int new_cpu;
> +
> +	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
> +		return 0;
> +	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
> +	if (new_cpu >= nr_cpu_ids)
> +		return 0;
> +	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
> +	cpumask_set_cpu(new_cpu, &uncore->active_mask);
> +	return 0;
> +}
> +
> +static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
> +						      int node_id, int counters)
> +{
> +	struct thunder_uncore_node *node;
> +
> +	node = kzalloc(sizeof(*node), GFP_KERNEL);
> +	if (!node)
> +		return NULL;
> +	node->num_counters = counters;
> +	INIT_LIST_HEAD(&node->unit_list);
> +	return node;
> +}
> +
> +int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
> +				struct pmu *pmu, int counters)
> +{
> +	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
> +	struct thunder_uncore_unit  *unit, *tmp;
> +	struct thunder_uncore_node *node;
> +	struct pci_dev *pdev = NULL;
> +	int ret, node_id, found = 0;
> +
> +	/* detect PCI devices */
> +	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {

Redundant brackets?

> +		if (!pdev)
> +			break;

Redundant check?

> +		node_id = dev_to_node(&pdev->dev);
> +
> +		/* allocate node if necessary */
> +		if (!uncore->nodes[node_id])
> +			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
> +
> +		node = uncore->nodes[node_id];
> +		if (!node) {
> +			ret = -ENOMEM;
> +			goto fail;
> +		}
> +
> +		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
> +		if (!unit) {
> +			ret = -ENOMEM;
> +			goto fail;
> +		}
> +
> +		unit->pdev = pdev;
> +		unit->map = ioremap(pci_resource_start(pdev, 0),
> +				    pci_resource_len(pdev, 0));
> +		list_add(&unit->entry, &node->unit_list);
> +		node->nr_units++;
> +		found++;
> +	}
> +
> +	if (!found)
> +		return -ENODEV;
> +
> +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> +                                         &uncore->node);
> +
> +	/*
> +	 * perf PMU is CPU dependent in difference to our uncore devices.
> +	 * Just pick a CPU and migrate away if it goes offline.
> +	 */
> +	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
> +
> +	uncore->pmu = *pmu;
> +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> +	if (ret)
> +		goto fail;
> +
> +	return 0;
> +
> +fail:
> +	node_id = 0;
> +	while (uncore->nodes[node_id]) {
> +		node = uncore->nodes[node_id];
> +
> +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {

Why do you need the _safe variant?

Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-08 23:50     ` Will Deacon
  0 siblings, 0 replies; 28+ messages in thread
From: Will Deacon @ 2016-11-08 23:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jan,

Thanks for posting an updated series. I have a few minor comments, which
we can hopefully address in time for 4.10.

Also, have you run the perf fuzzer with this series applied?

https://github.com/deater/perf_event_tests

(build the tests and look under the fuzzer/ directory for the tool)

On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> Provide "uncore" facilities for different non-CPU performance
> counter units.
> 
> The uncore PMUs can be found under /sys/bus/event_source/devices.
> All counters are exported via sysfs in the corresponding events
> files under the PMU directory so the perf tool can list the event names.
> 
> There are some points that are special in this implementation:
> 
> 1) The PMU detection relies on PCI device detection. If a
>    matching PCI device is found the PMU is created. The code can deal
>    with multiple units of the same type, e.g. more than one memory
>    controller.
> 
> 2) Counters are summarized across different units of the same type
>    on one NUMA node but not across NUMA nodes.
>    For instance L2C TAD 0..7 are presented as a single counter
>    (adding the values from TAD 0 to 7). Although losing the ability
>    to read a single value the merged values are easier to use.
> 
> 3) The counters are not CPU related. A random CPU is picked regardless
>    of the NUMA node. There is a small performance penalty for accessing
>    counters on a remote note but reading a performance counter is a
>    slow operation anyway.
> 
> Signed-off-by: Jan Glauber <jglauber@cavium.com>
> ---
>  drivers/perf/Kconfig                |  13 ++
>  drivers/perf/Makefile               |   1 +
>  drivers/perf/uncore/Makefile        |   1 +
>  drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
>  drivers/perf/uncore/uncore_cavium.h |  71 ++++++++

We already have "uncore" PMUs under drivers/perf, so I'd prefer that we
renamed this a bit to reflect better what's going on. How about:

  drivers/perf/cavium/

and then

  drivers/perf/cavium/uncore_thunder.[ch]

?

>  include/linux/cpuhotplug.h          |   1 +
>  6 files changed, 438 insertions(+)
>  create mode 100644 drivers/perf/uncore/Makefile
>  create mode 100644 drivers/perf/uncore/uncore_cavium.c
>  create mode 100644 drivers/perf/uncore/uncore_cavium.h
> 
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index 4d5c5f9..3266c87 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -19,4 +19,17 @@ config XGENE_PMU
>          help
>            Say y if you want to use APM X-Gene SoC performance monitors.
>  
> +config UNCORE_PMU
> +	bool

This isn't needed.

> +
> +config UNCORE_PMU_CAVIUM
> +	depends on PERF_EVENTS && NUMA && ARM64
> +	bool "Cavium uncore PMU support"

Please mentioned Thunder somewhere, since that's the SoC being supported.

> +	select UNCORE_PMU
> +	default y
> +	help
> +	  Say y if you want to access performance counters of subsystems
> +	  on a Cavium SOC like cache controller, memory controller or
> +	  processor interconnect.
> +
>  endmenu
> diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> index b116e98..d6c02c9 100644
> --- a/drivers/perf/Makefile
> +++ b/drivers/perf/Makefile
> @@ -1,2 +1,3 @@
>  obj-$(CONFIG_ARM_PMU) += arm_pmu.o
>  obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
> +obj-y += uncore/
> diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
> new file mode 100644
> index 0000000..6130e18
> --- /dev/null
> +++ b/drivers/perf/uncore/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
> diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> new file mode 100644
> index 0000000..a7b4277
> --- /dev/null
> +++ b/drivers/perf/uncore/uncore_cavium.c
> @@ -0,0 +1,351 @@
> +/*
> + * Cavium Thunder uncore PMU support.
> + *
> + * Copyright (C) 2015,2016 Cavium Inc.
> + * Author: Jan Glauber <jan.glauber@cavium.com>
> + */
> +
> +#include <linux/cpufeature.h>
> +#include <linux/numa.h>
> +#include <linux/slab.h>
> +
> +#include "uncore_cavium.h"
> +
> +/*
> + * Some notes about the various counters supported by this "uncore" PMU
> + * and the design:
> + *
> + * All counters are 64 bit long.
> + * There are no overflow interrupts.
> + * Counters are summarized per node/socket.
> + * Most devices appear as separate PCI devices per socket with the exception
> + * of OCX TLK which appears as one PCI device per socket and contains several
> + * units with counters that are merged.
> + * Some counters are selected via a control register (L2C TAD) and read by
> + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> + * one dedicated counter per event.
> + * Some counters are not stoppable (L2C CBC & LMC).
> + * Some counters are read-only (LMC).
> + * All counters belong to PCI devices, the devices may have additional
> + * drivers but we assume we are the only user of the counter registers.
> + * We map the whole PCI BAR so we must be careful to forbid access to
> + * addresses that contain neither counters nor counter control registers.
> + */
> +
> +void thunder_uncore_read(struct perf_event *event)
> +{

Rather than have a bunch of global symbols that are called from the
individual drivers, why don't you pass a struct of function pointers to
their respective init functions and keep the internals private?

> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore_unit *unit;
> +	u64 prev, delta, new = 0;
> +
> +	node = get_node(hwc->config, uncore);
> +
> +	/* read counter values from all units on the node */
> +	list_for_each_entry(unit, &node->unit_list, entry)
> +		new += readq(hwc->event_base + unit->map);
> +
> +	prev = local64_read(&hwc->prev_count);
> +	local64_set(&hwc->prev_count, new);
> +	delta = new - prev;
> +	local64_add(delta, &event->count);
> +}
> +
> +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> +		       u64 event_base)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int id;
> +
> +	node = get_node(hwc->config, uncore);
> +	id = get_id(hwc->config);
> +
> +	if (!cmpxchg(&node->events[id], NULL, event))
> +		hwc->idx = id;

Does this need to be a full-fat cmpxchg? Who are you racing with?

> +
> +	if (hwc->idx == -1)
> +		return -EBUSY;

This would be much clearer as an else statement after the cmpxchg.

> +
> +	hwc->config_base = config_base;
> +	hwc->event_base = event_base;
> +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> +
> +	if (flags & PERF_EF_START)
> +		uncore->pmu.start(event, PERF_EF_RELOAD);
> +
> +	return 0;
> +}
> +
> +void thunder_uncore_del(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int i;
> +
> +	event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +	/*
> +	 * For programmable counters we need to check where we installed it.
> +	 * To keep this function generic always test the more complicated
> +	 * case (free running counters won't need the loop).
> +	 */
> +	node = get_node(hwc->config, uncore);
> +	for (i = 0; i < node->num_counters; i++) {
> +		if (cmpxchg(&node->events[i], event, NULL) == event)
> +			break;
> +	}
> +	hwc->idx = -1;
> +}
> +
> +void thunder_uncore_start(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore_unit *unit;
> +	u64 new = 0;
> +
> +	/* read counter values from all units on the node */
> +	node = get_node(hwc->config, uncore);
> +	list_for_each_entry(unit, &node->unit_list, entry)
> +		new += readq(hwc->event_base + unit->map);
> +	local64_set(&hwc->prev_count, new);
> +
> +	hwc->state = 0;
> +	perf_event_update_userpage(event);
> +}
> +
> +void thunder_uncore_stop(struct perf_event *event, int flags)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +
> +	hwc->state |= PERF_HES_STOPPED;
> +
> +	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
> +		thunder_uncore_read(event);
> +		hwc->state |= PERF_HES_UPTODATE;
> +	}
> +}
> +
> +int thunder_uncore_event_init(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore *uncore;
> +
> +	if (event->attr.type != event->pmu->type)
> +		return -ENOENT;
> +
> +	/* we do not support sampling */
> +	if (is_sampling_event(event))
> +		return -EINVAL;
> +
> +	/* counters do not have these bits */
> +	if (event->attr.exclude_user	||
> +	    event->attr.exclude_kernel	||
> +	    event->attr.exclude_host	||
> +	    event->attr.exclude_guest	||
> +	    event->attr.exclude_hv	||
> +	    event->attr.exclude_idle)
> +		return -EINVAL;
> +
> +	uncore = to_uncore(event->pmu);
> +	if (!uncore)
> +		return -ENODEV;
> +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> +		return -EINVAL;
> +
> +	/* check NUMA node */
> +	node = get_node(event->attr.config, uncore);
> +	if (!node) {
> +		pr_debug("Invalid NUMA node selected\n");
> +		return -EINVAL;
> +	}
> +
> +	hwc->config = event->attr.config;
> +	hwc->idx = -1;
> +	return 0;
> +}
> +
> +static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
> +						struct device_attribute *attr,
> +						char *buf)
> +{
> +	struct pmu *pmu = dev_get_drvdata(dev);
> +	struct thunder_uncore *uncore =
> +		container_of(pmu, struct thunder_uncore, pmu);
> +
> +	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
> +}
> +static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
> +
> +static struct attribute *thunder_uncore_attrs[] = {
> +	&dev_attr_cpumask.attr,
> +	NULL,
> +};
> +
> +struct attribute_group thunder_uncore_attr_group = {
> +	.attrs = thunder_uncore_attrs,
> +};
> +
> +ssize_t thunder_events_sysfs_show(struct device *dev,
> +				  struct device_attribute *attr,
> +				  char *page)
> +{
> +	struct perf_pmu_events_attr *pmu_attr =
> +		container_of(attr, struct perf_pmu_events_attr, attr);
> +
> +	if (pmu_attr->event_str)
> +		return sprintf(page, "%s", pmu_attr->event_str);
> +
> +	return 0;
> +}
> +
> +/* node attribute depending on number of NUMA nodes */
> +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> +			 char *page)
> +{
> +	if (NODES_SHIFT)
> +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);

If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
userspace.

> +	else
> +		return sprintf(page, "config:16\n");
> +}
> +
> +struct device_attribute format_attr_node = __ATTR_RO(node);
> +
> +/*
> + * Thunder uncore events are independent from CPUs. Provide a cpumask
> + * nevertheless to prevent perf from adding the event per-cpu and just
> + * set the mask to one online CPU. Use the same cpumask for all uncore
> + * devices.
> + *
> + * There is a performance penalty for accessing a device from a CPU on
> + * another socket, but we do not care (yet).
> + */
> +static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
> +{
> +	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);

Why _safe?

> +	int new_cpu;
> +
> +	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
> +		return 0;
> +	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
> +	if (new_cpu >= nr_cpu_ids)
> +		return 0;
> +	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
> +	cpumask_set_cpu(new_cpu, &uncore->active_mask);
> +	return 0;
> +}
> +
> +static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
> +						      int node_id, int counters)
> +{
> +	struct thunder_uncore_node *node;
> +
> +	node = kzalloc(sizeof(*node), GFP_KERNEL);
> +	if (!node)
> +		return NULL;
> +	node->num_counters = counters;
> +	INIT_LIST_HEAD(&node->unit_list);
> +	return node;
> +}
> +
> +int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
> +				struct pmu *pmu, int counters)
> +{
> +	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
> +	struct thunder_uncore_unit  *unit, *tmp;
> +	struct thunder_uncore_node *node;
> +	struct pci_dev *pdev = NULL;
> +	int ret, node_id, found = 0;
> +
> +	/* detect PCI devices */
> +	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {

Redundant brackets?

> +		if (!pdev)
> +			break;

Redundant check?

> +		node_id = dev_to_node(&pdev->dev);
> +
> +		/* allocate node if necessary */
> +		if (!uncore->nodes[node_id])
> +			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
> +
> +		node = uncore->nodes[node_id];
> +		if (!node) {
> +			ret = -ENOMEM;
> +			goto fail;
> +		}
> +
> +		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
> +		if (!unit) {
> +			ret = -ENOMEM;
> +			goto fail;
> +		}
> +
> +		unit->pdev = pdev;
> +		unit->map = ioremap(pci_resource_start(pdev, 0),
> +				    pci_resource_len(pdev, 0));
> +		list_add(&unit->entry, &node->unit_list);
> +		node->nr_units++;
> +		found++;
> +	}
> +
> +	if (!found)
> +		return -ENODEV;
> +
> +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> +                                         &uncore->node);
> +
> +	/*
> +	 * perf PMU is CPU dependent in difference to our uncore devices.
> +	 * Just pick a CPU and migrate away if it goes offline.
> +	 */
> +	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
> +
> +	uncore->pmu = *pmu;
> +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> +	if (ret)
> +		goto fail;
> +
> +	return 0;
> +
> +fail:
> +	node_id = 0;
> +	while (uncore->nodes[node_id]) {
> +		node = uncore->nodes[node_id];
> +
> +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {

Why do you need the _safe variant?

Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-10-29 11:55   ` Jan Glauber
@ 2016-11-10 16:54     ` Mark Rutland
  -1 siblings, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2016-11-10 16:54 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

Hi Jan,

Apologies for the delay in getting to this.

On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> new file mode 100644
> index 0000000..a7b4277
> --- /dev/null
> +++ b/drivers/perf/uncore/uncore_cavium.c
> @@ -0,0 +1,351 @@
> +/*
> + * Cavium Thunder uncore PMU support.
> + *
> + * Copyright (C) 2015,2016 Cavium Inc.
> + * Author: Jan Glauber <jan.glauber@cavium.com>
> + */
> +
> +#include <linux/cpufeature.h>
> +#include <linux/numa.h>
> +#include <linux/slab.h>

I believe the following includes are necessary for APIs and/or data
explicitly referenced by the driver code:

#include <linux/atomic.h>
#include <linux/cpuhotplug.h>
#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/pci.h>
#include <linux/perf_event.h>
#include <linux/printk.h>
#include <linux/smp.h>
#include <linux/sysfs.h>
#include <linux/types.h>

#include <asm/local64.h>

... please add those here.

[...]

> +/*
> + * Some notes about the various counters supported by this "uncore" PMU
> + * and the design:
> + *
> + * All counters are 64 bit long.
> + * There are no overflow interrupts.
> + * Counters are summarized per node/socket.
> + * Most devices appear as separate PCI devices per socket with the exception
> + * of OCX TLK which appears as one PCI device per socket and contains several
> + * units with counters that are merged.

As a general note, as I commented on the QC L2 PMU driver [1,2], we need
to figure out if we should be aggregating physical PMUs or not.

Judging by subsequent patches, each unit has individual counters and
controls, and thus we cannot atomically read/write counters or controls
across them. As such, I do not think we should aggregate them, and
should expose them separately to userspace.

That will simplify a number of things (e.g. the CPU migration code no
longer has to iterate over a list of units).

> + * Some counters are selected via a control register (L2C TAD) and read by
> + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> + * one dedicated counter per event.
> + * Some counters are not stoppable (L2C CBC & LMC).
> + * Some counters are read-only (LMC).
> + * All counters belong to PCI devices, the devices may have additional
> + * drivers but we assume we are the only user of the counter registers.
> + * We map the whole PCI BAR so we must be careful to forbid access to
> + * addresses that contain neither counters nor counter control registers.
> + */
> +
> +void thunder_uncore_read(struct perf_event *event)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore_unit *unit;
> +	u64 prev, delta, new = 0;
> +
> +	node = get_node(hwc->config, uncore);
> +
> +	/* read counter values from all units on the node */
> +	list_for_each_entry(unit, &node->unit_list, entry)
> +		new += readq(hwc->event_base + unit->map);
> +
> +	prev = local64_read(&hwc->prev_count);
> +	local64_set(&hwc->prev_count, new);
> +	delta = new - prev;
> +	local64_add(delta, &event->count);
> +}
> +
> +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> +		       u64 event_base)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int id;
> +
> +	node = get_node(hwc->config, uncore);
> +	id = get_id(hwc->config);
> +
> +	if (!cmpxchg(&node->events[id], NULL, event))
> +		hwc->idx = id;

Judging by thunder_uncore_event_init() and get_id(), the specific
counter to use is chosen by the user, rather than allocated as
necessary. Yet the block comment before thunder_uncore_read() said only
some events have a dedicated counter.

This does not seem right. Why are we not choosing a relevant counter
dynamically?

As Will commented, we shouldn't need a full-barrier cmpxchg() here; the
pmu::{add,del} are serialised by the core perf code as ctx->lock has to
be held (and we have no interrupt to worry about). If we want to use
cmpxchg() for convenience, it can be a cmpxchg_relaxed().

> +	if (hwc->idx == -1)
> +		return -EBUSY;
> +
> +	hwc->config_base = config_base;
> +	hwc->event_base = event_base;
> +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> +
> +	if (flags & PERF_EF_START)
> +		uncore->pmu.start(event, PERF_EF_RELOAD);
> +
> +	return 0;
> +}
> +
> +void thunder_uncore_del(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int i;
> +
> +	event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +	/*
> +	 * For programmable counters we need to check where we installed it.
> +	 * To keep this function generic always test the more complicated
> +	 * case (free running counters won't need the loop).
> +	 */
> +	node = get_node(hwc->config, uncore);
> +	for (i = 0; i < node->num_counters; i++) {
> +		if (cmpxchg(&node->events[i], event, NULL) == event)
> +			break;

Likewise, this can be cmpxchg_relaxed().

[...]

> +int thunder_uncore_event_init(struct perf_event *event)
> +{

> +	uncore = to_uncore(event->pmu);
> +	if (!uncore)
> +		return -ENODEV;

Given to_uncore() uses container_of(), we can lose the check here; the
result cannot be NULL.

> +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> +		return -EINVAL;

Judging by the header, it looks like the node gets dropped in the high
bits. I'm not sure it makes sense to encode that in the user ABI given
the aggregation comments above.

In x86 uncore PMU drivers, one cpu per node is exposed in the cpumask,
and that's how they target nodes. We should either do that, or have
completely separate instances.

Either way, that will also remove the need for exposing the varying
NODE_SHIFT under sysfs.

> +
> +	/* check NUMA node */
> +	node = get_node(event->attr.config, uncore);
> +	if (!node) {
> +		pr_debug("Invalid NUMA node selected\n");
> +		return -EINVAL;
> +	}

... and this to, since the node will either be implicit in the cpu
performing the monitoring, or in the PMU instance the event was
requested from.

> +
> +	hwc->config = event->attr.config;
> +	hwc->idx = -1;
> +	return 0;
> +}

I believe that we should also check that the leader (and siblings) are compatible.

Something like l2x0_pmu_group_is_valid in arch/arm/mm/cache-l2x0-pmu.c.

We also need to ensure that the events in a group are all on the same
CPU (the one exposed via the cpumask). The l2x0 PMU also does this in
its event_init path.

[...]

> +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> +                                         &uncore->node);

> +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> +	if (ret)
> +		goto fail;

> +fail:
> +	node_id = 0;
> +	while (uncore->nodes[node_id]) {
> +		node = uncore->nodes[node_id];
> +
> +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
> +			if (unit->pdev) {
> +				if (unit->map)
> +					iounmap(unit->map);
> +				pci_dev_put(unit->pdev);
> +			}
> +			kfree(unit);
> +		}
> +		kfree(uncore->nodes[node_id]);
> +		node_id++;
> +	}
> +	return ret;
> +}

Shouldn't we remove the instance from the cpuhp state machine in the
failure path?

[...]

> diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
> new file mode 100644
> index 0000000..b5d64b5
> --- /dev/null
> +++ b/drivers/perf/uncore/uncore_cavium.h
> @@ -0,0 +1,71 @@
> +#include <linux/io.h>
> +#include <linux/list.h>
> +#include <linux/pci.h>
> +#include <linux/perf_event.h>

I believe this header also needs:

#include <linux/cpumask.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
#include <linux/sysfs.h>
#include <linux/types.h>

> +
> +#undef pr_fmt
> +#define pr_fmt(fmt)     "thunderx_uncore: " fmt

IIRC this needs to be set before including <linux/printk.h>. Does this
work reliably, given that printk.h is likely included first?

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/466764.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/466768.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-10 16:54     ` Mark Rutland
  0 siblings, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2016-11-10 16:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jan,

Apologies for the delay in getting to this.

On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> new file mode 100644
> index 0000000..a7b4277
> --- /dev/null
> +++ b/drivers/perf/uncore/uncore_cavium.c
> @@ -0,0 +1,351 @@
> +/*
> + * Cavium Thunder uncore PMU support.
> + *
> + * Copyright (C) 2015,2016 Cavium Inc.
> + * Author: Jan Glauber <jan.glauber@cavium.com>
> + */
> +
> +#include <linux/cpufeature.h>
> +#include <linux/numa.h>
> +#include <linux/slab.h>

I believe the following includes are necessary for APIs and/or data
explicitly referenced by the driver code:

#include <linux/atomic.h>
#include <linux/cpuhotplug.h>
#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/pci.h>
#include <linux/perf_event.h>
#include <linux/printk.h>
#include <linux/smp.h>
#include <linux/sysfs.h>
#include <linux/types.h>

#include <asm/local64.h>

... please add those here.

[...]

> +/*
> + * Some notes about the various counters supported by this "uncore" PMU
> + * and the design:
> + *
> + * All counters are 64 bit long.
> + * There are no overflow interrupts.
> + * Counters are summarized per node/socket.
> + * Most devices appear as separate PCI devices per socket with the exception
> + * of OCX TLK which appears as one PCI device per socket and contains several
> + * units with counters that are merged.

As a general note, as I commented on the QC L2 PMU driver [1,2], we need
to figure out if we should be aggregating physical PMUs or not.

Judging by subsequent patches, each unit has individual counters and
controls, and thus we cannot atomically read/write counters or controls
across them. As such, I do not think we should aggregate them, and
should expose them separately to userspace.

That will simplify a number of things (e.g. the CPU migration code no
longer has to iterate over a list of units).

> + * Some counters are selected via a control register (L2C TAD) and read by
> + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> + * one dedicated counter per event.
> + * Some counters are not stoppable (L2C CBC & LMC).
> + * Some counters are read-only (LMC).
> + * All counters belong to PCI devices, the devices may have additional
> + * drivers but we assume we are the only user of the counter registers.
> + * We map the whole PCI BAR so we must be careful to forbid access to
> + * addresses that contain neither counters nor counter control registers.
> + */
> +
> +void thunder_uncore_read(struct perf_event *event)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	struct thunder_uncore_unit *unit;
> +	u64 prev, delta, new = 0;
> +
> +	node = get_node(hwc->config, uncore);
> +
> +	/* read counter values from all units on the node */
> +	list_for_each_entry(unit, &node->unit_list, entry)
> +		new += readq(hwc->event_base + unit->map);
> +
> +	prev = local64_read(&hwc->prev_count);
> +	local64_set(&hwc->prev_count, new);
> +	delta = new - prev;
> +	local64_add(delta, &event->count);
> +}
> +
> +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> +		       u64 event_base)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int id;
> +
> +	node = get_node(hwc->config, uncore);
> +	id = get_id(hwc->config);
> +
> +	if (!cmpxchg(&node->events[id], NULL, event))
> +		hwc->idx = id;

Judging by thunder_uncore_event_init() and get_id(), the specific
counter to use is chosen by the user, rather than allocated as
necessary. Yet the block comment before thunder_uncore_read() said only
some events have a dedicated counter.

This does not seem right. Why are we not choosing a relevant counter
dynamically?

As Will commented, we shouldn't need a full-barrier cmpxchg() here; the
pmu::{add,del} are serialised by the core perf code as ctx->lock has to
be held (and we have no interrupt to worry about). If we want to use
cmpxchg() for convenience, it can be a cmpxchg_relaxed().

> +	if (hwc->idx == -1)
> +		return -EBUSY;
> +
> +	hwc->config_base = config_base;
> +	hwc->event_base = event_base;
> +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> +
> +	if (flags & PERF_EF_START)
> +		uncore->pmu.start(event, PERF_EF_RELOAD);
> +
> +	return 0;
> +}
> +
> +void thunder_uncore_del(struct perf_event *event, int flags)
> +{
> +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct thunder_uncore_node *node;
> +	int i;
> +
> +	event->pmu->stop(event, PERF_EF_UPDATE);
> +
> +	/*
> +	 * For programmable counters we need to check where we installed it.
> +	 * To keep this function generic always test the more complicated
> +	 * case (free running counters won't need the loop).
> +	 */
> +	node = get_node(hwc->config, uncore);
> +	for (i = 0; i < node->num_counters; i++) {
> +		if (cmpxchg(&node->events[i], event, NULL) == event)
> +			break;

Likewise, this can be cmpxchg_relaxed().

[...]

> +int thunder_uncore_event_init(struct perf_event *event)
> +{

> +	uncore = to_uncore(event->pmu);
> +	if (!uncore)
> +		return -ENODEV;

Given to_uncore() uses container_of(), we can lose the check here; the
result cannot be NULL.

> +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> +		return -EINVAL;

Judging by the header, it looks like the node gets dropped in the high
bits. I'm not sure it makes sense to encode that in the user ABI given
the aggregation comments above.

In x86 uncore PMU drivers, one cpu per node is exposed in the cpumask,
and that's how they target nodes. We should either do that, or have
completely separate instances.

Either way, that will also remove the need for exposing the varying
NODE_SHIFT under sysfs.

> +
> +	/* check NUMA node */
> +	node = get_node(event->attr.config, uncore);
> +	if (!node) {
> +		pr_debug("Invalid NUMA node selected\n");
> +		return -EINVAL;
> +	}

... and this to, since the node will either be implicit in the cpu
performing the monitoring, or in the PMU instance the event was
requested from.

> +
> +	hwc->config = event->attr.config;
> +	hwc->idx = -1;
> +	return 0;
> +}

I believe that we should also check that the leader (and siblings) are compatible.

Something like l2x0_pmu_group_is_valid in arch/arm/mm/cache-l2x0-pmu.c.

We also need to ensure that the events in a group are all on the same
CPU (the one exposed via the cpumask). The l2x0 PMU also does this in
its event_init path.

[...]

> +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> +                                         &uncore->node);

> +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> +	if (ret)
> +		goto fail;

> +fail:
> +	node_id = 0;
> +	while (uncore->nodes[node_id]) {
> +		node = uncore->nodes[node_id];
> +
> +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
> +			if (unit->pdev) {
> +				if (unit->map)
> +					iounmap(unit->map);
> +				pci_dev_put(unit->pdev);
> +			}
> +			kfree(unit);
> +		}
> +		kfree(uncore->nodes[node_id]);
> +		node_id++;
> +	}
> +	return ret;
> +}

Shouldn't we remove the instance from the cpuhp state machine in the
failure path?

[...]

> diff --git a/drivers/perf/uncore/uncore_cavium.h b/drivers/perf/uncore/uncore_cavium.h
> new file mode 100644
> index 0000000..b5d64b5
> --- /dev/null
> +++ b/drivers/perf/uncore/uncore_cavium.h
> @@ -0,0 +1,71 @@
> +#include <linux/io.h>
> +#include <linux/list.h>
> +#include <linux/pci.h>
> +#include <linux/perf_event.h>

I believe this header also needs:

#include <linux/cpumask.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
#include <linux/sysfs.h>
#include <linux/types.h>

> +
> +#undef pr_fmt
> +#define pr_fmt(fmt)     "thunderx_uncore: " fmt

IIRC this needs to be set before including <linux/printk.h>. Does this
work reliably, given that printk.h is likely included first?

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/466764.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-November/466768.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-11-10 16:54     ` Mark Rutland
@ 2016-11-10 19:46       ` Will Deacon
  -1 siblings, 0 replies; 28+ messages in thread
From: Will Deacon @ 2016-11-10 19:46 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Jan Glauber, linux-kernel, linux-arm-kernel

On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> 
> As a general note, as I commented on the QC L2 PMU driver [1,2], we need
> to figure out if we should be aggregating physical PMUs or not.
> 
> Judging by subsequent patches, each unit has individual counters and
> controls, and thus we cannot atomically read/write counters or controls
> across them. As such, I do not think we should aggregate them, and
> should expose them separately to userspace.

I thought each unit was registered as a separate PMU to perf? Or are you
specifically commenting on the OCX TLK? The comment there suggests that
the units cannot be individually enabled/disabled and, without docs, I
trust that's the case.

Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-10 19:46       ` Will Deacon
  0 siblings, 0 replies; 28+ messages in thread
From: Will Deacon @ 2016-11-10 19:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> 
> As a general note, as I commented on the QC L2 PMU driver [1,2], we need
> to figure out if we should be aggregating physical PMUs or not.
> 
> Judging by subsequent patches, each unit has individual counters and
> controls, and thus we cannot atomically read/write counters or controls
> across them. As such, I do not think we should aggregate them, and
> should expose them separately to userspace.

I thought each unit was registered as a separate PMU to perf? Or are you
specifically commenting on the OCX TLK? The comment there suggests that
the units cannot be individually enabled/disabled and, without docs, I
trust that's the case.

Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-11-10 16:54     ` Mark Rutland
@ 2016-11-11  7:37       ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-11-11  7:37 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:
> > +/*
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> 
> As a general note, as I commented on the QC L2 PMU driver [1,2], we need
> to figure out if we should be aggregating physical PMUs or not.

As said before, although it would be possible to create separate PMUs
for each unit, the individual counters are not interesting. For example
we are not interested in individual counters of Tag-and-data unit 0..7,
we just want the global view.

> Judging by subsequent patches, each unit has individual counters and
> controls, and thus we cannot atomically read/write counters or controls
> across them. As such, I do not think we should aggregate them, and
> should expose them separately to userspace.

That sounds like just moving the problem of aggregating the counters to
user-space. And would make the results even worse, if the user needs
several calls to summarize the counters, given how slow a perf counter
read is.


> That will simplify a number of things (e.g. the CPU migration code no
> longer has to iterate over a list of units).

Sure, it simplifies the kernel part, but it moves the cost to the user.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-11  7:37       ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-11-11  7:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:
> > +/*
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> 
> As a general note, as I commented on the QC L2 PMU driver [1,2], we need
> to figure out if we should be aggregating physical PMUs or not.

As said before, although it would be possible to create separate PMUs
for each unit, the individual counters are not interesting. For example
we are not interested in individual counters of Tag-and-data unit 0..7,
we just want the global view.

> Judging by subsequent patches, each unit has individual counters and
> controls, and thus we cannot atomically read/write counters or controls
> across them. As such, I do not think we should aggregate them, and
> should expose them separately to userspace.

That sounds like just moving the problem of aggregating the counters to
user-space. And would make the results even worse, if the user needs
several calls to summarize the counters, given how slow a perf counter
read is.


> That will simplify a number of things (e.g. the CPU migration code no
> longer has to iterate over a list of units).

Sure, it simplifies the kernel part, but it moves the cost to the user.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-11-08 23:50     ` Will Deacon
@ 2016-11-11 10:30       ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-11-11 10:30 UTC (permalink / raw)
  To: Will Deacon; +Cc: Mark Rutland, linux-kernel, linux-arm-kernel

Hi Will,

thanks for the review!

On Tue, Nov 08, 2016 at 11:50:10PM +0000, Will Deacon wrote:
> Hi Jan,
> 
> Thanks for posting an updated series. I have a few minor comments, which
> we can hopefully address in time for 4.10.
> 
> Also, have you run the perf fuzzer with this series applied?

No, that's new to me. Will try it.

> https://github.com/deater/perf_event_tests
> 
> (build the tests and look under the fuzzer/ directory for the tool)
> 
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > Provide "uncore" facilities for different non-CPU performance
> > counter units.
> > 
> > The uncore PMUs can be found under /sys/bus/event_source/devices.
> > All counters are exported via sysfs in the corresponding events
> > files under the PMU directory so the perf tool can list the event names.
> > 
> > There are some points that are special in this implementation:
> > 
> > 1) The PMU detection relies on PCI device detection. If a
> >    matching PCI device is found the PMU is created. The code can deal
> >    with multiple units of the same type, e.g. more than one memory
> >    controller.
> > 
> > 2) Counters are summarized across different units of the same type
> >    on one NUMA node but not across NUMA nodes.
> >    For instance L2C TAD 0..7 are presented as a single counter
> >    (adding the values from TAD 0 to 7). Although losing the ability
> >    to read a single value the merged values are easier to use.
> > 
> > 3) The counters are not CPU related. A random CPU is picked regardless
> >    of the NUMA node. There is a small performance penalty for accessing
> >    counters on a remote note but reading a performance counter is a
> >    slow operation anyway.
> > 
> > Signed-off-by: Jan Glauber <jglauber@cavium.com>
> > ---
> >  drivers/perf/Kconfig                |  13 ++
> >  drivers/perf/Makefile               |   1 +
> >  drivers/perf/uncore/Makefile        |   1 +
> >  drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
> >  drivers/perf/uncore/uncore_cavium.h |  71 ++++++++
> 
> We already have "uncore" PMUs under drivers/perf, so I'd prefer that we
> renamed this a bit to reflect better what's going on. How about:
> 
>   drivers/perf/cavium/
> 
> and then
> 
>   drivers/perf/cavium/uncore_thunder.[ch]
> 
> ?

OK, agreed.

> >  include/linux/cpuhotplug.h          |   1 +
> >  6 files changed, 438 insertions(+)
> >  create mode 100644 drivers/perf/uncore/Makefile
> >  create mode 100644 drivers/perf/uncore/uncore_cavium.c
> >  create mode 100644 drivers/perf/uncore/uncore_cavium.h
> > 
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index 4d5c5f9..3266c87 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -19,4 +19,17 @@ config XGENE_PMU
> >          help
> >            Say y if you want to use APM X-Gene SoC performance monitors.
> >  
> > +config UNCORE_PMU
> > +	bool
> 
> This isn't needed.

I thought about a symbol for all uncore pmus. But when drivers/perf is
already that it can be removed.

> > +
> > +config UNCORE_PMU_CAVIUM
> > +	depends on PERF_EVENTS && NUMA && ARM64
> > +	bool "Cavium uncore PMU support"
> 
> Please mentioned Thunder somewhere, since that's the SoC being supported.

OK.

> > +	select UNCORE_PMU
> > +	default y
> > +	help
> > +	  Say y if you want to access performance counters of subsystems
> > +	  on a Cavium SOC like cache controller, memory controller or
> > +	  processor interconnect.
> > +
> >  endmenu
> > diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> > index b116e98..d6c02c9 100644
> > --- a/drivers/perf/Makefile
> > +++ b/drivers/perf/Makefile
> > @@ -1,2 +1,3 @@
> >  obj-$(CONFIG_ARM_PMU) += arm_pmu.o
> >  obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
> > +obj-y += uncore/
> > diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
> > new file mode 100644
> > index 0000000..6130e18
> > --- /dev/null
> > +++ b/drivers/perf/uncore/Makefile
> > @@ -0,0 +1 @@
> > +obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > @@ -0,0 +1,351 @@
> > +/*
> > + * Cavium Thunder uncore PMU support.
> > + *
> > + * Copyright (C) 2015,2016 Cavium Inc.
> > + * Author: Jan Glauber <jan.glauber@cavium.com>
> > + */
> > +
> > +#include <linux/cpufeature.h>
> > +#include <linux/numa.h>
> > +#include <linux/slab.h>
> > +
> > +#include "uncore_cavium.h"
> > +
> > +/*
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> > + * Some counters are selected via a control register (L2C TAD) and read by
> > + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> > + * one dedicated counter per event.
> > + * Some counters are not stoppable (L2C CBC & LMC).
> > + * Some counters are read-only (LMC).
> > + * All counters belong to PCI devices, the devices may have additional
> > + * drivers but we assume we are the only user of the counter registers.
> > + * We map the whole PCI BAR so we must be careful to forbid access to
> > + * addresses that contain neither counters nor counter control registers.
> > + */
> > +
> > +void thunder_uncore_read(struct perf_event *event)
> > +{
> 
> Rather than have a bunch of global symbols that are called from the
> individual drivers, why don't you pass a struct of function pointers to
> their respective init functions and keep the internals private?

Most of these functions are already in struct pmu. What I can do is
set the shared functions in thunder_uncore_setup as default, and
only override them as needed (like thunder_uncore_read_ocx_tlk)
or the other way around (use default if not set already).
Then I can get rid of them.

> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore_unit *unit;
> > +	u64 prev, delta, new = 0;
> > +
> > +	node = get_node(hwc->config, uncore);
> > +
> > +	/* read counter values from all units on the node */
> > +	list_for_each_entry(unit, &node->unit_list, entry)
> > +		new += readq(hwc->event_base + unit->map);
> > +
> > +	prev = local64_read(&hwc->prev_count);
> > +	local64_set(&hwc->prev_count, new);
> > +	delta = new - prev;
> > +	local64_add(delta, &event->count);
> > +}
> > +
> > +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> > +		       u64 event_base)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	int id;
> > +
> > +	node = get_node(hwc->config, uncore);
> > +	id = get_id(hwc->config);
> > +
> > +	if (!cmpxchg(&node->events[id], NULL, event))
> > +		hwc->idx = id;
> 
> Does this need to be a full-fat cmpxchg? Who are you racing with?

Just copy'n'paste from the existing drivers. I guess it can be relaxed.

> > +
> > +	if (hwc->idx == -1)
> > +		return -EBUSY;
> 
> This would be much clearer as an else statement after the cmpxchg.

Agreed.

> > +
> > +	hwc->config_base = config_base;
> > +	hwc->event_base = event_base;
> > +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> > +
> > +	if (flags & PERF_EF_START)
> > +		uncore->pmu.start(event, PERF_EF_RELOAD);
> > +
> > +	return 0;
> > +}
> > +
> > +void thunder_uncore_del(struct perf_event *event, int flags)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	int i;
> > +
> > +	event->pmu->stop(event, PERF_EF_UPDATE);
> > +
> > +	/*
> > +	 * For programmable counters we need to check where we installed it.
> > +	 * To keep this function generic always test the more complicated
> > +	 * case (free running counters won't need the loop).
> > +	 */
> > +	node = get_node(hwc->config, uncore);
> > +	for (i = 0; i < node->num_counters; i++) {
> > +		if (cmpxchg(&node->events[i], event, NULL) == event)
> > +			break;
> > +	}
> > +	hwc->idx = -1;
> > +}
> > +
> > +void thunder_uncore_start(struct perf_event *event, int flags)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore_unit *unit;
> > +	u64 new = 0;
> > +
> > +	/* read counter values from all units on the node */
> > +	node = get_node(hwc->config, uncore);
> > +	list_for_each_entry(unit, &node->unit_list, entry)
> > +		new += readq(hwc->event_base + unit->map);
> > +	local64_set(&hwc->prev_count, new);
> > +
> > +	hwc->state = 0;
> > +	perf_event_update_userpage(event);
> > +}
> > +
> > +void thunder_uncore_stop(struct perf_event *event, int flags)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	hwc->state |= PERF_HES_STOPPED;
> > +
> > +	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
> > +		thunder_uncore_read(event);
> > +		hwc->state |= PERF_HES_UPTODATE;
> > +	}
> > +}
> > +
> > +int thunder_uncore_event_init(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore *uncore;
> > +
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	/* we do not support sampling */
> > +	if (is_sampling_event(event))
> > +		return -EINVAL;
> > +
> > +	/* counters do not have these bits */
> > +	if (event->attr.exclude_user	||
> > +	    event->attr.exclude_kernel	||
> > +	    event->attr.exclude_host	||
> > +	    event->attr.exclude_guest	||
> > +	    event->attr.exclude_hv	||
> > +	    event->attr.exclude_idle)
> > +		return -EINVAL;
> > +
> > +	uncore = to_uncore(event->pmu);
> > +	if (!uncore)
> > +		return -ENODEV;
> > +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> > +		return -EINVAL;
> > +
> > +	/* check NUMA node */
> > +	node = get_node(event->attr.config, uncore);
> > +	if (!node) {
> > +		pr_debug("Invalid NUMA node selected\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	hwc->config = event->attr.config;
> > +	hwc->idx = -1;
> > +	return 0;
> > +}
> > +
> > +static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
> > +						struct device_attribute *attr,
> > +						char *buf)
> > +{
> > +	struct pmu *pmu = dev_get_drvdata(dev);
> > +	struct thunder_uncore *uncore =
> > +		container_of(pmu, struct thunder_uncore, pmu);
> > +
> > +	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
> > +}
> > +static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
> > +
> > +static struct attribute *thunder_uncore_attrs[] = {
> > +	&dev_attr_cpumask.attr,
> > +	NULL,
> > +};
> > +
> > +struct attribute_group thunder_uncore_attr_group = {
> > +	.attrs = thunder_uncore_attrs,
> > +};
> > +
> > +ssize_t thunder_events_sysfs_show(struct device *dev,
> > +				  struct device_attribute *attr,
> > +				  char *page)
> > +{
> > +	struct perf_pmu_events_attr *pmu_attr =
> > +		container_of(attr, struct perf_pmu_events_attr, attr);
> > +
> > +	if (pmu_attr->event_str)
> > +		return sprintf(page, "%s", pmu_attr->event_str);
> > +
> > +	return 0;
> > +}
> > +
> > +/* node attribute depending on number of NUMA nodes */
> > +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> > +			 char *page)
> > +{
> > +	if (NODES_SHIFT)
> > +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
> 
> If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
> userspace.

So should I use "config:16" in that case? Is it OK to use this also for
NODES_SHIFT=0 ?

> > +	else
> > +		return sprintf(page, "config:16\n");
> > +}
> > +
> > +struct device_attribute format_attr_node = __ATTR_RO(node);
> > +
> > +/*
> > + * Thunder uncore events are independent from CPUs. Provide a cpumask
> > + * nevertheless to prevent perf from adding the event per-cpu and just
> > + * set the mask to one online CPU. Use the same cpumask for all uncore
> > + * devices.
> > + *
> > + * There is a performance penalty for accessing a device from a CPU on
> > + * another socket, but we do not care (yet).
> > + */
> > +static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
> > +{
> > +	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);
> 
> Why _safe?

Not required, will remove.

> > +	int new_cpu;
> > +
> > +	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
> > +		return 0;
> > +	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
> > +	if (new_cpu >= nr_cpu_ids)
> > +		return 0;
> > +	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
> > +	cpumask_set_cpu(new_cpu, &uncore->active_mask);
> > +	return 0;
> > +}
> > +
> > +static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
> > +						      int node_id, int counters)
> > +{
> > +	struct thunder_uncore_node *node;
> > +
> > +	node = kzalloc(sizeof(*node), GFP_KERNEL);
> > +	if (!node)
> > +		return NULL;
> > +	node->num_counters = counters;
> > +	INIT_LIST_HEAD(&node->unit_list);
> > +	return node;
> > +}
> > +
> > +int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
> > +				struct pmu *pmu, int counters)
> > +{
> > +	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
> > +	struct thunder_uncore_unit  *unit, *tmp;
> > +	struct thunder_uncore_node *node;
> > +	struct pci_dev *pdev = NULL;
> > +	int ret, node_id, found = 0;
> > +
> > +	/* detect PCI devices */
> > +	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {
> 
> Redundant brackets?

OK

> > +		if (!pdev)
> > +			break;
> 
> Redundant check?

OK

> > +		node_id = dev_to_node(&pdev->dev);
> > +
> > +		/* allocate node if necessary */
> > +		if (!uncore->nodes[node_id])
> > +			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
> > +
> > +		node = uncore->nodes[node_id];
> > +		if (!node) {
> > +			ret = -ENOMEM;
> > +			goto fail;
> > +		}
> > +
> > +		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
> > +		if (!unit) {
> > +			ret = -ENOMEM;
> > +			goto fail;
> > +		}
> > +
> > +		unit->pdev = pdev;
> > +		unit->map = ioremap(pci_resource_start(pdev, 0),
> > +				    pci_resource_len(pdev, 0));
> > +		list_add(&unit->entry, &node->unit_list);
> > +		node->nr_units++;
> > +		found++;
> > +	}
> > +
> > +	if (!found)
> > +		return -ENODEV;
> > +
> > +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> > +                                         &uncore->node);
> > +
> > +	/*
> > +	 * perf PMU is CPU dependent in difference to our uncore devices.
> > +	 * Just pick a CPU and migrate away if it goes offline.
> > +	 */
> > +	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
> > +
> > +	uncore->pmu = *pmu;
> > +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> > +	if (ret)
> > +		goto fail;
> > +
> > +	return 0;
> > +
> > +fail:
> > +	node_id = 0;
> > +	while (uncore->nodes[node_id]) {
> > +		node = uncore->nodes[node_id];
> > +
> > +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
> 
> Why do you need the _safe variant?

OK, not needed

> Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-11 10:30       ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-11-11 10:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

thanks for the review!

On Tue, Nov 08, 2016 at 11:50:10PM +0000, Will Deacon wrote:
> Hi Jan,
> 
> Thanks for posting an updated series. I have a few minor comments, which
> we can hopefully address in time for 4.10.
> 
> Also, have you run the perf fuzzer with this series applied?

No, that's new to me. Will try it.

> https://github.com/deater/perf_event_tests
> 
> (build the tests and look under the fuzzer/ directory for the tool)
> 
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > Provide "uncore" facilities for different non-CPU performance
> > counter units.
> > 
> > The uncore PMUs can be found under /sys/bus/event_source/devices.
> > All counters are exported via sysfs in the corresponding events
> > files under the PMU directory so the perf tool can list the event names.
> > 
> > There are some points that are special in this implementation:
> > 
> > 1) The PMU detection relies on PCI device detection. If a
> >    matching PCI device is found the PMU is created. The code can deal
> >    with multiple units of the same type, e.g. more than one memory
> >    controller.
> > 
> > 2) Counters are summarized across different units of the same type
> >    on one NUMA node but not across NUMA nodes.
> >    For instance L2C TAD 0..7 are presented as a single counter
> >    (adding the values from TAD 0 to 7). Although losing the ability
> >    to read a single value the merged values are easier to use.
> > 
> > 3) The counters are not CPU related. A random CPU is picked regardless
> >    of the NUMA node. There is a small performance penalty for accessing
> >    counters on a remote note but reading a performance counter is a
> >    slow operation anyway.
> > 
> > Signed-off-by: Jan Glauber <jglauber@cavium.com>
> > ---
> >  drivers/perf/Kconfig                |  13 ++
> >  drivers/perf/Makefile               |   1 +
> >  drivers/perf/uncore/Makefile        |   1 +
> >  drivers/perf/uncore/uncore_cavium.c | 351 ++++++++++++++++++++++++++++++++++++
> >  drivers/perf/uncore/uncore_cavium.h |  71 ++++++++
> 
> We already have "uncore" PMUs under drivers/perf, so I'd prefer that we
> renamed this a bit to reflect better what's going on. How about:
> 
>   drivers/perf/cavium/
> 
> and then
> 
>   drivers/perf/cavium/uncore_thunder.[ch]
> 
> ?

OK, agreed.

> >  include/linux/cpuhotplug.h          |   1 +
> >  6 files changed, 438 insertions(+)
> >  create mode 100644 drivers/perf/uncore/Makefile
> >  create mode 100644 drivers/perf/uncore/uncore_cavium.c
> >  create mode 100644 drivers/perf/uncore/uncore_cavium.h
> > 
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index 4d5c5f9..3266c87 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -19,4 +19,17 @@ config XGENE_PMU
> >          help
> >            Say y if you want to use APM X-Gene SoC performance monitors.
> >  
> > +config UNCORE_PMU
> > +	bool
> 
> This isn't needed.

I thought about a symbol for all uncore pmus. But when drivers/perf is
already that it can be removed.

> > +
> > +config UNCORE_PMU_CAVIUM
> > +	depends on PERF_EVENTS && NUMA && ARM64
> > +	bool "Cavium uncore PMU support"
> 
> Please mentioned Thunder somewhere, since that's the SoC being supported.

OK.

> > +	select UNCORE_PMU
> > +	default y
> > +	help
> > +	  Say y if you want to access performance counters of subsystems
> > +	  on a Cavium SOC like cache controller, memory controller or
> > +	  processor interconnect.
> > +
> >  endmenu
> > diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
> > index b116e98..d6c02c9 100644
> > --- a/drivers/perf/Makefile
> > +++ b/drivers/perf/Makefile
> > @@ -1,2 +1,3 @@
> >  obj-$(CONFIG_ARM_PMU) += arm_pmu.o
> >  obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
> > +obj-y += uncore/
> > diff --git a/drivers/perf/uncore/Makefile b/drivers/perf/uncore/Makefile
> > new file mode 100644
> > index 0000000..6130e18
> > --- /dev/null
> > +++ b/drivers/perf/uncore/Makefile
> > @@ -0,0 +1 @@
> > +obj-$(CONFIG_UNCORE_PMU_CAVIUM) += uncore_cavium.o
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > @@ -0,0 +1,351 @@
> > +/*
> > + * Cavium Thunder uncore PMU support.
> > + *
> > + * Copyright (C) 2015,2016 Cavium Inc.
> > + * Author: Jan Glauber <jan.glauber@cavium.com>
> > + */
> > +
> > +#include <linux/cpufeature.h>
> > +#include <linux/numa.h>
> > +#include <linux/slab.h>
> > +
> > +#include "uncore_cavium.h"
> > +
> > +/*
> > + * Some notes about the various counters supported by this "uncore" PMU
> > + * and the design:
> > + *
> > + * All counters are 64 bit long.
> > + * There are no overflow interrupts.
> > + * Counters are summarized per node/socket.
> > + * Most devices appear as separate PCI devices per socket with the exception
> > + * of OCX TLK which appears as one PCI device per socket and contains several
> > + * units with counters that are merged.
> > + * Some counters are selected via a control register (L2C TAD) and read by
> > + * a number of counter registers, others (L2C CBC, LMC & OCX TLK) have
> > + * one dedicated counter per event.
> > + * Some counters are not stoppable (L2C CBC & LMC).
> > + * Some counters are read-only (LMC).
> > + * All counters belong to PCI devices, the devices may have additional
> > + * drivers but we assume we are the only user of the counter registers.
> > + * We map the whole PCI BAR so we must be careful to forbid access to
> > + * addresses that contain neither counters nor counter control registers.
> > + */
> > +
> > +void thunder_uncore_read(struct perf_event *event)
> > +{
> 
> Rather than have a bunch of global symbols that are called from the
> individual drivers, why don't you pass a struct of function pointers to
> their respective init functions and keep the internals private?

Most of these functions are already in struct pmu. What I can do is
set the shared functions in thunder_uncore_setup as default, and
only override them as needed (like thunder_uncore_read_ocx_tlk)
or the other way around (use default if not set already).
Then I can get rid of them.

> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore_unit *unit;
> > +	u64 prev, delta, new = 0;
> > +
> > +	node = get_node(hwc->config, uncore);
> > +
> > +	/* read counter values from all units on the node */
> > +	list_for_each_entry(unit, &node->unit_list, entry)
> > +		new += readq(hwc->event_base + unit->map);
> > +
> > +	prev = local64_read(&hwc->prev_count);
> > +	local64_set(&hwc->prev_count, new);
> > +	delta = new - prev;
> > +	local64_add(delta, &event->count);
> > +}
> > +
> > +int thunder_uncore_add(struct perf_event *event, int flags, u64 config_base,
> > +		       u64 event_base)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	int id;
> > +
> > +	node = get_node(hwc->config, uncore);
> > +	id = get_id(hwc->config);
> > +
> > +	if (!cmpxchg(&node->events[id], NULL, event))
> > +		hwc->idx = id;
> 
> Does this need to be a full-fat cmpxchg? Who are you racing with?

Just copy'n'paste from the existing drivers. I guess it can be relaxed.

> > +
> > +	if (hwc->idx == -1)
> > +		return -EBUSY;
> 
> This would be much clearer as an else statement after the cmpxchg.

Agreed.

> > +
> > +	hwc->config_base = config_base;
> > +	hwc->event_base = event_base;
> > +	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
> > +
> > +	if (flags & PERF_EF_START)
> > +		uncore->pmu.start(event, PERF_EF_RELOAD);
> > +
> > +	return 0;
> > +}
> > +
> > +void thunder_uncore_del(struct perf_event *event, int flags)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	int i;
> > +
> > +	event->pmu->stop(event, PERF_EF_UPDATE);
> > +
> > +	/*
> > +	 * For programmable counters we need to check where we installed it.
> > +	 * To keep this function generic always test the more complicated
> > +	 * case (free running counters won't need the loop).
> > +	 */
> > +	node = get_node(hwc->config, uncore);
> > +	for (i = 0; i < node->num_counters; i++) {
> > +		if (cmpxchg(&node->events[i], event, NULL) == event)
> > +			break;
> > +	}
> > +	hwc->idx = -1;
> > +}
> > +
> > +void thunder_uncore_start(struct perf_event *event, int flags)
> > +{
> > +	struct thunder_uncore *uncore = to_uncore(event->pmu);
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore_unit *unit;
> > +	u64 new = 0;
> > +
> > +	/* read counter values from all units on the node */
> > +	node = get_node(hwc->config, uncore);
> > +	list_for_each_entry(unit, &node->unit_list, entry)
> > +		new += readq(hwc->event_base + unit->map);
> > +	local64_set(&hwc->prev_count, new);
> > +
> > +	hwc->state = 0;
> > +	perf_event_update_userpage(event);
> > +}
> > +
> > +void thunder_uncore_stop(struct perf_event *event, int flags)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +
> > +	hwc->state |= PERF_HES_STOPPED;
> > +
> > +	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
> > +		thunder_uncore_read(event);
> > +		hwc->state |= PERF_HES_UPTODATE;
> > +	}
> > +}
> > +
> > +int thunder_uncore_event_init(struct perf_event *event)
> > +{
> > +	struct hw_perf_event *hwc = &event->hw;
> > +	struct thunder_uncore_node *node;
> > +	struct thunder_uncore *uncore;
> > +
> > +	if (event->attr.type != event->pmu->type)
> > +		return -ENOENT;
> > +
> > +	/* we do not support sampling */
> > +	if (is_sampling_event(event))
> > +		return -EINVAL;
> > +
> > +	/* counters do not have these bits */
> > +	if (event->attr.exclude_user	||
> > +	    event->attr.exclude_kernel	||
> > +	    event->attr.exclude_host	||
> > +	    event->attr.exclude_guest	||
> > +	    event->attr.exclude_hv	||
> > +	    event->attr.exclude_idle)
> > +		return -EINVAL;
> > +
> > +	uncore = to_uncore(event->pmu);
> > +	if (!uncore)
> > +		return -ENODEV;
> > +	if (!uncore->event_valid(event->attr.config & UNCORE_EVENT_ID_MASK))
> > +		return -EINVAL;
> > +
> > +	/* check NUMA node */
> > +	node = get_node(event->attr.config, uncore);
> > +	if (!node) {
> > +		pr_debug("Invalid NUMA node selected\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	hwc->config = event->attr.config;
> > +	hwc->idx = -1;
> > +	return 0;
> > +}
> > +
> > +static ssize_t thunder_uncore_attr_show_cpumask(struct device *dev,
> > +						struct device_attribute *attr,
> > +						char *buf)
> > +{
> > +	struct pmu *pmu = dev_get_drvdata(dev);
> > +	struct thunder_uncore *uncore =
> > +		container_of(pmu, struct thunder_uncore, pmu);
> > +
> > +	return cpumap_print_to_pagebuf(true, buf, &uncore->active_mask);
> > +}
> > +static DEVICE_ATTR(cpumask, S_IRUGO, thunder_uncore_attr_show_cpumask, NULL);
> > +
> > +static struct attribute *thunder_uncore_attrs[] = {
> > +	&dev_attr_cpumask.attr,
> > +	NULL,
> > +};
> > +
> > +struct attribute_group thunder_uncore_attr_group = {
> > +	.attrs = thunder_uncore_attrs,
> > +};
> > +
> > +ssize_t thunder_events_sysfs_show(struct device *dev,
> > +				  struct device_attribute *attr,
> > +				  char *page)
> > +{
> > +	struct perf_pmu_events_attr *pmu_attr =
> > +		container_of(attr, struct perf_pmu_events_attr, attr);
> > +
> > +	if (pmu_attr->event_str)
> > +		return sprintf(page, "%s", pmu_attr->event_str);
> > +
> > +	return 0;
> > +}
> > +
> > +/* node attribute depending on number of NUMA nodes */
> > +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> > +			 char *page)
> > +{
> > +	if (NODES_SHIFT)
> > +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
> 
> If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
> userspace.

So should I use "config:16" in that case? Is it OK to use this also for
NODES_SHIFT=0 ?

> > +	else
> > +		return sprintf(page, "config:16\n");
> > +}
> > +
> > +struct device_attribute format_attr_node = __ATTR_RO(node);
> > +
> > +/*
> > + * Thunder uncore events are independent from CPUs. Provide a cpumask
> > + * nevertheless to prevent perf from adding the event per-cpu and just
> > + * set the mask to one online CPU. Use the same cpumask for all uncore
> > + * devices.
> > + *
> > + * There is a performance penalty for accessing a device from a CPU on
> > + * another socket, but we do not care (yet).
> > + */
> > +static int thunder_uncore_offline_cpu(unsigned int old_cpu, struct hlist_node *node)
> > +{
> > +	struct thunder_uncore *uncore = hlist_entry_safe(node, struct thunder_uncore, node);
> 
> Why _safe?

Not required, will remove.

> > +	int new_cpu;
> > +
> > +	if (!cpumask_test_and_clear_cpu(old_cpu, &uncore->active_mask))
> > +		return 0;
> > +	new_cpu = cpumask_any_but(cpu_online_mask, old_cpu);
> > +	if (new_cpu >= nr_cpu_ids)
> > +		return 0;
> > +	perf_pmu_migrate_context(&uncore->pmu, old_cpu, new_cpu);
> > +	cpumask_set_cpu(new_cpu, &uncore->active_mask);
> > +	return 0;
> > +}
> > +
> > +static struct thunder_uncore_node * __init alloc_node(struct thunder_uncore *uncore,
> > +						      int node_id, int counters)
> > +{
> > +	struct thunder_uncore_node *node;
> > +
> > +	node = kzalloc(sizeof(*node), GFP_KERNEL);
> > +	if (!node)
> > +		return NULL;
> > +	node->num_counters = counters;
> > +	INIT_LIST_HEAD(&node->unit_list);
> > +	return node;
> > +}
> > +
> > +int __init thunder_uncore_setup(struct thunder_uncore *uncore, int device_id,
> > +				struct pmu *pmu, int counters)
> > +{
> > +	unsigned int vendor_id = PCI_VENDOR_ID_CAVIUM;
> > +	struct thunder_uncore_unit  *unit, *tmp;
> > +	struct thunder_uncore_node *node;
> > +	struct pci_dev *pdev = NULL;
> > +	int ret, node_id, found = 0;
> > +
> > +	/* detect PCI devices */
> > +	while ((pdev = pci_get_device(vendor_id, device_id, pdev))) {
> 
> Redundant brackets?

OK

> > +		if (!pdev)
> > +			break;
> 
> Redundant check?

OK

> > +		node_id = dev_to_node(&pdev->dev);
> > +
> > +		/* allocate node if necessary */
> > +		if (!uncore->nodes[node_id])
> > +			uncore->nodes[node_id] = alloc_node(uncore, node_id, counters);
> > +
> > +		node = uncore->nodes[node_id];
> > +		if (!node) {
> > +			ret = -ENOMEM;
> > +			goto fail;
> > +		}
> > +
> > +		unit = kzalloc(sizeof(*unit), GFP_KERNEL);
> > +		if (!unit) {
> > +			ret = -ENOMEM;
> > +			goto fail;
> > +		}
> > +
> > +		unit->pdev = pdev;
> > +		unit->map = ioremap(pci_resource_start(pdev, 0),
> > +				    pci_resource_len(pdev, 0));
> > +		list_add(&unit->entry, &node->unit_list);
> > +		node->nr_units++;
> > +		found++;
> > +	}
> > +
> > +	if (!found)
> > +		return -ENODEV;
> > +
> > +	cpuhp_state_add_instance_nocalls(CPUHP_AP_UNCORE_CAVIUM_ONLINE,
> > +                                         &uncore->node);
> > +
> > +	/*
> > +	 * perf PMU is CPU dependent in difference to our uncore devices.
> > +	 * Just pick a CPU and migrate away if it goes offline.
> > +	 */
> > +	cpumask_set_cpu(smp_processor_id(), &uncore->active_mask);
> > +
> > +	uncore->pmu = *pmu;
> > +	ret = perf_pmu_register(&uncore->pmu, uncore->pmu.name, -1);
> > +	if (ret)
> > +		goto fail;
> > +
> > +	return 0;
> > +
> > +fail:
> > +	node_id = 0;
> > +	while (uncore->nodes[node_id]) {
> > +		node = uncore->nodes[node_id];
> > +
> > +		list_for_each_entry_safe(unit, tmp, &node->unit_list, entry) {
> 
> Why do you need the _safe variant?

OK, not needed

> Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-11-10 16:54     ` Mark Rutland
@ 2016-11-11 10:39       ` Jan Glauber
  -1 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-11-11 10:39 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

Hi Mark,

thanks for reviewing. One question below, for most of your other comments
I think we need to come to a conclusion about the aggregation first.

On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:
> Hi Jan,
> 
> Apologies for the delay in getting to this.
> 
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > @@ -0,0 +1,351 @@
> > +/*
> > + * Cavium Thunder uncore PMU support.
> > + *
> > + * Copyright (C) 2015,2016 Cavium Inc.
> > + * Author: Jan Glauber <jan.glauber@cavium.com>
> > + */
> > +
> > +#include <linux/cpufeature.h>
> > +#include <linux/numa.h>
> > +#include <linux/slab.h>
> 
> I believe the following includes are necessary for APIs and/or data
> explicitly referenced by the driver code:
> 
> #include <linux/atomic.h>
> #include <linux/cpuhotplug.h>
> #include <linux/cpumask.h>
> #include <linux/device.h>
> #include <linux/errno.h>
> #include <linux/io.h>
> #include <linux/kernel.h>
> #include <linux/list.h>
> #include <linux/pci.h>
> #include <linux/perf_event.h>
> #include <linux/printk.h>
> #include <linux/smp.h>
> #include <linux/sysfs.h>
> #include <linux/types.h>
> 
> #include <asm/local64.h>
> 
> ... please add those here.

Should I also add includes that are already in the included by uncore_cavium.h?
I usually avoid includes that come through the "local" header file.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-11 10:39       ` Jan Glauber
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Glauber @ 2016-11-11 10:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

thanks for reviewing. One question below, for most of your other comments
I think we need to come to a conclusion about the aggregation first.

On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:
> Hi Jan,
> 
> Apologies for the delay in getting to this.
> 
> On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > diff --git a/drivers/perf/uncore/uncore_cavium.c b/drivers/perf/uncore/uncore_cavium.c
> > new file mode 100644
> > index 0000000..a7b4277
> > --- /dev/null
> > +++ b/drivers/perf/uncore/uncore_cavium.c
> > @@ -0,0 +1,351 @@
> > +/*
> > + * Cavium Thunder uncore PMU support.
> > + *
> > + * Copyright (C) 2015,2016 Cavium Inc.
> > + * Author: Jan Glauber <jan.glauber@cavium.com>
> > + */
> > +
> > +#include <linux/cpufeature.h>
> > +#include <linux/numa.h>
> > +#include <linux/slab.h>
> 
> I believe the following includes are necessary for APIs and/or data
> explicitly referenced by the driver code:
> 
> #include <linux/atomic.h>
> #include <linux/cpuhotplug.h>
> #include <linux/cpumask.h>
> #include <linux/device.h>
> #include <linux/errno.h>
> #include <linux/io.h>
> #include <linux/kernel.h>
> #include <linux/list.h>
> #include <linux/pci.h>
> #include <linux/perf_event.h>
> #include <linux/printk.h>
> #include <linux/smp.h>
> #include <linux/sysfs.h>
> #include <linux/types.h>
> 
> #include <asm/local64.h>
> 
> ... please add those here.

Should I also add includes that are already in the included by uncore_cavium.h?
I usually avoid includes that come through the "local" header file.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-11-11 10:39       ` Jan Glauber
@ 2016-11-11 11:18         ` Mark Rutland
  -1 siblings, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2016-11-11 11:18 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Will Deacon, linux-kernel, linux-arm-kernel

On Fri, Nov 11, 2016 at 11:39:21AM +0100, Jan Glauber wrote:
> Hi Mark,
> 
> thanks for reviewing. One question below,

> On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:

> > On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:

> > > +#include <linux/cpufeature.h>
> > > +#include <linux/numa.h>
> > > +#include <linux/slab.h>
> > 
> > I believe the following includes are necessary for APIs and/or data
> > explicitly referenced by the driver code:

[...]

> Should I also add includes that are already in the included by uncore_cavium.h?

Please do.

> I usually avoid includes that come through the "local" header file.

Generally, when you explcitly use some macro/function/data in a file,
that file should have the relevant include. 

If something's only used in the header (e.g. hidden in a macro or inline
function), then we only need that include in the header.

For example: uncore_cavium.h uses container_of(), and should include
<linux/kernel.h>. Also, uncore_cavium.c also uses container_of()
directly for something unrelated, and should also include
<linux/kernel.h>.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-11 11:18         ` Mark Rutland
  0 siblings, 0 replies; 28+ messages in thread
From: Mark Rutland @ 2016-11-11 11:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 11, 2016 at 11:39:21AM +0100, Jan Glauber wrote:
> Hi Mark,
> 
> thanks for reviewing. One question below,

> On Thu, Nov 10, 2016 at 04:54:06PM +0000, Mark Rutland wrote:

> > On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:

> > > +#include <linux/cpufeature.h>
> > > +#include <linux/numa.h>
> > > +#include <linux/slab.h>
> > 
> > I believe the following includes are necessary for APIs and/or data
> > explicitly referenced by the driver code:

[...]

> Should I also add includes that are already in the included by uncore_cavium.h?

Please do.

> I usually avoid includes that come through the "local" header file.

Generally, when you explcitly use some macro/function/data in a file,
that file should have the relevant include. 

If something's only used in the header (e.g. hidden in a macro or inline
function), then we only need that include in the header.

For example: uncore_cavium.h uses container_of(), and should include
<linux/kernel.h>. Also, uncore_cavium.c also uses container_of()
directly for something unrelated, and should also include
<linux/kernel.h>.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
  2016-11-11 10:30       ` Jan Glauber
@ 2016-11-17 18:10         ` Will Deacon
  -1 siblings, 0 replies; 28+ messages in thread
From: Will Deacon @ 2016-11-17 18:10 UTC (permalink / raw)
  To: Jan Glauber; +Cc: Mark Rutland, linux-kernel, linux-arm-kernel

On Fri, Nov 11, 2016 at 11:30:29AM +0100, Jan Glauber wrote:
> On Tue, Nov 08, 2016 at 11:50:10PM +0000, Will Deacon wrote:
> > On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > > +/* node attribute depending on number of NUMA nodes */
> > > +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> > > +			 char *page)
> > > +{
> > > +	if (NODES_SHIFT)
> > > +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
> > 
> > If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
> > userspace.
> 
> So should I use "config:16" in that case? Is it OK to use this also for
> NODES_SHIFT=0 ?

If you only need one bit, then "config:16" is the right thing to do.

Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC
@ 2016-11-17 18:10         ` Will Deacon
  0 siblings, 0 replies; 28+ messages in thread
From: Will Deacon @ 2016-11-17 18:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Nov 11, 2016 at 11:30:29AM +0100, Jan Glauber wrote:
> On Tue, Nov 08, 2016 at 11:50:10PM +0000, Will Deacon wrote:
> > On Sat, Oct 29, 2016 at 01:55:29PM +0200, Jan Glauber wrote:
> > > +/* node attribute depending on number of NUMA nodes */
> > > +static ssize_t node_show(struct device *dev, struct device_attribute *attr,
> > > +			 char *page)
> > > +{
> > > +	if (NODES_SHIFT)
> > > +		return sprintf(page, "config:16-%d\n", 16 + NODES_SHIFT - 1);
> > 
> > If NODES_SHIFT is 1, you'll end up with "config:16-16", which might confuse
> > userspace.
> 
> So should I use "config:16" in that case? Is it OK to use this also for
> NODES_SHIFT=0 ?

If you only need one bit, then "config:16" is the right thing to do.

Will

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2016-11-17 18:10 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-29 11:55 [PATCH v4 0/5] Cavium ThunderX uncore PMU support Jan Glauber
2016-10-29 11:55 ` Jan Glauber
2016-10-29 11:55 ` [PATCH v4 1/5] arm64: perf: Basic uncore counter support for Cavium ThunderX SOC Jan Glauber
2016-10-29 11:55   ` Jan Glauber
2016-11-08 23:50   ` Will Deacon
2016-11-08 23:50     ` Will Deacon
2016-11-11 10:30     ` Jan Glauber
2016-11-11 10:30       ` Jan Glauber
2016-11-17 18:10       ` Will Deacon
2016-11-17 18:10         ` Will Deacon
2016-11-10 16:54   ` Mark Rutland
2016-11-10 16:54     ` Mark Rutland
2016-11-10 19:46     ` Will Deacon
2016-11-10 19:46       ` Will Deacon
2016-11-11  7:37     ` Jan Glauber
2016-11-11  7:37       ` Jan Glauber
2016-11-11 10:39     ` Jan Glauber
2016-11-11 10:39       ` Jan Glauber
2016-11-11 11:18       ` Mark Rutland
2016-11-11 11:18         ` Mark Rutland
2016-10-29 11:55 ` [PATCH v4 2/5] arm64: perf: Cavium ThunderX L2C TAD uncore support Jan Glauber
2016-10-29 11:55   ` Jan Glauber
2016-10-29 11:55 ` [PATCH v4 3/5] arm64: perf: Cavium ThunderX L2C CBC " Jan Glauber
2016-10-29 11:55   ` Jan Glauber
2016-10-29 11:55 ` [PATCH v4 4/5] arm64: perf: Cavium ThunderX LMC " Jan Glauber
2016-10-29 11:55   ` Jan Glauber
2016-10-29 11:55 ` [PATCH v4 5/5] arm64: perf: Cavium ThunderX OCX TLK " Jan Glauber
2016-10-29 11:55   ` Jan Glauber

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.