All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-04-03 16:47 ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, linux-kernel, linux-doc, Loc Ho, Hoan Tran

This patch set adds support for SoC-wide (AKA uncore) Performance Monitoring
Unit version 3.

It can support up to
 - 2 IOB PMU instances
 - 8 L3C PMU instances
 - 2 MCB PMU instances
 - 8 MCU PMU instances

v2:
 * Split into separate patches
 * Use the function pointers for the PMU leaf functions
 * Parse PMU subnode by the match table
 * Dont allow user change agent id by config1 for SoC PMU v3

Hoan Tran (3):
  perf: xgene: Parse PMU subnode from the match table
  perf: xgene: Move PMU leaf functions into function pointer structure
  perf: xgene: Add support for SoC PMU version 3

 drivers/perf/xgene_pmu.c | 695 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 619 insertions(+), 76 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-04-03 16:47 ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

This patch set adds support for SoC-wide (AKA uncore) Performance Monitoring
Unit version 3.

It can support up to
 - 2 IOB PMU instances
 - 8 L3C PMU instances
 - 2 MCB PMU instances
 - 8 MCU PMU instances

v2:
 * Split into separate patches
 * Use the function pointers for the PMU leaf functions
 * Parse PMU subnode by the match table
 * Dont allow user change agent id by config1 for SoC PMU v3

Hoan Tran (3):
  perf: xgene: Parse PMU subnode from the match table
  perf: xgene: Move PMU leaf functions into function pointer structure
  perf: xgene: Add support for SoC PMU version 3

 drivers/perf/xgene_pmu.c | 695 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 619 insertions(+), 76 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
  2017-04-03 16:47 ` Hoan Tran
@ 2017-04-03 16:47   ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, linux-kernel, linux-doc, Loc Ho, Hoan Tran

This patch parses PMU Subnode from a match table.

Signed-off-by: Hoan Tran <hotran@apm.com>
---
 drivers/perf/xgene_pmu.c | 40 ++++++++++++++++++++++++++++++----------
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 35b5289..5ffd580 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -1047,9 +1047,35 @@ xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu,
 	return NULL;
 }
 
+static const struct acpi_device_id xgene_pmu_acpi_type_match[] = {
+	{"APMC0D5D", PMU_TYPE_L3C},
+	{"APMC0D5E", PMU_TYPE_IOB},
+	{"APMC0D5F", PMU_TYPE_MCB},
+	{"APMC0D60", PMU_TYPE_MC},
+	{},
+};
+
+static const struct acpi_device_id *xgene_pmu_acpi_match_type(
+					const struct acpi_device_id *ids,
+					struct acpi_device *adev)
+{
+	const struct acpi_device_id *match_id = NULL;
+	const struct acpi_device_id *id;
+
+	for (id = ids; id->id[0] || id->cls; id++) {
+		if (!acpi_match_device_ids(adev, id))
+			match_id = id;
+		else if (match_id)
+			break;
+	}
+
+	return match_id;
+}
+
 static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
 				    void *data, void **return_value)
 {
+	const struct acpi_device_id *acpi_id;
 	struct xgene_pmu *xgene_pmu = data;
 	struct xgene_pmu_dev_ctx *ctx;
 	struct acpi_device *adev;
@@ -1059,17 +1085,11 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
 	if (acpi_bus_get_status(adev) || !adev->status.present)
 		return AE_OK;
 
-	if (!strcmp(acpi_device_hid(adev), "APMC0D5D"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_L3C);
-	else if (!strcmp(acpi_device_hid(adev), "APMC0D5E"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_IOB);
-	else if (!strcmp(acpi_device_hid(adev), "APMC0D5F"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MCB);
-	else if (!strcmp(acpi_device_hid(adev), "APMC0D60"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MC);
-	else
-		ctx = NULL;
+	acpi_id = xgene_pmu_acpi_match_type(xgene_pmu_acpi_type_match, adev);
+	if (!acpi_id)
+		return AE_OK;
 
+	ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, (u32)acpi_id->driver_data);
 	if (!ctx)
 		return AE_OK;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
@ 2017-04-03 16:47   ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

This patch parses PMU Subnode from a match table.

Signed-off-by: Hoan Tran <hotran@apm.com>
---
 drivers/perf/xgene_pmu.c | 40 ++++++++++++++++++++++++++++++----------
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 35b5289..5ffd580 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -1047,9 +1047,35 @@ xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu,
 	return NULL;
 }
 
+static const struct acpi_device_id xgene_pmu_acpi_type_match[] = {
+	{"APMC0D5D", PMU_TYPE_L3C},
+	{"APMC0D5E", PMU_TYPE_IOB},
+	{"APMC0D5F", PMU_TYPE_MCB},
+	{"APMC0D60", PMU_TYPE_MC},
+	{},
+};
+
+static const struct acpi_device_id *xgene_pmu_acpi_match_type(
+					const struct acpi_device_id *ids,
+					struct acpi_device *adev)
+{
+	const struct acpi_device_id *match_id = NULL;
+	const struct acpi_device_id *id;
+
+	for (id = ids; id->id[0] || id->cls; id++) {
+		if (!acpi_match_device_ids(adev, id))
+			match_id = id;
+		else if (match_id)
+			break;
+	}
+
+	return match_id;
+}
+
 static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
 				    void *data, void **return_value)
 {
+	const struct acpi_device_id *acpi_id;
 	struct xgene_pmu *xgene_pmu = data;
 	struct xgene_pmu_dev_ctx *ctx;
 	struct acpi_device *adev;
@@ -1059,17 +1085,11 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
 	if (acpi_bus_get_status(adev) || !adev->status.present)
 		return AE_OK;
 
-	if (!strcmp(acpi_device_hid(adev), "APMC0D5D"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_L3C);
-	else if (!strcmp(acpi_device_hid(adev), "APMC0D5E"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_IOB);
-	else if (!strcmp(acpi_device_hid(adev), "APMC0D5F"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MCB);
-	else if (!strcmp(acpi_device_hid(adev), "APMC0D60"))
-		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MC);
-	else
-		ctx = NULL;
+	acpi_id = xgene_pmu_acpi_match_type(xgene_pmu_acpi_type_match, adev);
+	if (!acpi_id)
+		return AE_OK;
 
+	ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, (u32)acpi_id->driver_data);
 	if (!ctx)
 		return AE_OK;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 2/3] perf: xgene: Move PMU leaf functions into function pointer structure
  2017-04-03 16:47 ` Hoan Tran
@ 2017-04-03 16:47   ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, linux-kernel, linux-doc, Loc Ho, Hoan Tran

This patch moves PMU leaf functions into a function pointer structure.
It helps code maintain and expasion easier.

Signed-off-by: Hoan Tran <hotran@apm.com>
---
 drivers/perf/xgene_pmu.c | 85 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 66 insertions(+), 19 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 5ffd580..f34fc78 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -96,6 +96,23 @@ struct xgene_pmu_dev {
 	struct perf_event *pmu_counter_event[PMU_MAX_COUNTERS];
 };
 
+struct xgene_pmu_ops {
+	void (*mask_int)(struct xgene_pmu *pmu);
+	void (*unmask_int)(struct xgene_pmu *pmu);
+	u64 (*read_counter)(struct xgene_pmu_dev *pmu, int idx);
+	void (*write_counter)(struct xgene_pmu_dev *pmu, int idx, u64 val);
+	void (*write_evttype)(struct xgene_pmu_dev *pmu_dev, int idx, u32 val);
+	void (*write_agentmsk)(struct xgene_pmu_dev *pmu_dev, u32 val);
+	void (*write_agent1msk)(struct xgene_pmu_dev *pmu_dev, u32 val);
+	void (*enable_counter)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*disable_counter)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*enable_counter_int)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*disable_counter_int)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*reset_counters)(struct xgene_pmu_dev *pmu_dev);
+	void (*start_counters)(struct xgene_pmu_dev *pmu_dev);
+	void (*stop_counters)(struct xgene_pmu_dev *pmu_dev);
+};
+
 struct xgene_pmu {
 	struct device *dev;
 	int version;
@@ -104,6 +121,7 @@ struct xgene_pmu {
 	u32 mc_active_mask;
 	cpumask_t cpu;
 	raw_spinlock_t lock;
+	const struct xgene_pmu_ops *ops;
 	struct list_head l3cpmus;
 	struct list_head iobpmus;
 	struct list_head mcbpmus;
@@ -392,13 +410,14 @@ static inline void xgene_pmu_unmask_int(struct xgene_pmu *xgene_pmu)
 	writel(PCPPMU_INTCLRMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
 }
 
-static inline u32 xgene_pmu_read_counter(struct xgene_pmu_dev *pmu_dev, int idx)
+static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
+					   int idx)
 {
-	return readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
+	return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
 }
 
 static inline void
-xgene_pmu_write_counter(struct xgene_pmu_dev *pmu_dev, int idx, u32 val)
+xgene_pmu_write_counter32(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
 {
 	writel(val, pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
 }
@@ -491,20 +510,22 @@ static inline void xgene_pmu_stop_counters(struct xgene_pmu_dev *pmu_dev)
 static void xgene_perf_pmu_enable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
 			pmu_dev->max_counters);
 
 	if (!enabled)
 		return;
 
-	xgene_pmu_start_counters(pmu_dev);
+	xgene_pmu->ops->start_counters(pmu_dev);
 }
 
 static void xgene_perf_pmu_disable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 
-	xgene_pmu_stop_counters(pmu_dev);
+	xgene_pmu->ops->stop_counters(pmu_dev);
 }
 
 static int xgene_perf_event_init(struct perf_event *event)
@@ -572,27 +593,32 @@ static int xgene_perf_event_init(struct perf_event *event)
 static void xgene_perf_enable_event(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 
-	xgene_pmu_write_evttype(pmu_dev, GET_CNTR(event), GET_EVENTID(event));
-	xgene_pmu_write_agentmsk(pmu_dev, ~((u32)GET_AGENTID(event)));
+	xgene_pmu->ops->write_evttype(pmu_dev, GET_CNTR(event),
+				      GET_EVENTID(event));
+	xgene_pmu->ops->write_agentmsk(pmu_dev, ~((u32)GET_AGENTID(event)));
 	if (pmu_dev->inf->type == PMU_TYPE_IOB)
-		xgene_pmu_write_agent1msk(pmu_dev, ~((u32)GET_AGENT1ID(event)));
+		xgene_pmu->ops->write_agent1msk(pmu_dev,
+						~((u32)GET_AGENT1ID(event)));
 
-	xgene_pmu_enable_counter(pmu_dev, GET_CNTR(event));
-	xgene_pmu_enable_counter_int(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->enable_counter(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->enable_counter_int(pmu_dev, GET_CNTR(event));
 }
 
 static void xgene_perf_disable_event(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 
-	xgene_pmu_disable_counter(pmu_dev, GET_CNTR(event));
-	xgene_pmu_disable_counter_int(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->disable_counter(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->disable_counter_int(pmu_dev, GET_CNTR(event));
 }
 
 static void xgene_perf_event_set_period(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 	/*
 	 * The X-Gene PMU counters have a period of 2^32. To account for the
@@ -603,18 +629,19 @@ static void xgene_perf_event_set_period(struct perf_event *event)
 	u64 val = 1ULL << 31;
 
 	local64_set(&hw->prev_count, val);
-	xgene_pmu_write_counter(pmu_dev, hw->idx, (u32) val);
+	xgene_pmu->ops->write_counter(pmu_dev, hw->idx, val);
 }
 
 static void xgene_perf_event_update(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 	u64 delta, prev_raw_count, new_raw_count;
 
 again:
 	prev_raw_count = local64_read(&hw->prev_count);
-	new_raw_count = xgene_pmu_read_counter(pmu_dev, GET_CNTR(event));
+	new_raw_count = xgene_pmu->ops->read_counter(pmu_dev, GET_CNTR(event));
 
 	if (local64_cmpxchg(&hw->prev_count, prev_raw_count,
 			    new_raw_count) != prev_raw_count)
@@ -633,6 +660,7 @@ static void xgene_perf_read(struct perf_event *event)
 static void xgene_perf_start(struct perf_event *event, int flags)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 
 	if (WARN_ON_ONCE(!(hw->state & PERF_HES_STOPPED)))
@@ -646,8 +674,8 @@ static void xgene_perf_start(struct perf_event *event, int flags)
 	if (flags & PERF_EF_RELOAD) {
 		u64 prev_raw_count =  local64_read(&hw->prev_count);
 
-		xgene_pmu_write_counter(pmu_dev, GET_CNTR(event),
-					(u32) prev_raw_count);
+		xgene_pmu->ops->write_counter(pmu_dev, GET_CNTR(event),
+					      prev_raw_count);
 	}
 
 	xgene_perf_enable_event(event);
@@ -736,8 +764,8 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 	};
 
 	/* Hardware counter init */
-	xgene_pmu_stop_counters(pmu_dev);
-	xgene_pmu_reset_counters(pmu_dev);
+	xgene_pmu->ops->stop_counters(pmu_dev);
+	xgene_pmu->ops->reset_counters(pmu_dev);
 
 	return perf_pmu_register(&pmu_dev->pmu, name, -1);
 }
@@ -1255,6 +1283,23 @@ static int xgene_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 	.id   = PCP_PMU_V2,
 };
 
+const struct xgene_pmu_ops xgene_pmu_ops = {
+	.mask_int = xgene_pmu_mask_int,
+	.unmask_int = xgene_pmu_unmask_int,
+	.read_counter = xgene_pmu_read_counter32,
+	.write_counter = xgene_pmu_write_counter32,
+	.write_evttype = xgene_pmu_write_evttype,
+	.write_agentmsk = xgene_pmu_write_agentmsk,
+	.write_agent1msk = xgene_pmu_write_agent1msk,
+	.enable_counter = xgene_pmu_enable_counter,
+	.disable_counter = xgene_pmu_disable_counter,
+	.enable_counter_int = xgene_pmu_enable_counter_int,
+	.disable_counter_int = xgene_pmu_disable_counter_int,
+	.reset_counters = xgene_pmu_reset_counters,
+	.start_counters = xgene_pmu_start_counters,
+	.stop_counters = xgene_pmu_stop_counters,
+};
+
 static const struct of_device_id xgene_pmu_of_match[] = {
 	{ .compatible	= "apm,xgene-pmu",	.data = &xgene_pmu_data },
 	{ .compatible	= "apm,xgene-pmu-v2",	.data = &xgene_pmu_v2_data },
@@ -1304,6 +1349,8 @@ static int xgene_pmu_probe(struct platform_device *pdev)
 	if (version < 0)
 		return -ENODEV;
 
+	xgene_pmu->ops = &xgene_pmu_ops;
+
 	INIT_LIST_HEAD(&xgene_pmu->l3cpmus);
 	INIT_LIST_HEAD(&xgene_pmu->iobpmus);
 	INIT_LIST_HEAD(&xgene_pmu->mcbpmus);
@@ -1362,7 +1409,7 @@ static int xgene_pmu_probe(struct platform_device *pdev)
 	}
 
 	/* Enable interrupt */
-	xgene_pmu_unmask_int(xgene_pmu);
+	xgene_pmu->ops->unmask_int(xgene_pmu);
 
 	return 0;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 2/3] perf: xgene: Move PMU leaf functions into function pointer structure
@ 2017-04-03 16:47   ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

This patch moves PMU leaf functions into a function pointer structure.
It helps code maintain and expasion easier.

Signed-off-by: Hoan Tran <hotran@apm.com>
---
 drivers/perf/xgene_pmu.c | 85 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 66 insertions(+), 19 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index 5ffd580..f34fc78 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -96,6 +96,23 @@ struct xgene_pmu_dev {
 	struct perf_event *pmu_counter_event[PMU_MAX_COUNTERS];
 };
 
+struct xgene_pmu_ops {
+	void (*mask_int)(struct xgene_pmu *pmu);
+	void (*unmask_int)(struct xgene_pmu *pmu);
+	u64 (*read_counter)(struct xgene_pmu_dev *pmu, int idx);
+	void (*write_counter)(struct xgene_pmu_dev *pmu, int idx, u64 val);
+	void (*write_evttype)(struct xgene_pmu_dev *pmu_dev, int idx, u32 val);
+	void (*write_agentmsk)(struct xgene_pmu_dev *pmu_dev, u32 val);
+	void (*write_agent1msk)(struct xgene_pmu_dev *pmu_dev, u32 val);
+	void (*enable_counter)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*disable_counter)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*enable_counter_int)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*disable_counter_int)(struct xgene_pmu_dev *pmu_dev, int idx);
+	void (*reset_counters)(struct xgene_pmu_dev *pmu_dev);
+	void (*start_counters)(struct xgene_pmu_dev *pmu_dev);
+	void (*stop_counters)(struct xgene_pmu_dev *pmu_dev);
+};
+
 struct xgene_pmu {
 	struct device *dev;
 	int version;
@@ -104,6 +121,7 @@ struct xgene_pmu {
 	u32 mc_active_mask;
 	cpumask_t cpu;
 	raw_spinlock_t lock;
+	const struct xgene_pmu_ops *ops;
 	struct list_head l3cpmus;
 	struct list_head iobpmus;
 	struct list_head mcbpmus;
@@ -392,13 +410,14 @@ static inline void xgene_pmu_unmask_int(struct xgene_pmu *xgene_pmu)
 	writel(PCPPMU_INTCLRMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
 }
 
-static inline u32 xgene_pmu_read_counter(struct xgene_pmu_dev *pmu_dev, int idx)
+static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
+					   int idx)
 {
-	return readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
+	return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
 }
 
 static inline void
-xgene_pmu_write_counter(struct xgene_pmu_dev *pmu_dev, int idx, u32 val)
+xgene_pmu_write_counter32(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
 {
 	writel(val, pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
 }
@@ -491,20 +510,22 @@ static inline void xgene_pmu_stop_counters(struct xgene_pmu_dev *pmu_dev)
 static void xgene_perf_pmu_enable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	int enabled = bitmap_weight(pmu_dev->cntr_assign_mask,
 			pmu_dev->max_counters);
 
 	if (!enabled)
 		return;
 
-	xgene_pmu_start_counters(pmu_dev);
+	xgene_pmu->ops->start_counters(pmu_dev);
 }
 
 static void xgene_perf_pmu_disable(struct pmu *pmu)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 
-	xgene_pmu_stop_counters(pmu_dev);
+	xgene_pmu->ops->stop_counters(pmu_dev);
 }
 
 static int xgene_perf_event_init(struct perf_event *event)
@@ -572,27 +593,32 @@ static int xgene_perf_event_init(struct perf_event *event)
 static void xgene_perf_enable_event(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 
-	xgene_pmu_write_evttype(pmu_dev, GET_CNTR(event), GET_EVENTID(event));
-	xgene_pmu_write_agentmsk(pmu_dev, ~((u32)GET_AGENTID(event)));
+	xgene_pmu->ops->write_evttype(pmu_dev, GET_CNTR(event),
+				      GET_EVENTID(event));
+	xgene_pmu->ops->write_agentmsk(pmu_dev, ~((u32)GET_AGENTID(event)));
 	if (pmu_dev->inf->type == PMU_TYPE_IOB)
-		xgene_pmu_write_agent1msk(pmu_dev, ~((u32)GET_AGENT1ID(event)));
+		xgene_pmu->ops->write_agent1msk(pmu_dev,
+						~((u32)GET_AGENT1ID(event)));
 
-	xgene_pmu_enable_counter(pmu_dev, GET_CNTR(event));
-	xgene_pmu_enable_counter_int(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->enable_counter(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->enable_counter_int(pmu_dev, GET_CNTR(event));
 }
 
 static void xgene_perf_disable_event(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 
-	xgene_pmu_disable_counter(pmu_dev, GET_CNTR(event));
-	xgene_pmu_disable_counter_int(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->disable_counter(pmu_dev, GET_CNTR(event));
+	xgene_pmu->ops->disable_counter_int(pmu_dev, GET_CNTR(event));
 }
 
 static void xgene_perf_event_set_period(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 	/*
 	 * The X-Gene PMU counters have a period of 2^32. To account for the
@@ -603,18 +629,19 @@ static void xgene_perf_event_set_period(struct perf_event *event)
 	u64 val = 1ULL << 31;
 
 	local64_set(&hw->prev_count, val);
-	xgene_pmu_write_counter(pmu_dev, hw->idx, (u32) val);
+	xgene_pmu->ops->write_counter(pmu_dev, hw->idx, val);
 }
 
 static void xgene_perf_event_update(struct perf_event *event)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 	u64 delta, prev_raw_count, new_raw_count;
 
 again:
 	prev_raw_count = local64_read(&hw->prev_count);
-	new_raw_count = xgene_pmu_read_counter(pmu_dev, GET_CNTR(event));
+	new_raw_count = xgene_pmu->ops->read_counter(pmu_dev, GET_CNTR(event));
 
 	if (local64_cmpxchg(&hw->prev_count, prev_raw_count,
 			    new_raw_count) != prev_raw_count)
@@ -633,6 +660,7 @@ static void xgene_perf_read(struct perf_event *event)
 static void xgene_perf_start(struct perf_event *event, int flags)
 {
 	struct xgene_pmu_dev *pmu_dev = to_pmu_dev(event->pmu);
+	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 
 	if (WARN_ON_ONCE(!(hw->state & PERF_HES_STOPPED)))
@@ -646,8 +674,8 @@ static void xgene_perf_start(struct perf_event *event, int flags)
 	if (flags & PERF_EF_RELOAD) {
 		u64 prev_raw_count =  local64_read(&hw->prev_count);
 
-		xgene_pmu_write_counter(pmu_dev, GET_CNTR(event),
-					(u32) prev_raw_count);
+		xgene_pmu->ops->write_counter(pmu_dev, GET_CNTR(event),
+					      prev_raw_count);
 	}
 
 	xgene_perf_enable_event(event);
@@ -736,8 +764,8 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 	};
 
 	/* Hardware counter init */
-	xgene_pmu_stop_counters(pmu_dev);
-	xgene_pmu_reset_counters(pmu_dev);
+	xgene_pmu->ops->stop_counters(pmu_dev);
+	xgene_pmu->ops->reset_counters(pmu_dev);
 
 	return perf_pmu_register(&pmu_dev->pmu, name, -1);
 }
@@ -1255,6 +1283,23 @@ static int xgene_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 	.id   = PCP_PMU_V2,
 };
 
+const struct xgene_pmu_ops xgene_pmu_ops = {
+	.mask_int = xgene_pmu_mask_int,
+	.unmask_int = xgene_pmu_unmask_int,
+	.read_counter = xgene_pmu_read_counter32,
+	.write_counter = xgene_pmu_write_counter32,
+	.write_evttype = xgene_pmu_write_evttype,
+	.write_agentmsk = xgene_pmu_write_agentmsk,
+	.write_agent1msk = xgene_pmu_write_agent1msk,
+	.enable_counter = xgene_pmu_enable_counter,
+	.disable_counter = xgene_pmu_disable_counter,
+	.enable_counter_int = xgene_pmu_enable_counter_int,
+	.disable_counter_int = xgene_pmu_disable_counter_int,
+	.reset_counters = xgene_pmu_reset_counters,
+	.start_counters = xgene_pmu_start_counters,
+	.stop_counters = xgene_pmu_stop_counters,
+};
+
 static const struct of_device_id xgene_pmu_of_match[] = {
 	{ .compatible	= "apm,xgene-pmu",	.data = &xgene_pmu_data },
 	{ .compatible	= "apm,xgene-pmu-v2",	.data = &xgene_pmu_v2_data },
@@ -1304,6 +1349,8 @@ static int xgene_pmu_probe(struct platform_device *pdev)
 	if (version < 0)
 		return -ENODEV;
 
+	xgene_pmu->ops = &xgene_pmu_ops;
+
 	INIT_LIST_HEAD(&xgene_pmu->l3cpmus);
 	INIT_LIST_HEAD(&xgene_pmu->iobpmus);
 	INIT_LIST_HEAD(&xgene_pmu->mcbpmus);
@@ -1362,7 +1409,7 @@ static int xgene_pmu_probe(struct platform_device *pdev)
 	}
 
 	/* Enable interrupt */
-	xgene_pmu_unmask_int(xgene_pmu);
+	xgene_pmu->ops->unmask_int(xgene_pmu);
 
 	return 0;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3
  2017-04-03 16:47 ` Hoan Tran
@ 2017-04-03 16:47   ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, linux-kernel, linux-doc, Loc Ho, Hoan Tran

This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
Unit version 3.

It can support up to
 - 2 IOB PMU instances
 - 8 L3C PMU instances
 - 2 MCB PMU instances
 - 8 MCU PMU instances

Signed-off-by: Hoan Tran <hotran@apm.com>
---
 drivers/perf/xgene_pmu.c | 572 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 524 insertions(+), 48 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index f34fc78..a72814d 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -37,6 +37,8 @@
 
 #define CSW_CSWCR                       0x0000
 #define  CSW_CSWCR_DUALMCB_MASK         BIT(0)
+#define  CSW_CSWCR_MCB0_ROUTING(x)	(((x) & 0x0C) >> 2)
+#define  CSW_CSWCR_MCB1_ROUTING(x)	(((x) & 0x30) >> 4)
 #define MCBADDRMR                       0x0000
 #define  MCBADDRMR_DUALMCU_MODE_MASK    BIT(2)
 
@@ -50,8 +52,17 @@
 #define  PCPPMU_INT_L3C		BIT(2)
 #define  PCPPMU_INT_IOB		BIT(3)
 
+#define  PCPPMU_V3_INTMASK	0x00FF33FF
+#define  PCPPMU_V3_INTENMASK	0xFFFFFFFF
+#define  PCPPMU_V3_INTCLRMASK	0xFF00CC00
+#define  PCPPMU_V3_INT_MCU	0x000000FF
+#define  PCPPMU_V3_INT_MCB	0x00000300
+#define  PCPPMU_V3_INT_L3C	0x00FF0000
+#define  PCPPMU_V3_INT_IOB	0x00003000
+
 #define PMU_MAX_COUNTERS	4
-#define PMU_CNT_MAX_PERIOD	0x100000000ULL
+#define PMU_CNT_MAX_PERIOD	0xFFFFFFFFULL
+#define PMU_V3_CNT_MAX_PERIOD	0xFFFFFFFFFFFFFFFFULL
 #define PMU_OVERFLOW_MASK	0xF
 #define PMU_PMCR_E		BIT(0)
 #define PMU_PMCR_P		BIT(1)
@@ -73,6 +84,10 @@
 #define PMU_PMOVSR		0xC80
 #define PMU_PMCR		0xE04
 
+/* PMU registers for V3 */
+#define PMU_PMOVSCLR		0xC80
+#define PMU_PMOVSSET		0xCC0
+
 #define to_pmu_dev(p)     container_of(p, struct xgene_pmu_dev, pmu)
 #define GET_CNTR(ev)      (ev->hw.idx)
 #define GET_EVENTID(ev)   (ev->hw.config & 0xFFULL)
@@ -119,6 +134,7 @@ struct xgene_pmu {
 	void __iomem *pcppmu_csr;
 	u32 mcb_active_mask;
 	u32 mc_active_mask;
+	u32 l3c_active_mask;
 	cpumask_t cpu;
 	raw_spinlock_t lock;
 	const struct xgene_pmu_ops *ops;
@@ -143,11 +159,13 @@ struct xgene_pmu_data {
 enum xgene_pmu_version {
 	PCP_PMU_V1 = 1,
 	PCP_PMU_V2,
+	PCP_PMU_V3,
 };
 
 enum xgene_pmu_dev_type {
 	PMU_TYPE_L3C = 0,
 	PMU_TYPE_IOB,
+	PMU_TYPE_IOB_SLOW,
 	PMU_TYPE_MCB,
 	PMU_TYPE_MC,
 };
@@ -213,6 +231,56 @@ static ssize_t xgene_pmu_format_show(struct device *dev,
 	.attrs = mc_pmu_format_attrs,
 };
 
+static struct attribute *l3c_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(l3c_eventid, "config:0-39"),
+	NULL,
+};
+
+static struct attribute *iob_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(iob_eventid, "config:0-47"),
+	NULL,
+};
+
+static struct attribute *iob_slow_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(iob_slow_eventid, "config:0-16"),
+	NULL,
+};
+
+static struct attribute *mcb_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(mcb_eventid, "config:0-35"),
+	NULL,
+};
+
+static struct attribute *mc_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(mc_eventid, "config:0-44"),
+	NULL,
+};
+
+static const struct attribute_group l3c_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = l3c_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group iob_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = iob_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group iob_slow_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = iob_slow_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group mcb_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = mcb_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group mc_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = mc_pmu_v3_format_attrs,
+};
+
 /*
  * sysfs event attributes
  */
@@ -329,6 +397,219 @@ static ssize_t xgene_pmu_event_show(struct device *dev,
 	.attrs = mc_pmu_events_attrs,
 };
 
+static struct attribute *l3c_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(read-hit,				0x01),
+	XGENE_PMU_EVENT_ATTR(read-miss,				0x02),
+	XGENE_PMU_EVENT_ATTR(index-flush-eviction,		0x03),
+	XGENE_PMU_EVENT_ATTR(write-caused-replacement,		0x04),
+	XGENE_PMU_EVENT_ATTR(write-not-caused-replacement,	0x05),
+	XGENE_PMU_EVENT_ATTR(clean-eviction,			0x06),
+	XGENE_PMU_EVENT_ATTR(dirty-eviction,			0x07),
+	XGENE_PMU_EVENT_ATTR(reads,				0x08),
+	XGENE_PMU_EVENT_ATTR(writes,				0x09),
+	XGENE_PMU_EVENT_ATTR(requests,				0x0a),
+	XGENE_PMU_EVENT_ATTR(tq-bank-conflict-issue-stall,	0x0b),
+	XGENE_PMU_EVENT_ATTR(tq-full,				0x0c),
+	XGENE_PMU_EVENT_ATTR(ackq-full,				0x0d),
+	XGENE_PMU_EVENT_ATTR(wdb-full,				0x0e),
+	XGENE_PMU_EVENT_ATTR(odb-full,				0x10),
+	XGENE_PMU_EVENT_ATTR(wbq-full,				0x11),
+	XGENE_PMU_EVENT_ATTR(input-req-async-fifo-stall,	0x12),
+	XGENE_PMU_EVENT_ATTR(output-req-async-fifo-stall,	0x13),
+	XGENE_PMU_EVENT_ATTR(output-data-async-fifo-stall,	0x14),
+	XGENE_PMU_EVENT_ATTR(total-insertions,			0x15),
+	XGENE_PMU_EVENT_ATTR(sip-insertions-r-set,		0x16),
+	XGENE_PMU_EVENT_ATTR(sip-insertions-r-clear,		0x17),
+	XGENE_PMU_EVENT_ATTR(dip-insertions-r-set,		0x18),
+	XGENE_PMU_EVENT_ATTR(dip-insertions-r-clear,		0x19),
+	XGENE_PMU_EVENT_ATTR(dip-insertions-force-r-set,	0x1a),
+	XGENE_PMU_EVENT_ATTR(egressions,			0x1b),
+	XGENE_PMU_EVENT_ATTR(replacements,			0x1c),
+	XGENE_PMU_EVENT_ATTR(old-replacements,			0x1d),
+	XGENE_PMU_EVENT_ATTR(young-replacements,		0x1e),
+	XGENE_PMU_EVENT_ATTR(r-set-replacements,		0x1f),
+	XGENE_PMU_EVENT_ATTR(r-clear-replacements,		0x20),
+	XGENE_PMU_EVENT_ATTR(old-r-replacements,		0x21),
+	XGENE_PMU_EVENT_ATTR(old-nr-replacements,		0x22),
+	XGENE_PMU_EVENT_ATTR(young-r-replacements,		0x23),
+	XGENE_PMU_EVENT_ATTR(young-nr-replacements,		0x24),
+	XGENE_PMU_EVENT_ATTR(bloomfilter-clearings,		0x25),
+	XGENE_PMU_EVENT_ATTR(generation-flips,			0x26),
+	XGENE_PMU_EVENT_ATTR(vcc-droop-detected,		0x27),
+	NULL,
+};
+
+static struct attribute *iob_fast_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-all,		0x01),
+	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-rd,		0x02),
+	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-wr,		0x03),
+	XGENE_PMU_EVENT_ATTR(PA-all-CP-req,			0x04),
+	XGENE_PMU_EVENT_ATTR(PA-CP-blk-req,			0x05),
+	XGENE_PMU_EVENT_ATTR(PA-CP-ptl-req,			0x06),
+	XGENE_PMU_EVENT_ATTR(PA-CP-rd-req,			0x07),
+	XGENE_PMU_EVENT_ATTR(PA-CP-wr-req,			0x08),
+	XGENE_PMU_EVENT_ATTR(BA-all-req,			0x09),
+	XGENE_PMU_EVENT_ATTR(BA-rd-req,				0x0a),
+	XGENE_PMU_EVENT_ATTR(BA-wr-req,				0x0b),
+	XGENE_PMU_EVENT_ATTR(PA-rd-shared-req-issued,		0x10),
+	XGENE_PMU_EVENT_ATTR(PA-rd-exclusive-req-issued,	0x11),
+	XGENE_PMU_EVENT_ATTR(PA-wr-invalidate-req-issued-stashable, 0x12),
+	XGENE_PMU_EVENT_ATTR(PA-wr-invalidate-req-issued-nonstashable, 0x13),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-req-issued-stashable,	0x14),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-req-issued-nonstashable, 0x15),
+	XGENE_PMU_EVENT_ATTR(PA-ptl-wr-req,			0x16),
+	XGENE_PMU_EVENT_ATTR(PA-ptl-rd-req,			0x17),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-clean-data,		0x18),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-cancelled-on-SS,	0x1b),
+	XGENE_PMU_EVENT_ATTR(PA-barrier-occurrence,		0x1c),
+	XGENE_PMU_EVENT_ATTR(PA-barrier-cycles,			0x1d),
+	XGENE_PMU_EVENT_ATTR(PA-total-CP-snoops,		0x20),
+	XGENE_PMU_EVENT_ATTR(PA-rd-shared-snoop,		0x21),
+	XGENE_PMU_EVENT_ATTR(PA-rd-shared-snoop-hit,		0x22),
+	XGENE_PMU_EVENT_ATTR(PA-rd-exclusive-snoop,		0x23),
+	XGENE_PMU_EVENT_ATTR(PA-rd-exclusive-snoop-hit,		0x24),
+	XGENE_PMU_EVENT_ATTR(PA-rd-wr-invalid-snoop,		0x25),
+	XGENE_PMU_EVENT_ATTR(PA-rd-wr-invalid-snoop-hit,	0x26),
+	XGENE_PMU_EVENT_ATTR(PA-req-buffer-full,		0x28),
+	XGENE_PMU_EVENT_ATTR(Cswlf-outbound-req-fifo-full,	0x29),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-snoop-fifo-backpressure, 0x2a),
+	XGENE_PMU_EVENT_ATTR(Cswlf-outbound-lack-fifo-full,	0x2b),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-gack-fifo-backpressure, 0x2c),
+	XGENE_PMU_EVENT_ATTR(Cswlf-outbound-data-fifo-full,	0x2d),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-data-fifo-backpressure, 0x2e),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-req-backpressure,	0x2f),
+	NULL,
+};
+
+static struct attribute *iob_slow_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(PA-AXI0-rd-req,			0x01),
+	XGENE_PMU_EVENT_ATTR(PA-AXI0-wr-req,			0x02),
+	XGENE_PMU_EVENT_ATTR(PA-AXI1-rd-req,			0x03),
+	XGENE_PMU_EVENT_ATTR(PA-AXI1-wr-req,			0x04),
+	XGENE_PMU_EVENT_ATTR(BA-all-AXI-req,			0x07),
+	XGENE_PMU_EVENT_ATTR(BA-AXI-rd-req,			0x08),
+	XGENE_PMU_EVENT_ATTR(BA-AXI-wr-req,			0x09),
+	XGENE_PMU_EVENT_ATTR(BA-free-list-empty,		0x10),
+	NULL,
+};
+
+static struct attribute *mcb_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(req-receive,			0x01),
+	XGENE_PMU_EVENT_ATTR(rd-req-recv,			0x02),
+	XGENE_PMU_EVENT_ATTR(rd-req-recv-2,			0x03),
+	XGENE_PMU_EVENT_ATTR(wr-req-recv,			0x04),
+	XGENE_PMU_EVENT_ATTR(wr-req-recv-2,			0x05),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-mcu,		0x06),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-mcu-2,		0x07),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-spec-mcu,		0x08),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-spec-mcu-2,		0x09),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-recv-for-rd-sent-to-spec-mcu, 0x0a),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-go-recv-for-rd-sent-to-spec-mcu, 0x0b),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-nogo-recv-for-rd-sent-to-spec-mcu, 0x0c),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-go-recv-any-rd-req,	0x0d),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-go-recv-any-rd-req-2,	0x0e),
+	XGENE_PMU_EVENT_ATTR(wr-req-sent-to-mcu,		0x0f),
+	XGENE_PMU_EVENT_ATTR(gack-recv,				0x10),
+	XGENE_PMU_EVENT_ATTR(rd-gack-recv,			0x11),
+	XGENE_PMU_EVENT_ATTR(wr-gack-recv,			0x12),
+	XGENE_PMU_EVENT_ATTR(cancel-rd-gack,			0x13),
+	XGENE_PMU_EVENT_ATTR(cancel-wr-gack,			0x14),
+	XGENE_PMU_EVENT_ATTR(mcb-csw-req-stall,			0x15),
+	XGENE_PMU_EVENT_ATTR(mcu-req-intf-blocked,		0x16),
+	XGENE_PMU_EVENT_ATTR(mcb-mcu-rd-intf-stall,		0x17),
+	XGENE_PMU_EVENT_ATTR(csw-rd-intf-blocked,		0x18),
+	XGENE_PMU_EVENT_ATTR(csw-local-ack-intf-blocked,	0x19),
+	XGENE_PMU_EVENT_ATTR(mcu-req-table-full,		0x1a),
+	XGENE_PMU_EVENT_ATTR(mcu-stat-table-full,		0x1b),
+	XGENE_PMU_EVENT_ATTR(mcu-wr-table-full,			0x1c),
+	XGENE_PMU_EVENT_ATTR(mcu-rdreceipt-resp,		0x1d),
+	XGENE_PMU_EVENT_ATTR(mcu-wrcomplete-resp,		0x1e),
+	XGENE_PMU_EVENT_ATTR(mcu-retryack-resp,			0x1f),
+	XGENE_PMU_EVENT_ATTR(mcu-pcrdgrant-resp,		0x20),
+	XGENE_PMU_EVENT_ATTR(mcu-req-from-lastload,		0x21),
+	XGENE_PMU_EVENT_ATTR(mcu-req-from-bypass,		0x22),
+	XGENE_PMU_EVENT_ATTR(volt-droop-detect,			0x23),
+	NULL,
+};
+
+static struct attribute *mc_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(pmu-act-sent,			0x01),
+	XGENE_PMU_EVENT_ATTR(pmu-pre-sent,			0x02),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-sent,			0x03),
+	XGENE_PMU_EVENT_ATTR(pmu-rda-sent,			0x04),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-sent,			0x05),
+	XGENE_PMU_EVENT_ATTR(pmu-wra-sent,			0x06),
+	XGENE_PMU_EVENT_ATTR(pmu-pd-entry-vld,			0x07),
+	XGENE_PMU_EVENT_ATTR(pmu-sref-entry-vld,		0x08),
+	XGENE_PMU_EVENT_ATTR(pmu-prea-sent,			0x09),
+	XGENE_PMU_EVENT_ATTR(pmu-ref-sent,			0x0a),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-rda-sent,			0x0b),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-wra-sent,			0x0c),
+	XGENE_PMU_EVENT_ATTR(pmu-raw-hazard,			0x0d),
+	XGENE_PMU_EVENT_ATTR(pmu-war-hazard,			0x0e),
+	XGENE_PMU_EVENT_ATTR(pmu-waw-hazard,			0x0f),
+	XGENE_PMU_EVENT_ATTR(pmu-rar-hazard,			0x10),
+	XGENE_PMU_EVENT_ATTR(pmu-raw-war-waw-hazard,		0x11),
+	XGENE_PMU_EVENT_ATTR(pmu-hprd-lprd-wr-req-vld,		0x12),
+	XGENE_PMU_EVENT_ATTR(pmu-lprd-req-vld,			0x13),
+	XGENE_PMU_EVENT_ATTR(pmu-hprd-req-vld,			0x14),
+	XGENE_PMU_EVENT_ATTR(pmu-hprd-lprd-req-vld,		0x15),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-req-vld,			0x16),
+	XGENE_PMU_EVENT_ATTR(pmu-partial-wr-req-vld,		0x17),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-retry,			0x18),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-retry,			0x19),
+	XGENE_PMU_EVENT_ATTR(pmu-retry-gnt,			0x1a),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-change,			0x1b),
+	XGENE_PMU_EVENT_ATTR(pmu-dir-change,			0x1c),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-dir-change,		0x1d),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-active,			0x1e),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-idle,			0x1f),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-pd,			0x20),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-sref,			0x21),
+	XGENE_PMU_EVENT_ATTR(pmu-queue-fill-gt-thresh,		0x22),
+	XGENE_PMU_EVENT_ATTR(pmu-queue-rds-gt-thresh,		0x23),
+	XGENE_PMU_EVENT_ATTR(pmu-queue-wrs-gt-thresh,		0x24),
+	XGENE_PMU_EVENT_ATTR(pmu-phy-updt-complt,		0x25),
+	XGENE_PMU_EVENT_ATTR(pmu-tz-fail,			0x26),
+	XGENE_PMU_EVENT_ATTR(pmu-dram-errc,			0x27),
+	XGENE_PMU_EVENT_ATTR(pmu-dram-errd,			0x28),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-enq,			0x29),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-enq,			0x2a),
+	XGENE_PMU_EVENT_ATTR(pmu-tmac-limit-reached,		0x2b),
+	XGENE_PMU_EVENT_ATTR(pmu-tmaw-tracker-full,		0x2c),
+	NULL,
+};
+
+static const struct attribute_group l3c_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = l3c_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group iob_fast_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = iob_fast_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group iob_slow_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = iob_slow_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group mcb_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = mcb_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group mc_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = mc_pmu_v3_events_attrs,
+};
+
 /*
  * sysfs cpumask attributes
  */
@@ -352,7 +633,7 @@ static ssize_t xgene_pmu_cpumask_show(struct device *dev,
 };
 
 /*
- * Per PMU device attribute groups
+ * Per PMU device attribute groups of PMU v1 and v2
  */
 static const struct attribute_group *l3c_pmu_attr_groups[] = {
 	&l3c_pmu_format_attr_group,
@@ -382,6 +663,44 @@ static ssize_t xgene_pmu_cpumask_show(struct device *dev,
 	NULL
 };
 
+/*
+ * Per PMU device attribute groups of PMU v3
+ */
+static const struct attribute_group *l3c_pmu_v3_attr_groups[] = {
+	&l3c_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&l3c_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *iob_fast_pmu_v3_attr_groups[] = {
+	&iob_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&iob_fast_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *iob_slow_pmu_v3_attr_groups[] = {
+	&iob_slow_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&iob_slow_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *mcb_pmu_v3_attr_groups[] = {
+	&mcb_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&mcb_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *mc_pmu_v3_attr_groups[] = {
+	&mc_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&mc_pmu_v3_events_attr_group,
+	NULL
+};
+
 static int get_next_avail_cntr(struct xgene_pmu_dev *pmu_dev)
 {
 	int cntr;
@@ -405,17 +724,47 @@ static inline void xgene_pmu_mask_int(struct xgene_pmu *xgene_pmu)
 	writel(PCPPMU_INTENMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
 }
 
+static inline void xgene_pmu_v3_mask_int(struct xgene_pmu *xgene_pmu)
+{
+	writel(PCPPMU_V3_INTENMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
+}
+
 static inline void xgene_pmu_unmask_int(struct xgene_pmu *xgene_pmu)
 {
 	writel(PCPPMU_INTCLRMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
 }
 
+static inline void xgene_pmu_v3_unmask_int(struct xgene_pmu *xgene_pmu)
+{
+	writel(PCPPMU_V3_INTCLRMASK,
+	       xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
+}
+
 static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
 					   int idx)
 {
 	return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
 }
 
+static inline u64 xgene_pmu_read_counter64(struct xgene_pmu_dev *pmu_dev,
+					   int idx)
+{
+	u32 lo, hi;
+
+	/*
+	 * v3 has 64-bit counter registers composed by 2 32-bit registers
+	 * This can be a problem if the counter increases and carries
+	 * out of bit [31] between 2 reads. The extra reads would help
+	 * to prevent this issue.
+	 */
+	do {
+		hi = xgene_pmu_read_counter32(pmu_dev, 2 * idx + 1);
+		lo = xgene_pmu_read_counter32(pmu_dev, 2 * idx);
+	} while (hi != xgene_pmu_read_counter32(pmu_dev, 2 * idx + 1));
+
+	return (((u64)hi << 32) | lo);
+}
+
 static inline void
 xgene_pmu_write_counter32(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
 {
@@ -423,6 +772,19 @@ static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
 }
 
 static inline void
+xgene_pmu_write_counter64(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
+{
+	u32 cnt_lo, cnt_hi;
+
+	cnt_lo = val & 0xFFFFFFFF;
+	cnt_hi = val >> 32;
+
+	/* v3 has 64-bit counter registers composed by 2 32-bit registers */
+	xgene_pmu_write_counter32(pmu_dev, 2 * idx, val);
+	xgene_pmu_write_counter32(pmu_dev, 2 * idx + 1, val >> 32);
+}
+
+static inline void
 xgene_pmu_write_evttype(struct xgene_pmu_dev *pmu_dev, int idx, u32 val)
 {
 	writel(val, pmu_dev->inf->csr + PMU_PMEVTYPER0 + (4 * idx));
@@ -435,12 +797,18 @@ static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
 }
 
 static inline void
+xgene_pmu_v3_write_agentmsk(struct xgene_pmu_dev *pmu_dev, u32 val) { }
+
+static inline void
 xgene_pmu_write_agent1msk(struct xgene_pmu_dev *pmu_dev, u32 val)
 {
 	writel(val, pmu_dev->inf->csr + PMU_PMAMR1);
 }
 
 static inline void
+xgene_pmu_v3_write_agent1msk(struct xgene_pmu_dev *pmu_dev, u32 val) { }
+
+static inline void
 xgene_pmu_enable_counter(struct xgene_pmu_dev *pmu_dev, int idx)
 {
 	u32 val;
@@ -621,7 +989,7 @@ static void xgene_perf_event_set_period(struct perf_event *event)
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 	/*
-	 * The X-Gene PMU counters have a period of 2^32. To account for the
+	 * The X-Gene PMU counters have a period of 2^32 or more. To account for the
 	 * possiblity of extreme interrupt latency we program for a period of
 	 * half that. Hopefully we can handle the interrupt before another 2^31
 	 * events occur and the counter overtakes its previous value.
@@ -741,7 +1109,10 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 {
 	struct xgene_pmu *xgene_pmu;
 
-	pmu_dev->max_period = PMU_CNT_MAX_PERIOD - 1;
+	if (pmu_dev->parent->version == PCP_PMU_V3)
+		pmu_dev->max_period = PMU_V3_CNT_MAX_PERIOD;
+	else
+		pmu_dev->max_period = PMU_CNT_MAX_PERIOD;
 	/* First version PMU supports only single event counter */
 	xgene_pmu = pmu_dev->parent;
 	if (xgene_pmu->version == PCP_PMU_V1)
@@ -786,20 +1157,38 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 
 	switch (pmu->inf->type) {
 	case PMU_TYPE_L3C:
-		pmu->attr_groups = l3c_pmu_attr_groups;
+		if (!(xgene_pmu->l3c_active_mask & pmu->inf->enable_mask))
+			goto dev_err;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = l3c_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = l3c_pmu_attr_groups;
 		break;
 	case PMU_TYPE_IOB:
-		pmu->attr_groups = iob_pmu_attr_groups;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = iob_fast_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = iob_pmu_attr_groups;
+		break;
+	case PMU_TYPE_IOB_SLOW:
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = iob_slow_pmu_v3_attr_groups;
 		break;
 	case PMU_TYPE_MCB:
 		if (!(xgene_pmu->mcb_active_mask & pmu->inf->enable_mask))
 			goto dev_err;
-		pmu->attr_groups = mcb_pmu_attr_groups;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = mcb_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = mcb_pmu_attr_groups;
 		break;
 	case PMU_TYPE_MC:
 		if (!(xgene_pmu->mc_active_mask & pmu->inf->enable_mask))
 			goto dev_err;
-		pmu->attr_groups = mc_pmu_attr_groups;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = mc_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = mc_pmu_attr_groups;
 		break;
 	default:
 		return -EINVAL;
@@ -823,18 +1212,25 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 static void _xgene_pmu_isr(int irq, struct xgene_pmu_dev *pmu_dev)
 {
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
+	void __iomem *csr = pmu_dev->inf->csr;
 	u32 pmovsr;
 	int idx;
 
-	pmovsr = readl(pmu_dev->inf->csr + PMU_PMOVSR) & PMU_OVERFLOW_MASK;
+	if (xgene_pmu->version == PCP_PMU_V3)
+		pmovsr = readl(csr + PMU_PMOVSSET) & PMU_OVERFLOW_MASK;
+	else
+		pmovsr = readl(csr + PMU_PMOVSR) & PMU_OVERFLOW_MASK;
+
 	if (!pmovsr)
 		return;
 
 	/* Clear interrupt flag */
 	if (xgene_pmu->version == PCP_PMU_V1)
-		writel(0x0, pmu_dev->inf->csr + PMU_PMOVSR);
+		writel(0x0, csr + PMU_PMOVSR);
+	else if (xgene_pmu->version == PCP_PMU_V2)
+		writel(pmovsr, csr + PMU_PMOVSR);
 	else
-		writel(pmovsr, pmu_dev->inf->csr + PMU_PMOVSR);
+		writel(pmovsr, csr + PMU_PMOVSCLR);
 
 	for (idx = 0; idx < PMU_MAX_COUNTERS; idx++) {
 		struct perf_event *event = pmu_dev->pmu_counter_event[idx];
@@ -850,6 +1246,7 @@ static void _xgene_pmu_isr(int irq, struct xgene_pmu_dev *pmu_dev)
 
 static irqreturn_t xgene_pmu_isr(int irq, void *dev_id)
 {
+	u32 intr_mcu, intr_mcb, intr_l3c, intr_iob;
 	struct xgene_pmu_dev_ctx *ctx;
 	struct xgene_pmu *xgene_pmu = dev_id;
 	unsigned long flags;
@@ -859,22 +1256,33 @@ static irqreturn_t xgene_pmu_isr(int irq, void *dev_id)
 
 	/* Get Interrupt PMU source */
 	val = readl(xgene_pmu->pcppmu_csr + PCPPMU_INTSTATUS_REG);
-	if (val & PCPPMU_INT_MCU) {
+	if (xgene_pmu->version == PCP_PMU_V3) {
+		intr_mcu = PCPPMU_V3_INT_MCU;
+		intr_mcb = PCPPMU_V3_INT_MCB;
+		intr_l3c = PCPPMU_V3_INT_L3C;
+		intr_iob = PCPPMU_V3_INT_IOB;
+	} else {
+		intr_mcu = PCPPMU_INT_MCU;
+		intr_mcb = PCPPMU_INT_MCB;
+		intr_l3c = PCPPMU_INT_L3C;
+		intr_iob = PCPPMU_INT_IOB;
+	}
+	if (val & intr_mcu) {
 		list_for_each_entry(ctx, &xgene_pmu->mcpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
 	}
-	if (val & PCPPMU_INT_MCB) {
+	if (val & intr_mcb) {
 		list_for_each_entry(ctx, &xgene_pmu->mcbpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
 	}
-	if (val & PCPPMU_INT_L3C) {
+	if (val & intr_l3c) {
 		list_for_each_entry(ctx, &xgene_pmu->l3cpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
 	}
-	if (val & PCPPMU_INT_IOB) {
+	if (val & intr_iob) {
 		list_for_each_entry(ctx, &xgene_pmu->iobpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
@@ -885,12 +1293,14 @@ static irqreturn_t xgene_pmu_isr(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
+static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
 					 struct platform_device *pdev)
 {
 	void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
 	struct resource *res;
 	unsigned int reg;
+	u32 mcb0routing;
+	u32 mcb1routing;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
 	csw_csr = devm_ioremap_resource(&pdev->dev, res);
@@ -899,41 +1309,72 @@ static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
 		return PTR_ERR(csw_csr);
 	}
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
-	mcba_csr = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(mcba_csr)) {
-		dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
-		return PTR_ERR(mcba_csr);
-	}
-
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
-	mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(mcbb_csr)) {
-		dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
-		return PTR_ERR(mcbb_csr);
-	}
-
 	reg = readl(csw_csr + CSW_CSWCR);
-	if (reg & CSW_CSWCR_DUALMCB_MASK) {
-		/* Dual MCB active */
-		xgene_pmu->mcb_active_mask = 0x3;
-		/* Probe all active MC(s) */
-		reg = readl(mcbb_csr + CSW_CSWCR);
-		xgene_pmu->mc_active_mask =
-			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
+	if (xgene_pmu->version == PCP_PMU_V3) {
+		mcb0routing = CSW_CSWCR_MCB0_ROUTING(reg);
+		mcb1routing = CSW_CSWCR_MCB0_ROUTING(reg);
+		if (reg & CSW_CSWCR_DUALMCB_MASK) {
+			/* Dual MCB active */
+			xgene_pmu->mcb_active_mask = 0x3;
+			/* Probe all active L3C(s), maximum is 8 */
+			xgene_pmu->l3c_active_mask = 0xFF;
+			/* Probe all active MC(s), maximum is 8 */
+			if ((mcb0routing == 0x2) && (mcb1routing == 0x2))
+				xgene_pmu->mc_active_mask = 0xFF;
+			else if ((mcb0routing == 0x1) && (mcb1routing == 0x1))
+				xgene_pmu->mc_active_mask =  0x33;
+			else
+				xgene_pmu->mc_active_mask =  0x11;
+
+		} else {
+			/* Single MCB active */
+			xgene_pmu->mcb_active_mask = 0x1;
+			/* Probe all active L3C(s), maximum is 4 */
+			xgene_pmu->l3c_active_mask = 0x0F;
+			/* Probe all active MC(s), maximum is 4 */
+			if ((mcb0routing == 0x2) && (mcb1routing == 0x0))
+				xgene_pmu->mc_active_mask = 0x0F;
+			else if ((mcb0routing == 0x1) && (mcb1routing == 0x0))
+				xgene_pmu->mc_active_mask =  0x03;
+			else
+				xgene_pmu->mc_active_mask =  0x01;
+		}
 	} else {
-		/* Single MCB active */
-		xgene_pmu->mcb_active_mask = 0x1;
-		/* Probe all active MC(s) */
-		reg = readl(mcba_csr + CSW_CSWCR);
-		xgene_pmu->mc_active_mask =
-			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
-	}
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
+		mcba_csr = devm_ioremap_resource(&pdev->dev, res);
+		if (IS_ERR(mcba_csr)) {
+			dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
+			return PTR_ERR(mcba_csr);
+		}
+
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
+		mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
+		if (IS_ERR(mcbb_csr)) {
+			dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
+			return PTR_ERR(mcbb_csr);
+		}
 
+		xgene_pmu->l3c_active_mask = 0x1;
+		if (reg & CSW_CSWCR_DUALMCB_MASK) {
+			/* Dual MCB active */
+			xgene_pmu->mcb_active_mask = 0x3;
+			/* Probe all active MC(s) */
+			reg = readl(mcbb_csr + CSW_CSWCR);
+			xgene_pmu->mc_active_mask =
+				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
+		} else {
+			/* Single MCB active */
+			xgene_pmu->mcb_active_mask = 0x1;
+			/* Probe all active MC(s) */
+			reg = readl(mcba_csr + CSW_CSWCR);
+			xgene_pmu->mc_active_mask =
+				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
+		}
+	}
 	return 0;
 }
 
-static int fdt_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
+static int fdt_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
 					struct platform_device *pdev)
 {
 	struct regmap *csw_map, *mcba_map, *mcbb_map;
@@ -958,6 +1399,7 @@ static int fdt_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
 		return PTR_ERR(mcbb_map);
 	}
 
+	xgene_pmu->l3c_active_mask = 0x1;
 	if (regmap_read(csw_map, CSW_CSWCR, &reg))
 		return -EINVAL;
 
@@ -986,8 +1428,8 @@ static int xgene_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
 					  struct platform_device *pdev)
 {
 	if (has_acpi_companion(&pdev->dev))
-		return acpi_pmu_probe_active_mcb_mcu(xgene_pmu, pdev);
-	return fdt_pmu_probe_active_mcb_mcu(xgene_pmu, pdev);
+		return acpi_pmu_probe_active_mcb_mcu_l3c(xgene_pmu, pdev);
+	return fdt_pmu_probe_active_mcb_mcu_l3c(xgene_pmu, pdev);
 }
 
 static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
@@ -997,6 +1439,8 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
 		return devm_kasprintf(dev, GFP_KERNEL, "l3c%d", id);
 	case PMU_TYPE_IOB:
 		return devm_kasprintf(dev, GFP_KERNEL, "iob%d", id);
+	case PMU_TYPE_IOB_SLOW:
+		return devm_kasprintf(dev, GFP_KERNEL, "iob-slow%d", id);
 	case PMU_TYPE_MCB:
 		return devm_kasprintf(dev, GFP_KERNEL, "mcb%d", id);
 	case PMU_TYPE_MC:
@@ -1080,6 +1524,11 @@ xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu,
 	{"APMC0D5E", PMU_TYPE_IOB},
 	{"APMC0D5F", PMU_TYPE_MCB},
 	{"APMC0D60", PMU_TYPE_MC},
+	{"APMC0D84", PMU_TYPE_L3C},
+	{"APMC0D85", PMU_TYPE_IOB},
+	{"APMC0D86", PMU_TYPE_IOB_SLOW},
+	{"APMC0D87", PMU_TYPE_MCB},
+	{"APMC0D88", PMU_TYPE_MC},
 	{},
 };
 
@@ -1134,6 +1583,9 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
 	case PMU_TYPE_IOB:
 		list_add(&ctx->next, &xgene_pmu->iobpmus);
 		break;
+	case PMU_TYPE_IOB_SLOW:
+		list_add(&ctx->next, &xgene_pmu->iobpmus);
+		break;
 	case PMU_TYPE_MCB:
 		list_add(&ctx->next, &xgene_pmu->mcbpmus);
 		break;
@@ -1255,6 +1707,9 @@ static int fdt_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 		case PMU_TYPE_IOB:
 			list_add(&ctx->next, &xgene_pmu->iobpmus);
 			break;
+		case PMU_TYPE_IOB_SLOW:
+			list_add(&ctx->next, &xgene_pmu->iobpmus);
+			break;
 		case PMU_TYPE_MCB:
 			list_add(&ctx->next, &xgene_pmu->mcbpmus);
 			break;
@@ -1300,6 +1755,23 @@ static int xgene_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 	.stop_counters = xgene_pmu_stop_counters,
 };
 
+const struct xgene_pmu_ops xgene_pmu_v3_ops = {
+	.mask_int = xgene_pmu_v3_mask_int,
+	.unmask_int = xgene_pmu_v3_unmask_int,
+	.read_counter = xgene_pmu_read_counter64,
+	.write_counter = xgene_pmu_write_counter64,
+	.write_evttype = xgene_pmu_write_evttype,
+	.write_agentmsk = xgene_pmu_v3_write_agentmsk,
+	.write_agent1msk = xgene_pmu_v3_write_agent1msk,
+	.enable_counter = xgene_pmu_enable_counter,
+	.disable_counter = xgene_pmu_disable_counter,
+	.enable_counter_int = xgene_pmu_enable_counter_int,
+	.disable_counter_int = xgene_pmu_disable_counter_int,
+	.reset_counters = xgene_pmu_reset_counters,
+	.start_counters = xgene_pmu_start_counters,
+	.stop_counters = xgene_pmu_stop_counters,
+};
+
 static const struct of_device_id xgene_pmu_of_match[] = {
 	{ .compatible	= "apm,xgene-pmu",	.data = &xgene_pmu_data },
 	{ .compatible	= "apm,xgene-pmu-v2",	.data = &xgene_pmu_v2_data },
@@ -1310,6 +1782,7 @@ static int xgene_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 static const struct acpi_device_id xgene_pmu_acpi_match[] = {
 	{"APMC0D5B", PCP_PMU_V1},
 	{"APMC0D5C", PCP_PMU_V2},
+	{"APMC0D83", PCP_PMU_V3},
 	{},
 };
 MODULE_DEVICE_TABLE(acpi, xgene_pmu_acpi_match);
@@ -1349,7 +1822,10 @@ static int xgene_pmu_probe(struct platform_device *pdev)
 	if (version < 0)
 		return -ENODEV;
 
-	xgene_pmu->ops = &xgene_pmu_ops;
+	if (version != PCP_PMU_V3)
+		xgene_pmu->ops = &xgene_pmu_ops;
+	else
+		xgene_pmu->ops = &xgene_pmu_v3_ops;
 
 	INIT_LIST_HEAD(&xgene_pmu->l3cpmus);
 	INIT_LIST_HEAD(&xgene_pmu->iobpmus);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-04-03 16:47   ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-03 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
Unit version 3.

It can support up to
 - 2 IOB PMU instances
 - 8 L3C PMU instances
 - 2 MCB PMU instances
 - 8 MCU PMU instances

Signed-off-by: Hoan Tran <hotran@apm.com>
---
 drivers/perf/xgene_pmu.c | 572 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 524 insertions(+), 48 deletions(-)

diff --git a/drivers/perf/xgene_pmu.c b/drivers/perf/xgene_pmu.c
index f34fc78..a72814d 100644
--- a/drivers/perf/xgene_pmu.c
+++ b/drivers/perf/xgene_pmu.c
@@ -37,6 +37,8 @@
 
 #define CSW_CSWCR                       0x0000
 #define  CSW_CSWCR_DUALMCB_MASK         BIT(0)
+#define  CSW_CSWCR_MCB0_ROUTING(x)	(((x) & 0x0C) >> 2)
+#define  CSW_CSWCR_MCB1_ROUTING(x)	(((x) & 0x30) >> 4)
 #define MCBADDRMR                       0x0000
 #define  MCBADDRMR_DUALMCU_MODE_MASK    BIT(2)
 
@@ -50,8 +52,17 @@
 #define  PCPPMU_INT_L3C		BIT(2)
 #define  PCPPMU_INT_IOB		BIT(3)
 
+#define  PCPPMU_V3_INTMASK	0x00FF33FF
+#define  PCPPMU_V3_INTENMASK	0xFFFFFFFF
+#define  PCPPMU_V3_INTCLRMASK	0xFF00CC00
+#define  PCPPMU_V3_INT_MCU	0x000000FF
+#define  PCPPMU_V3_INT_MCB	0x00000300
+#define  PCPPMU_V3_INT_L3C	0x00FF0000
+#define  PCPPMU_V3_INT_IOB	0x00003000
+
 #define PMU_MAX_COUNTERS	4
-#define PMU_CNT_MAX_PERIOD	0x100000000ULL
+#define PMU_CNT_MAX_PERIOD	0xFFFFFFFFULL
+#define PMU_V3_CNT_MAX_PERIOD	0xFFFFFFFFFFFFFFFFULL
 #define PMU_OVERFLOW_MASK	0xF
 #define PMU_PMCR_E		BIT(0)
 #define PMU_PMCR_P		BIT(1)
@@ -73,6 +84,10 @@
 #define PMU_PMOVSR		0xC80
 #define PMU_PMCR		0xE04
 
+/* PMU registers for V3 */
+#define PMU_PMOVSCLR		0xC80
+#define PMU_PMOVSSET		0xCC0
+
 #define to_pmu_dev(p)     container_of(p, struct xgene_pmu_dev, pmu)
 #define GET_CNTR(ev)      (ev->hw.idx)
 #define GET_EVENTID(ev)   (ev->hw.config & 0xFFULL)
@@ -119,6 +134,7 @@ struct xgene_pmu {
 	void __iomem *pcppmu_csr;
 	u32 mcb_active_mask;
 	u32 mc_active_mask;
+	u32 l3c_active_mask;
 	cpumask_t cpu;
 	raw_spinlock_t lock;
 	const struct xgene_pmu_ops *ops;
@@ -143,11 +159,13 @@ struct xgene_pmu_data {
 enum xgene_pmu_version {
 	PCP_PMU_V1 = 1,
 	PCP_PMU_V2,
+	PCP_PMU_V3,
 };
 
 enum xgene_pmu_dev_type {
 	PMU_TYPE_L3C = 0,
 	PMU_TYPE_IOB,
+	PMU_TYPE_IOB_SLOW,
 	PMU_TYPE_MCB,
 	PMU_TYPE_MC,
 };
@@ -213,6 +231,56 @@ static ssize_t xgene_pmu_format_show(struct device *dev,
 	.attrs = mc_pmu_format_attrs,
 };
 
+static struct attribute *l3c_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(l3c_eventid, "config:0-39"),
+	NULL,
+};
+
+static struct attribute *iob_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(iob_eventid, "config:0-47"),
+	NULL,
+};
+
+static struct attribute *iob_slow_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(iob_slow_eventid, "config:0-16"),
+	NULL,
+};
+
+static struct attribute *mcb_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(mcb_eventid, "config:0-35"),
+	NULL,
+};
+
+static struct attribute *mc_pmu_v3_format_attrs[] = {
+	XGENE_PMU_FORMAT_ATTR(mc_eventid, "config:0-44"),
+	NULL,
+};
+
+static const struct attribute_group l3c_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = l3c_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group iob_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = iob_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group iob_slow_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = iob_slow_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group mcb_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = mcb_pmu_v3_format_attrs,
+};
+
+static const struct attribute_group mc_pmu_v3_format_attr_group = {
+	.name = "format",
+	.attrs = mc_pmu_v3_format_attrs,
+};
+
 /*
  * sysfs event attributes
  */
@@ -329,6 +397,219 @@ static ssize_t xgene_pmu_event_show(struct device *dev,
 	.attrs = mc_pmu_events_attrs,
 };
 
+static struct attribute *l3c_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(read-hit,				0x01),
+	XGENE_PMU_EVENT_ATTR(read-miss,				0x02),
+	XGENE_PMU_EVENT_ATTR(index-flush-eviction,		0x03),
+	XGENE_PMU_EVENT_ATTR(write-caused-replacement,		0x04),
+	XGENE_PMU_EVENT_ATTR(write-not-caused-replacement,	0x05),
+	XGENE_PMU_EVENT_ATTR(clean-eviction,			0x06),
+	XGENE_PMU_EVENT_ATTR(dirty-eviction,			0x07),
+	XGENE_PMU_EVENT_ATTR(reads,				0x08),
+	XGENE_PMU_EVENT_ATTR(writes,				0x09),
+	XGENE_PMU_EVENT_ATTR(requests,				0x0a),
+	XGENE_PMU_EVENT_ATTR(tq-bank-conflict-issue-stall,	0x0b),
+	XGENE_PMU_EVENT_ATTR(tq-full,				0x0c),
+	XGENE_PMU_EVENT_ATTR(ackq-full,				0x0d),
+	XGENE_PMU_EVENT_ATTR(wdb-full,				0x0e),
+	XGENE_PMU_EVENT_ATTR(odb-full,				0x10),
+	XGENE_PMU_EVENT_ATTR(wbq-full,				0x11),
+	XGENE_PMU_EVENT_ATTR(input-req-async-fifo-stall,	0x12),
+	XGENE_PMU_EVENT_ATTR(output-req-async-fifo-stall,	0x13),
+	XGENE_PMU_EVENT_ATTR(output-data-async-fifo-stall,	0x14),
+	XGENE_PMU_EVENT_ATTR(total-insertions,			0x15),
+	XGENE_PMU_EVENT_ATTR(sip-insertions-r-set,		0x16),
+	XGENE_PMU_EVENT_ATTR(sip-insertions-r-clear,		0x17),
+	XGENE_PMU_EVENT_ATTR(dip-insertions-r-set,		0x18),
+	XGENE_PMU_EVENT_ATTR(dip-insertions-r-clear,		0x19),
+	XGENE_PMU_EVENT_ATTR(dip-insertions-force-r-set,	0x1a),
+	XGENE_PMU_EVENT_ATTR(egressions,			0x1b),
+	XGENE_PMU_EVENT_ATTR(replacements,			0x1c),
+	XGENE_PMU_EVENT_ATTR(old-replacements,			0x1d),
+	XGENE_PMU_EVENT_ATTR(young-replacements,		0x1e),
+	XGENE_PMU_EVENT_ATTR(r-set-replacements,		0x1f),
+	XGENE_PMU_EVENT_ATTR(r-clear-replacements,		0x20),
+	XGENE_PMU_EVENT_ATTR(old-r-replacements,		0x21),
+	XGENE_PMU_EVENT_ATTR(old-nr-replacements,		0x22),
+	XGENE_PMU_EVENT_ATTR(young-r-replacements,		0x23),
+	XGENE_PMU_EVENT_ATTR(young-nr-replacements,		0x24),
+	XGENE_PMU_EVENT_ATTR(bloomfilter-clearings,		0x25),
+	XGENE_PMU_EVENT_ATTR(generation-flips,			0x26),
+	XGENE_PMU_EVENT_ATTR(vcc-droop-detected,		0x27),
+	NULL,
+};
+
+static struct attribute *iob_fast_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-all,		0x01),
+	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-rd,		0x02),
+	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-wr,		0x03),
+	XGENE_PMU_EVENT_ATTR(PA-all-CP-req,			0x04),
+	XGENE_PMU_EVENT_ATTR(PA-CP-blk-req,			0x05),
+	XGENE_PMU_EVENT_ATTR(PA-CP-ptl-req,			0x06),
+	XGENE_PMU_EVENT_ATTR(PA-CP-rd-req,			0x07),
+	XGENE_PMU_EVENT_ATTR(PA-CP-wr-req,			0x08),
+	XGENE_PMU_EVENT_ATTR(BA-all-req,			0x09),
+	XGENE_PMU_EVENT_ATTR(BA-rd-req,				0x0a),
+	XGENE_PMU_EVENT_ATTR(BA-wr-req,				0x0b),
+	XGENE_PMU_EVENT_ATTR(PA-rd-shared-req-issued,		0x10),
+	XGENE_PMU_EVENT_ATTR(PA-rd-exclusive-req-issued,	0x11),
+	XGENE_PMU_EVENT_ATTR(PA-wr-invalidate-req-issued-stashable, 0x12),
+	XGENE_PMU_EVENT_ATTR(PA-wr-invalidate-req-issued-nonstashable, 0x13),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-req-issued-stashable,	0x14),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-req-issued-nonstashable, 0x15),
+	XGENE_PMU_EVENT_ATTR(PA-ptl-wr-req,			0x16),
+	XGENE_PMU_EVENT_ATTR(PA-ptl-rd-req,			0x17),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-clean-data,		0x18),
+	XGENE_PMU_EVENT_ATTR(PA-wr-back-cancelled-on-SS,	0x1b),
+	XGENE_PMU_EVENT_ATTR(PA-barrier-occurrence,		0x1c),
+	XGENE_PMU_EVENT_ATTR(PA-barrier-cycles,			0x1d),
+	XGENE_PMU_EVENT_ATTR(PA-total-CP-snoops,		0x20),
+	XGENE_PMU_EVENT_ATTR(PA-rd-shared-snoop,		0x21),
+	XGENE_PMU_EVENT_ATTR(PA-rd-shared-snoop-hit,		0x22),
+	XGENE_PMU_EVENT_ATTR(PA-rd-exclusive-snoop,		0x23),
+	XGENE_PMU_EVENT_ATTR(PA-rd-exclusive-snoop-hit,		0x24),
+	XGENE_PMU_EVENT_ATTR(PA-rd-wr-invalid-snoop,		0x25),
+	XGENE_PMU_EVENT_ATTR(PA-rd-wr-invalid-snoop-hit,	0x26),
+	XGENE_PMU_EVENT_ATTR(PA-req-buffer-full,		0x28),
+	XGENE_PMU_EVENT_ATTR(Cswlf-outbound-req-fifo-full,	0x29),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-snoop-fifo-backpressure, 0x2a),
+	XGENE_PMU_EVENT_ATTR(Cswlf-outbound-lack-fifo-full,	0x2b),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-gack-fifo-backpressure, 0x2c),
+	XGENE_PMU_EVENT_ATTR(Cswlf-outbound-data-fifo-full,	0x2d),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-data-fifo-backpressure, 0x2e),
+	XGENE_PMU_EVENT_ATTR(Cswlf-inbound-req-backpressure,	0x2f),
+	NULL,
+};
+
+static struct attribute *iob_slow_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(PA-AXI0-rd-req,			0x01),
+	XGENE_PMU_EVENT_ATTR(PA-AXI0-wr-req,			0x02),
+	XGENE_PMU_EVENT_ATTR(PA-AXI1-rd-req,			0x03),
+	XGENE_PMU_EVENT_ATTR(PA-AXI1-wr-req,			0x04),
+	XGENE_PMU_EVENT_ATTR(BA-all-AXI-req,			0x07),
+	XGENE_PMU_EVENT_ATTR(BA-AXI-rd-req,			0x08),
+	XGENE_PMU_EVENT_ATTR(BA-AXI-wr-req,			0x09),
+	XGENE_PMU_EVENT_ATTR(BA-free-list-empty,		0x10),
+	NULL,
+};
+
+static struct attribute *mcb_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(req-receive,			0x01),
+	XGENE_PMU_EVENT_ATTR(rd-req-recv,			0x02),
+	XGENE_PMU_EVENT_ATTR(rd-req-recv-2,			0x03),
+	XGENE_PMU_EVENT_ATTR(wr-req-recv,			0x04),
+	XGENE_PMU_EVENT_ATTR(wr-req-recv-2,			0x05),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-mcu,		0x06),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-mcu-2,		0x07),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-spec-mcu,		0x08),
+	XGENE_PMU_EVENT_ATTR(rd-req-sent-to-spec-mcu-2,		0x09),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-recv-for-rd-sent-to-spec-mcu, 0x0a),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-go-recv-for-rd-sent-to-spec-mcu, 0x0b),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-nogo-recv-for-rd-sent-to-spec-mcu, 0x0c),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-go-recv-any-rd-req,	0x0d),
+	XGENE_PMU_EVENT_ATTR(glbl-ack-go-recv-any-rd-req-2,	0x0e),
+	XGENE_PMU_EVENT_ATTR(wr-req-sent-to-mcu,		0x0f),
+	XGENE_PMU_EVENT_ATTR(gack-recv,				0x10),
+	XGENE_PMU_EVENT_ATTR(rd-gack-recv,			0x11),
+	XGENE_PMU_EVENT_ATTR(wr-gack-recv,			0x12),
+	XGENE_PMU_EVENT_ATTR(cancel-rd-gack,			0x13),
+	XGENE_PMU_EVENT_ATTR(cancel-wr-gack,			0x14),
+	XGENE_PMU_EVENT_ATTR(mcb-csw-req-stall,			0x15),
+	XGENE_PMU_EVENT_ATTR(mcu-req-intf-blocked,		0x16),
+	XGENE_PMU_EVENT_ATTR(mcb-mcu-rd-intf-stall,		0x17),
+	XGENE_PMU_EVENT_ATTR(csw-rd-intf-blocked,		0x18),
+	XGENE_PMU_EVENT_ATTR(csw-local-ack-intf-blocked,	0x19),
+	XGENE_PMU_EVENT_ATTR(mcu-req-table-full,		0x1a),
+	XGENE_PMU_EVENT_ATTR(mcu-stat-table-full,		0x1b),
+	XGENE_PMU_EVENT_ATTR(mcu-wr-table-full,			0x1c),
+	XGENE_PMU_EVENT_ATTR(mcu-rdreceipt-resp,		0x1d),
+	XGENE_PMU_EVENT_ATTR(mcu-wrcomplete-resp,		0x1e),
+	XGENE_PMU_EVENT_ATTR(mcu-retryack-resp,			0x1f),
+	XGENE_PMU_EVENT_ATTR(mcu-pcrdgrant-resp,		0x20),
+	XGENE_PMU_EVENT_ATTR(mcu-req-from-lastload,		0x21),
+	XGENE_PMU_EVENT_ATTR(mcu-req-from-bypass,		0x22),
+	XGENE_PMU_EVENT_ATTR(volt-droop-detect,			0x23),
+	NULL,
+};
+
+static struct attribute *mc_pmu_v3_events_attrs[] = {
+	XGENE_PMU_EVENT_ATTR(cycle-count,			0x00),
+	XGENE_PMU_EVENT_ATTR(pmu-act-sent,			0x01),
+	XGENE_PMU_EVENT_ATTR(pmu-pre-sent,			0x02),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-sent,			0x03),
+	XGENE_PMU_EVENT_ATTR(pmu-rda-sent,			0x04),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-sent,			0x05),
+	XGENE_PMU_EVENT_ATTR(pmu-wra-sent,			0x06),
+	XGENE_PMU_EVENT_ATTR(pmu-pd-entry-vld,			0x07),
+	XGENE_PMU_EVENT_ATTR(pmu-sref-entry-vld,		0x08),
+	XGENE_PMU_EVENT_ATTR(pmu-prea-sent,			0x09),
+	XGENE_PMU_EVENT_ATTR(pmu-ref-sent,			0x0a),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-rda-sent,			0x0b),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-wra-sent,			0x0c),
+	XGENE_PMU_EVENT_ATTR(pmu-raw-hazard,			0x0d),
+	XGENE_PMU_EVENT_ATTR(pmu-war-hazard,			0x0e),
+	XGENE_PMU_EVENT_ATTR(pmu-waw-hazard,			0x0f),
+	XGENE_PMU_EVENT_ATTR(pmu-rar-hazard,			0x10),
+	XGENE_PMU_EVENT_ATTR(pmu-raw-war-waw-hazard,		0x11),
+	XGENE_PMU_EVENT_ATTR(pmu-hprd-lprd-wr-req-vld,		0x12),
+	XGENE_PMU_EVENT_ATTR(pmu-lprd-req-vld,			0x13),
+	XGENE_PMU_EVENT_ATTR(pmu-hprd-req-vld,			0x14),
+	XGENE_PMU_EVENT_ATTR(pmu-hprd-lprd-req-vld,		0x15),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-req-vld,			0x16),
+	XGENE_PMU_EVENT_ATTR(pmu-partial-wr-req-vld,		0x17),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-retry,			0x18),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-retry,			0x19),
+	XGENE_PMU_EVENT_ATTR(pmu-retry-gnt,			0x1a),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-change,			0x1b),
+	XGENE_PMU_EVENT_ATTR(pmu-dir-change,			0x1c),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-dir-change,		0x1d),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-active,			0x1e),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-idle,			0x1f),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-pd,			0x20),
+	XGENE_PMU_EVENT_ATTR(pmu-rank-sref,			0x21),
+	XGENE_PMU_EVENT_ATTR(pmu-queue-fill-gt-thresh,		0x22),
+	XGENE_PMU_EVENT_ATTR(pmu-queue-rds-gt-thresh,		0x23),
+	XGENE_PMU_EVENT_ATTR(pmu-queue-wrs-gt-thresh,		0x24),
+	XGENE_PMU_EVENT_ATTR(pmu-phy-updt-complt,		0x25),
+	XGENE_PMU_EVENT_ATTR(pmu-tz-fail,			0x26),
+	XGENE_PMU_EVENT_ATTR(pmu-dram-errc,			0x27),
+	XGENE_PMU_EVENT_ATTR(pmu-dram-errd,			0x28),
+	XGENE_PMU_EVENT_ATTR(pmu-rd-enq,			0x29),
+	XGENE_PMU_EVENT_ATTR(pmu-wr-enq,			0x2a),
+	XGENE_PMU_EVENT_ATTR(pmu-tmac-limit-reached,		0x2b),
+	XGENE_PMU_EVENT_ATTR(pmu-tmaw-tracker-full,		0x2c),
+	NULL,
+};
+
+static const struct attribute_group l3c_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = l3c_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group iob_fast_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = iob_fast_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group iob_slow_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = iob_slow_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group mcb_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = mcb_pmu_v3_events_attrs,
+};
+
+static const struct attribute_group mc_pmu_v3_events_attr_group = {
+	.name = "events",
+	.attrs = mc_pmu_v3_events_attrs,
+};
+
 /*
  * sysfs cpumask attributes
  */
@@ -352,7 +633,7 @@ static ssize_t xgene_pmu_cpumask_show(struct device *dev,
 };
 
 /*
- * Per PMU device attribute groups
+ * Per PMU device attribute groups of PMU v1 and v2
  */
 static const struct attribute_group *l3c_pmu_attr_groups[] = {
 	&l3c_pmu_format_attr_group,
@@ -382,6 +663,44 @@ static ssize_t xgene_pmu_cpumask_show(struct device *dev,
 	NULL
 };
 
+/*
+ * Per PMU device attribute groups of PMU v3
+ */
+static const struct attribute_group *l3c_pmu_v3_attr_groups[] = {
+	&l3c_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&l3c_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *iob_fast_pmu_v3_attr_groups[] = {
+	&iob_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&iob_fast_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *iob_slow_pmu_v3_attr_groups[] = {
+	&iob_slow_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&iob_slow_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *mcb_pmu_v3_attr_groups[] = {
+	&mcb_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&mcb_pmu_v3_events_attr_group,
+	NULL
+};
+
+static const struct attribute_group *mc_pmu_v3_attr_groups[] = {
+	&mc_pmu_v3_format_attr_group,
+	&pmu_cpumask_attr_group,
+	&mc_pmu_v3_events_attr_group,
+	NULL
+};
+
 static int get_next_avail_cntr(struct xgene_pmu_dev *pmu_dev)
 {
 	int cntr;
@@ -405,17 +724,47 @@ static inline void xgene_pmu_mask_int(struct xgene_pmu *xgene_pmu)
 	writel(PCPPMU_INTENMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
 }
 
+static inline void xgene_pmu_v3_mask_int(struct xgene_pmu *xgene_pmu)
+{
+	writel(PCPPMU_V3_INTENMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
+}
+
 static inline void xgene_pmu_unmask_int(struct xgene_pmu *xgene_pmu)
 {
 	writel(PCPPMU_INTCLRMASK, xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
 }
 
+static inline void xgene_pmu_v3_unmask_int(struct xgene_pmu *xgene_pmu)
+{
+	writel(PCPPMU_V3_INTCLRMASK,
+	       xgene_pmu->pcppmu_csr + PCPPMU_INTMASK_REG);
+}
+
 static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
 					   int idx)
 {
 	return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
 }
 
+static inline u64 xgene_pmu_read_counter64(struct xgene_pmu_dev *pmu_dev,
+					   int idx)
+{
+	u32 lo, hi;
+
+	/*
+	 * v3 has 64-bit counter registers composed by 2 32-bit registers
+	 * This can be a problem if the counter increases and carries
+	 * out of bit [31] between 2 reads. The extra reads would help
+	 * to prevent this issue.
+	 */
+	do {
+		hi = xgene_pmu_read_counter32(pmu_dev, 2 * idx + 1);
+		lo = xgene_pmu_read_counter32(pmu_dev, 2 * idx);
+	} while (hi != xgene_pmu_read_counter32(pmu_dev, 2 * idx + 1));
+
+	return (((u64)hi << 32) | lo);
+}
+
 static inline void
 xgene_pmu_write_counter32(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
 {
@@ -423,6 +772,19 @@ static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
 }
 
 static inline void
+xgene_pmu_write_counter64(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
+{
+	u32 cnt_lo, cnt_hi;
+
+	cnt_lo = val & 0xFFFFFFFF;
+	cnt_hi = val >> 32;
+
+	/* v3 has 64-bit counter registers composed by 2 32-bit registers */
+	xgene_pmu_write_counter32(pmu_dev, 2 * idx, val);
+	xgene_pmu_write_counter32(pmu_dev, 2 * idx + 1, val >> 32);
+}
+
+static inline void
 xgene_pmu_write_evttype(struct xgene_pmu_dev *pmu_dev, int idx, u32 val)
 {
 	writel(val, pmu_dev->inf->csr + PMU_PMEVTYPER0 + (4 * idx));
@@ -435,12 +797,18 @@ static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
 }
 
 static inline void
+xgene_pmu_v3_write_agentmsk(struct xgene_pmu_dev *pmu_dev, u32 val) { }
+
+static inline void
 xgene_pmu_write_agent1msk(struct xgene_pmu_dev *pmu_dev, u32 val)
 {
 	writel(val, pmu_dev->inf->csr + PMU_PMAMR1);
 }
 
 static inline void
+xgene_pmu_v3_write_agent1msk(struct xgene_pmu_dev *pmu_dev, u32 val) { }
+
+static inline void
 xgene_pmu_enable_counter(struct xgene_pmu_dev *pmu_dev, int idx)
 {
 	u32 val;
@@ -621,7 +989,7 @@ static void xgene_perf_event_set_period(struct perf_event *event)
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
 	struct hw_perf_event *hw = &event->hw;
 	/*
-	 * The X-Gene PMU counters have a period of 2^32. To account for the
+	 * The X-Gene PMU counters have a period of 2^32 or more. To account for the
 	 * possiblity of extreme interrupt latency we program for a period of
 	 * half that. Hopefully we can handle the interrupt before another 2^31
 	 * events occur and the counter overtakes its previous value.
@@ -741,7 +1109,10 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 {
 	struct xgene_pmu *xgene_pmu;
 
-	pmu_dev->max_period = PMU_CNT_MAX_PERIOD - 1;
+	if (pmu_dev->parent->version == PCP_PMU_V3)
+		pmu_dev->max_period = PMU_V3_CNT_MAX_PERIOD;
+	else
+		pmu_dev->max_period = PMU_CNT_MAX_PERIOD;
 	/* First version PMU supports only single event counter */
 	xgene_pmu = pmu_dev->parent;
 	if (xgene_pmu->version == PCP_PMU_V1)
@@ -786,20 +1157,38 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 
 	switch (pmu->inf->type) {
 	case PMU_TYPE_L3C:
-		pmu->attr_groups = l3c_pmu_attr_groups;
+		if (!(xgene_pmu->l3c_active_mask & pmu->inf->enable_mask))
+			goto dev_err;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = l3c_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = l3c_pmu_attr_groups;
 		break;
 	case PMU_TYPE_IOB:
-		pmu->attr_groups = iob_pmu_attr_groups;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = iob_fast_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = iob_pmu_attr_groups;
+		break;
+	case PMU_TYPE_IOB_SLOW:
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = iob_slow_pmu_v3_attr_groups;
 		break;
 	case PMU_TYPE_MCB:
 		if (!(xgene_pmu->mcb_active_mask & pmu->inf->enable_mask))
 			goto dev_err;
-		pmu->attr_groups = mcb_pmu_attr_groups;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = mcb_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = mcb_pmu_attr_groups;
 		break;
 	case PMU_TYPE_MC:
 		if (!(xgene_pmu->mc_active_mask & pmu->inf->enable_mask))
 			goto dev_err;
-		pmu->attr_groups = mc_pmu_attr_groups;
+		if (xgene_pmu->version == PCP_PMU_V3)
+			pmu->attr_groups = mc_pmu_v3_attr_groups;
+		else
+			pmu->attr_groups = mc_pmu_attr_groups;
 		break;
 	default:
 		return -EINVAL;
@@ -823,18 +1212,25 @@ static int xgene_init_perf(struct xgene_pmu_dev *pmu_dev, char *name)
 static void _xgene_pmu_isr(int irq, struct xgene_pmu_dev *pmu_dev)
 {
 	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
+	void __iomem *csr = pmu_dev->inf->csr;
 	u32 pmovsr;
 	int idx;
 
-	pmovsr = readl(pmu_dev->inf->csr + PMU_PMOVSR) & PMU_OVERFLOW_MASK;
+	if (xgene_pmu->version == PCP_PMU_V3)
+		pmovsr = readl(csr + PMU_PMOVSSET) & PMU_OVERFLOW_MASK;
+	else
+		pmovsr = readl(csr + PMU_PMOVSR) & PMU_OVERFLOW_MASK;
+
 	if (!pmovsr)
 		return;
 
 	/* Clear interrupt flag */
 	if (xgene_pmu->version == PCP_PMU_V1)
-		writel(0x0, pmu_dev->inf->csr + PMU_PMOVSR);
+		writel(0x0, csr + PMU_PMOVSR);
+	else if (xgene_pmu->version == PCP_PMU_V2)
+		writel(pmovsr, csr + PMU_PMOVSR);
 	else
-		writel(pmovsr, pmu_dev->inf->csr + PMU_PMOVSR);
+		writel(pmovsr, csr + PMU_PMOVSCLR);
 
 	for (idx = 0; idx < PMU_MAX_COUNTERS; idx++) {
 		struct perf_event *event = pmu_dev->pmu_counter_event[idx];
@@ -850,6 +1246,7 @@ static void _xgene_pmu_isr(int irq, struct xgene_pmu_dev *pmu_dev)
 
 static irqreturn_t xgene_pmu_isr(int irq, void *dev_id)
 {
+	u32 intr_mcu, intr_mcb, intr_l3c, intr_iob;
 	struct xgene_pmu_dev_ctx *ctx;
 	struct xgene_pmu *xgene_pmu = dev_id;
 	unsigned long flags;
@@ -859,22 +1256,33 @@ static irqreturn_t xgene_pmu_isr(int irq, void *dev_id)
 
 	/* Get Interrupt PMU source */
 	val = readl(xgene_pmu->pcppmu_csr + PCPPMU_INTSTATUS_REG);
-	if (val & PCPPMU_INT_MCU) {
+	if (xgene_pmu->version == PCP_PMU_V3) {
+		intr_mcu = PCPPMU_V3_INT_MCU;
+		intr_mcb = PCPPMU_V3_INT_MCB;
+		intr_l3c = PCPPMU_V3_INT_L3C;
+		intr_iob = PCPPMU_V3_INT_IOB;
+	} else {
+		intr_mcu = PCPPMU_INT_MCU;
+		intr_mcb = PCPPMU_INT_MCB;
+		intr_l3c = PCPPMU_INT_L3C;
+		intr_iob = PCPPMU_INT_IOB;
+	}
+	if (val & intr_mcu) {
 		list_for_each_entry(ctx, &xgene_pmu->mcpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
 	}
-	if (val & PCPPMU_INT_MCB) {
+	if (val & intr_mcb) {
 		list_for_each_entry(ctx, &xgene_pmu->mcbpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
 	}
-	if (val & PCPPMU_INT_L3C) {
+	if (val & intr_l3c) {
 		list_for_each_entry(ctx, &xgene_pmu->l3cpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
 	}
-	if (val & PCPPMU_INT_IOB) {
+	if (val & intr_iob) {
 		list_for_each_entry(ctx, &xgene_pmu->iobpmus, next) {
 			_xgene_pmu_isr(irq, ctx->pmu_dev);
 		}
@@ -885,12 +1293,14 @@ static irqreturn_t xgene_pmu_isr(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
+static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
 					 struct platform_device *pdev)
 {
 	void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
 	struct resource *res;
 	unsigned int reg;
+	u32 mcb0routing;
+	u32 mcb1routing;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
 	csw_csr = devm_ioremap_resource(&pdev->dev, res);
@@ -899,41 +1309,72 @@ static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
 		return PTR_ERR(csw_csr);
 	}
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
-	mcba_csr = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(mcba_csr)) {
-		dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
-		return PTR_ERR(mcba_csr);
-	}
-
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
-	mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(mcbb_csr)) {
-		dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
-		return PTR_ERR(mcbb_csr);
-	}
-
 	reg = readl(csw_csr + CSW_CSWCR);
-	if (reg & CSW_CSWCR_DUALMCB_MASK) {
-		/* Dual MCB active */
-		xgene_pmu->mcb_active_mask = 0x3;
-		/* Probe all active MC(s) */
-		reg = readl(mcbb_csr + CSW_CSWCR);
-		xgene_pmu->mc_active_mask =
-			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
+	if (xgene_pmu->version == PCP_PMU_V3) {
+		mcb0routing = CSW_CSWCR_MCB0_ROUTING(reg);
+		mcb1routing = CSW_CSWCR_MCB0_ROUTING(reg);
+		if (reg & CSW_CSWCR_DUALMCB_MASK) {
+			/* Dual MCB active */
+			xgene_pmu->mcb_active_mask = 0x3;
+			/* Probe all active L3C(s), maximum is 8 */
+			xgene_pmu->l3c_active_mask = 0xFF;
+			/* Probe all active MC(s), maximum is 8 */
+			if ((mcb0routing == 0x2) && (mcb1routing == 0x2))
+				xgene_pmu->mc_active_mask = 0xFF;
+			else if ((mcb0routing == 0x1) && (mcb1routing == 0x1))
+				xgene_pmu->mc_active_mask =  0x33;
+			else
+				xgene_pmu->mc_active_mask =  0x11;
+
+		} else {
+			/* Single MCB active */
+			xgene_pmu->mcb_active_mask = 0x1;
+			/* Probe all active L3C(s), maximum is 4 */
+			xgene_pmu->l3c_active_mask = 0x0F;
+			/* Probe all active MC(s), maximum is 4 */
+			if ((mcb0routing == 0x2) && (mcb1routing == 0x0))
+				xgene_pmu->mc_active_mask = 0x0F;
+			else if ((mcb0routing == 0x1) && (mcb1routing == 0x0))
+				xgene_pmu->mc_active_mask =  0x03;
+			else
+				xgene_pmu->mc_active_mask =  0x01;
+		}
 	} else {
-		/* Single MCB active */
-		xgene_pmu->mcb_active_mask = 0x1;
-		/* Probe all active MC(s) */
-		reg = readl(mcba_csr + CSW_CSWCR);
-		xgene_pmu->mc_active_mask =
-			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
-	}
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
+		mcba_csr = devm_ioremap_resource(&pdev->dev, res);
+		if (IS_ERR(mcba_csr)) {
+			dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
+			return PTR_ERR(mcba_csr);
+		}
+
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
+		mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
+		if (IS_ERR(mcbb_csr)) {
+			dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
+			return PTR_ERR(mcbb_csr);
+		}
 
+		xgene_pmu->l3c_active_mask = 0x1;
+		if (reg & CSW_CSWCR_DUALMCB_MASK) {
+			/* Dual MCB active */
+			xgene_pmu->mcb_active_mask = 0x3;
+			/* Probe all active MC(s) */
+			reg = readl(mcbb_csr + CSW_CSWCR);
+			xgene_pmu->mc_active_mask =
+				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
+		} else {
+			/* Single MCB active */
+			xgene_pmu->mcb_active_mask = 0x1;
+			/* Probe all active MC(s) */
+			reg = readl(mcba_csr + CSW_CSWCR);
+			xgene_pmu->mc_active_mask =
+				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
+		}
+	}
 	return 0;
 }
 
-static int fdt_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
+static int fdt_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
 					struct platform_device *pdev)
 {
 	struct regmap *csw_map, *mcba_map, *mcbb_map;
@@ -958,6 +1399,7 @@ static int fdt_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
 		return PTR_ERR(mcbb_map);
 	}
 
+	xgene_pmu->l3c_active_mask = 0x1;
 	if (regmap_read(csw_map, CSW_CSWCR, &reg))
 		return -EINVAL;
 
@@ -986,8 +1428,8 @@ static int xgene_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
 					  struct platform_device *pdev)
 {
 	if (has_acpi_companion(&pdev->dev))
-		return acpi_pmu_probe_active_mcb_mcu(xgene_pmu, pdev);
-	return fdt_pmu_probe_active_mcb_mcu(xgene_pmu, pdev);
+		return acpi_pmu_probe_active_mcb_mcu_l3c(xgene_pmu, pdev);
+	return fdt_pmu_probe_active_mcb_mcu_l3c(xgene_pmu, pdev);
 }
 
 static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
@@ -997,6 +1439,8 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
 		return devm_kasprintf(dev, GFP_KERNEL, "l3c%d", id);
 	case PMU_TYPE_IOB:
 		return devm_kasprintf(dev, GFP_KERNEL, "iob%d", id);
+	case PMU_TYPE_IOB_SLOW:
+		return devm_kasprintf(dev, GFP_KERNEL, "iob-slow%d", id);
 	case PMU_TYPE_MCB:
 		return devm_kasprintf(dev, GFP_KERNEL, "mcb%d", id);
 	case PMU_TYPE_MC:
@@ -1080,6 +1524,11 @@ xgene_pmu_dev_ctx *acpi_get_pmu_hw_inf(struct xgene_pmu *xgene_pmu,
 	{"APMC0D5E", PMU_TYPE_IOB},
 	{"APMC0D5F", PMU_TYPE_MCB},
 	{"APMC0D60", PMU_TYPE_MC},
+	{"APMC0D84", PMU_TYPE_L3C},
+	{"APMC0D85", PMU_TYPE_IOB},
+	{"APMC0D86", PMU_TYPE_IOB_SLOW},
+	{"APMC0D87", PMU_TYPE_MCB},
+	{"APMC0D88", PMU_TYPE_MC},
 	{},
 };
 
@@ -1134,6 +1583,9 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
 	case PMU_TYPE_IOB:
 		list_add(&ctx->next, &xgene_pmu->iobpmus);
 		break;
+	case PMU_TYPE_IOB_SLOW:
+		list_add(&ctx->next, &xgene_pmu->iobpmus);
+		break;
 	case PMU_TYPE_MCB:
 		list_add(&ctx->next, &xgene_pmu->mcbpmus);
 		break;
@@ -1255,6 +1707,9 @@ static int fdt_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 		case PMU_TYPE_IOB:
 			list_add(&ctx->next, &xgene_pmu->iobpmus);
 			break;
+		case PMU_TYPE_IOB_SLOW:
+			list_add(&ctx->next, &xgene_pmu->iobpmus);
+			break;
 		case PMU_TYPE_MCB:
 			list_add(&ctx->next, &xgene_pmu->mcbpmus);
 			break;
@@ -1300,6 +1755,23 @@ static int xgene_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 	.stop_counters = xgene_pmu_stop_counters,
 };
 
+const struct xgene_pmu_ops xgene_pmu_v3_ops = {
+	.mask_int = xgene_pmu_v3_mask_int,
+	.unmask_int = xgene_pmu_v3_unmask_int,
+	.read_counter = xgene_pmu_read_counter64,
+	.write_counter = xgene_pmu_write_counter64,
+	.write_evttype = xgene_pmu_write_evttype,
+	.write_agentmsk = xgene_pmu_v3_write_agentmsk,
+	.write_agent1msk = xgene_pmu_v3_write_agent1msk,
+	.enable_counter = xgene_pmu_enable_counter,
+	.disable_counter = xgene_pmu_disable_counter,
+	.enable_counter_int = xgene_pmu_enable_counter_int,
+	.disable_counter_int = xgene_pmu_disable_counter_int,
+	.reset_counters = xgene_pmu_reset_counters,
+	.start_counters = xgene_pmu_start_counters,
+	.stop_counters = xgene_pmu_stop_counters,
+};
+
 static const struct of_device_id xgene_pmu_of_match[] = {
 	{ .compatible	= "apm,xgene-pmu",	.data = &xgene_pmu_data },
 	{ .compatible	= "apm,xgene-pmu-v2",	.data = &xgene_pmu_v2_data },
@@ -1310,6 +1782,7 @@ static int xgene_pmu_probe_pmu_dev(struct xgene_pmu *xgene_pmu,
 static const struct acpi_device_id xgene_pmu_acpi_match[] = {
 	{"APMC0D5B", PCP_PMU_V1},
 	{"APMC0D5C", PCP_PMU_V2},
+	{"APMC0D83", PCP_PMU_V3},
 	{},
 };
 MODULE_DEVICE_TABLE(acpi, xgene_pmu_acpi_match);
@@ -1349,7 +1822,10 @@ static int xgene_pmu_probe(struct platform_device *pdev)
 	if (version < 0)
 		return -ENODEV;
 
-	xgene_pmu->ops = &xgene_pmu_ops;
+	if (version != PCP_PMU_V3)
+		xgene_pmu->ops = &xgene_pmu_ops;
+	else
+		xgene_pmu->ops = &xgene_pmu_v3_ops;
 
 	INIT_LIST_HEAD(&xgene_pmu->l3cpmus);
 	INIT_LIST_HEAD(&xgene_pmu->iobpmus);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
  2017-04-03 16:47 ` Hoan Tran
@ 2017-04-17 16:36   ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-17 16:36 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, lkml, linux-doc, Loc Ho, Hoan Tran

Hi All,

Do you have any comments on this patch set?

Thanks

On Mon, Apr 3, 2017 at 9:47 AM, Hoan Tran <hotran@apm.com> wrote:
> This patch set adds support for SoC-wide (AKA uncore) Performance Monitoring
> Unit version 3.
>
> It can support up to
>  - 2 IOB PMU instances
>  - 8 L3C PMU instances
>  - 2 MCB PMU instances
>  - 8 MCU PMU instances
>
> v2:
>  * Split into separate patches
>  * Use the function pointers for the PMU leaf functions
>  * Parse PMU subnode by the match table
>  * Dont allow user change agent id by config1 for SoC PMU v3
>
> Hoan Tran (3):
>   perf: xgene: Parse PMU subnode from the match table
>   perf: xgene: Move PMU leaf functions into function pointer structure
>   perf: xgene: Add support for SoC PMU version 3
>
>  drivers/perf/xgene_pmu.c | 695 +++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 619 insertions(+), 76 deletions(-)
>
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-04-17 16:36   ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-04-17 16:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

Do you have any comments on this patch set?

Thanks

On Mon, Apr 3, 2017 at 9:47 AM, Hoan Tran <hotran@apm.com> wrote:
> This patch set adds support for SoC-wide (AKA uncore) Performance Monitoring
> Unit version 3.
>
> It can support up to
>  - 2 IOB PMU instances
>  - 8 L3C PMU instances
>  - 2 MCB PMU instances
>  - 8 MCU PMU instances
>
> v2:
>  * Split into separate patches
>  * Use the function pointers for the PMU leaf functions
>  * Parse PMU subnode by the match table
>  * Dont allow user change agent id by config1 for SoC PMU v3
>
> Hoan Tran (3):
>   perf: xgene: Parse PMU subnode from the match table
>   perf: xgene: Move PMU leaf functions into function pointer structure
>   perf: xgene: Add support for SoC PMU version 3
>
>  drivers/perf/xgene_pmu.c | 695 +++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 619 insertions(+), 76 deletions(-)
>
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
       [not found] ` <CAFHUOYzJ1PoUV=BmcVxSHZ_8qENm6DW5_Hd3uuFHE9kdqhXCzA@mail.gmail.com>
@ 2017-05-05 15:48     ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-05-05 15:48 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, lkml, linux-doc, Loc Ho, Hoan Tran

Ping!

Thanks
Hoan

On Thu, Apr 13, 2017 at 10:50 AM, Hoan Tran <hotran@apm.com> wrote:
> Hi All,
>
> Do you have any comments on this patch set?
>
> Thanks
> Hoan
>
> On Mon, Apr 3, 2017 at 9:47 AM, Hoan Tran <hotran@apm.com> wrote:
>>
>> This patch set adds support for SoC-wide (AKA uncore) Performance
>> Monitoring
>> Unit version 3.
>>
>> It can support up to
>>  - 2 IOB PMU instances
>>  - 8 L3C PMU instances
>>  - 2 MCB PMU instances
>>  - 8 MCU PMU instances
>>
>> v2:
>>  * Split into separate patches
>>  * Use the function pointers for the PMU leaf functions
>>  * Parse PMU subnode by the match table
>>  * Dont allow user change agent id by config1 for SoC PMU v3
>>
>> Hoan Tran (3):
>>   perf: xgene: Parse PMU subnode from the match table
>>   perf: xgene: Move PMU leaf functions into function pointer structure
>>   perf: xgene: Add support for SoC PMU version 3
>>
>>  drivers/perf/xgene_pmu.c | 695
>> +++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 619 insertions(+), 76 deletions(-)
>>
>> --
>> 1.9.1
>>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-05-05 15:48     ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-05-05 15:48 UTC (permalink / raw)
  To: linux-arm-kernel

Ping!

Thanks
Hoan

On Thu, Apr 13, 2017 at 10:50 AM, Hoan Tran <hotran@apm.com> wrote:
> Hi All,
>
> Do you have any comments on this patch set?
>
> Thanks
> Hoan
>
> On Mon, Apr 3, 2017 at 9:47 AM, Hoan Tran <hotran@apm.com> wrote:
>>
>> This patch set adds support for SoC-wide (AKA uncore) Performance
>> Monitoring
>> Unit version 3.
>>
>> It can support up to
>>  - 2 IOB PMU instances
>>  - 8 L3C PMU instances
>>  - 2 MCB PMU instances
>>  - 8 MCU PMU instances
>>
>> v2:
>>  * Split into separate patches
>>  * Use the function pointers for the PMU leaf functions
>>  * Parse PMU subnode by the match table
>>  * Dont allow user change agent id by config1 for SoC PMU v3
>>
>> Hoan Tran (3):
>>   perf: xgene: Parse PMU subnode from the match table
>>   perf: xgene: Move PMU leaf functions into function pointer structure
>>   perf: xgene: Add support for SoC PMU version 3
>>
>>  drivers/perf/xgene_pmu.c | 695
>> +++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 619 insertions(+), 76 deletions(-)
>>
>> --
>> 1.9.1
>>
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
  2017-05-05 15:48     ` Hoan Tran
@ 2017-05-30 17:29       ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-05-30 17:29 UTC (permalink / raw)
  To: Will Deacon, Mark Rutland, Jonathan Corbet, Tai Nguyen
  Cc: linux-arm-kernel, lkml, linux-doc, Loc Ho, Hoan Tran

Hi All,

Ping again

Thanks
Hoan

On Fri, May 5, 2017 at 8:48 AM, Hoan Tran <hotran@apm.com> wrote:
> Ping!
>
> Thanks
> Hoan
>
> On Thu, Apr 13, 2017 at 10:50 AM, Hoan Tran <hotran@apm.com> wrote:
>> Hi All,
>>
>> Do you have any comments on this patch set?
>>
>> Thanks
>> Hoan
>>
>> On Mon, Apr 3, 2017 at 9:47 AM, Hoan Tran <hotran@apm.com> wrote:
>>>
>>> This patch set adds support for SoC-wide (AKA uncore) Performance
>>> Monitoring
>>> Unit version 3.
>>>
>>> It can support up to
>>>  - 2 IOB PMU instances
>>>  - 8 L3C PMU instances
>>>  - 2 MCB PMU instances
>>>  - 8 MCU PMU instances
>>>
>>> v2:
>>>  * Split into separate patches
>>>  * Use the function pointers for the PMU leaf functions
>>>  * Parse PMU subnode by the match table
>>>  * Dont allow user change agent id by config1 for SoC PMU v3
>>>
>>> Hoan Tran (3):
>>>   perf: xgene: Parse PMU subnode from the match table
>>>   perf: xgene: Move PMU leaf functions into function pointer structure
>>>   perf: xgene: Add support for SoC PMU version 3
>>>
>>>  drivers/perf/xgene_pmu.c | 695
>>> +++++++++++++++++++++++++++++++++++++++++------
>>>  1 file changed, 619 insertions(+), 76 deletions(-)
>>>
>>> --
>>> 1.9.1
>>>
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-05-30 17:29       ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-05-30 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi All,

Ping again

Thanks
Hoan

On Fri, May 5, 2017 at 8:48 AM, Hoan Tran <hotran@apm.com> wrote:
> Ping!
>
> Thanks
> Hoan
>
> On Thu, Apr 13, 2017 at 10:50 AM, Hoan Tran <hotran@apm.com> wrote:
>> Hi All,
>>
>> Do you have any comments on this patch set?
>>
>> Thanks
>> Hoan
>>
>> On Mon, Apr 3, 2017 at 9:47 AM, Hoan Tran <hotran@apm.com> wrote:
>>>
>>> This patch set adds support for SoC-wide (AKA uncore) Performance
>>> Monitoring
>>> Unit version 3.
>>>
>>> It can support up to
>>>  - 2 IOB PMU instances
>>>  - 8 L3C PMU instances
>>>  - 2 MCB PMU instances
>>>  - 8 MCU PMU instances
>>>
>>> v2:
>>>  * Split into separate patches
>>>  * Use the function pointers for the PMU leaf functions
>>>  * Parse PMU subnode by the match table
>>>  * Dont allow user change agent id by config1 for SoC PMU v3
>>>
>>> Hoan Tran (3):
>>>   perf: xgene: Parse PMU subnode from the match table
>>>   perf: xgene: Move PMU leaf functions into function pointer structure
>>>   perf: xgene: Add support for SoC PMU version 3
>>>
>>>  drivers/perf/xgene_pmu.c | 695
>>> +++++++++++++++++++++++++++++++++++++++++------
>>>  1 file changed, 619 insertions(+), 76 deletions(-)
>>>
>>> --
>>> 1.9.1
>>>
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
  2017-04-03 16:47   ` Hoan Tran
@ 2017-06-02 14:59     ` Mark Rutland
  -1 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2017-06-02 14:59 UTC (permalink / raw)
  To: Hoan Tran
  Cc: Will Deacon, Jonathan Corbet, Tai Nguyen, linux-arm-kernel,
	linux-kernel, linux-doc, Loc Ho

Hi Hoan,

Apologies for the last reply.

On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
> +static const struct acpi_device_id xgene_pmu_acpi_type_match[] = {
> +	{"APMC0D5D", PMU_TYPE_L3C},
> +	{"APMC0D5E", PMU_TYPE_IOB},
> +	{"APMC0D5F", PMU_TYPE_MCB},
> +	{"APMC0D60", PMU_TYPE_MC},
> +	{},
> +};
> +
> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
> +					const struct acpi_device_id *ids,
> +					struct acpi_device *adev)
> +{
> +	const struct acpi_device_id *match_id = NULL;
> +	const struct acpi_device_id *id;
> +
> +	for (id = ids; id->id[0] || id->cls; id++) {
> +		if (!acpi_match_device_ids(adev, id))
> +			match_id = id;
> +		else if (match_id)
> +			break;
> +	}
> +
> +	return match_id;
> +}

I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
already iterates over the id table it is given.

> +
>  static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>  				    void *data, void **return_value)
>  {
> +	const struct acpi_device_id *acpi_id;
>  	struct xgene_pmu *xgene_pmu = data;
>  	struct xgene_pmu_dev_ctx *ctx;
>  	struct acpi_device *adev;
> @@ -1059,17 +1085,11 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>  	if (acpi_bus_get_status(adev) || !adev->status.present)
>  		return AE_OK;
>  
> -	if (!strcmp(acpi_device_hid(adev), "APMC0D5D"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_L3C);
> -	else if (!strcmp(acpi_device_hid(adev), "APMC0D5E"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_IOB);
> -	else if (!strcmp(acpi_device_hid(adev), "APMC0D5F"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MCB);
> -	else if (!strcmp(acpi_device_hid(adev), "APMC0D60"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MC);
> -	else
> -		ctx = NULL;
> +	acpi_id = xgene_pmu_acpi_match_type(xgene_pmu_acpi_type_match, adev);
> +	if (!acpi_id)
> +		return AE_OK;

As above, and as I covered in my reply to v1, I think the above should
be:

	acpi_id = acpi_match_device_ids(adev, xgene_pmu_acpi_type_match);
	if (!acpi_id)
		return AE_OK;

... or am I missing something?

With that change:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

>  
> +	ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, (u32)acpi_id->driver_data);
>  	if (!ctx)
>  		return AE_OK;
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
@ 2017-06-02 14:59     ` Mark Rutland
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2017-06-02 14:59 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Hoan,

Apologies for the last reply.

On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
> +static const struct acpi_device_id xgene_pmu_acpi_type_match[] = {
> +	{"APMC0D5D", PMU_TYPE_L3C},
> +	{"APMC0D5E", PMU_TYPE_IOB},
> +	{"APMC0D5F", PMU_TYPE_MCB},
> +	{"APMC0D60", PMU_TYPE_MC},
> +	{},
> +};
> +
> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
> +					const struct acpi_device_id *ids,
> +					struct acpi_device *adev)
> +{
> +	const struct acpi_device_id *match_id = NULL;
> +	const struct acpi_device_id *id;
> +
> +	for (id = ids; id->id[0] || id->cls; id++) {
> +		if (!acpi_match_device_ids(adev, id))
> +			match_id = id;
> +		else if (match_id)
> +			break;
> +	}
> +
> +	return match_id;
> +}

I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
already iterates over the id table it is given.

> +
>  static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>  				    void *data, void **return_value)
>  {
> +	const struct acpi_device_id *acpi_id;
>  	struct xgene_pmu *xgene_pmu = data;
>  	struct xgene_pmu_dev_ctx *ctx;
>  	struct acpi_device *adev;
> @@ -1059,17 +1085,11 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>  	if (acpi_bus_get_status(adev) || !adev->status.present)
>  		return AE_OK;
>  
> -	if (!strcmp(acpi_device_hid(adev), "APMC0D5D"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_L3C);
> -	else if (!strcmp(acpi_device_hid(adev), "APMC0D5E"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_IOB);
> -	else if (!strcmp(acpi_device_hid(adev), "APMC0D5F"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MCB);
> -	else if (!strcmp(acpi_device_hid(adev), "APMC0D60"))
> -		ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MC);
> -	else
> -		ctx = NULL;
> +	acpi_id = xgene_pmu_acpi_match_type(xgene_pmu_acpi_type_match, adev);
> +	if (!acpi_id)
> +		return AE_OK;

As above, and as I covered in my reply to v1, I think the above should
be:

	acpi_id = acpi_match_device_ids(adev, xgene_pmu_acpi_type_match);
	if (!acpi_id)
		return AE_OK;

... or am I missing something?

With that change:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.

>  
> +	ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, (u32)acpi_id->driver_data);
>  	if (!ctx)
>  		return AE_OK;
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3
  2017-04-03 16:47   ` Hoan Tran
@ 2017-06-02 16:04     ` Mark Rutland
  -1 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2017-06-02 16:04 UTC (permalink / raw)
  To: Hoan Tran
  Cc: Will Deacon, Jonathan Corbet, Tai Nguyen, linux-arm-kernel,
	linux-kernel, linux-doc, Loc Ho

Hi Hoan,

Apologies for the delay in getting to this.

On Mon, Apr 03, 2017 at 09:47:57AM -0700, Hoan Tran wrote:
> This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
> Unit version 3.
> 
> It can support up to
>  - 2 IOB PMU instances
>  - 8 L3C PMU instances
>  - 2 MCB PMU instances
>  - 8 MCU PMU instances

Please elaborate on the differences compared with prior versions.

I see that counters are 64-bit. Are there any other important details to
be aware of?

[...]

> +static struct attribute *l3c_pmu_v3_events_attrs[] = {

> +	XGENE_PMU_EVENT_ATTR(read-hit,				0x01),
> +	XGENE_PMU_EVENT_ATTR(read-miss,				0x02),

> +	XGENE_PMU_EVENT_ATTR(reads,				0x08),
> +	XGENE_PMU_EVENT_ATTR(writes,				0x09),

> +};

Some of these are singular (e.g. "read-hit", "read-miss"), while others
are plural (e.g. "reads", "writes").

The existing driver use the singular form. Please remain consistent with
that. e.g. turn "reads" into "read".

> +	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-all,		0x01),

Likewise, please consistently use lower case.

> +	XGENE_PMU_EVENT_ATTR(pmu-act-sent,			0x01),

Surely we don't need "pmu" in the event names?

[...]

>  static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
>  					   int idx)
>  {
>  	return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
>  }

I don't think the cast is necessary. Please remove it.

>  static inline void
> +xgene_pmu_write_counter64(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
> +{
> +	u32 cnt_lo, cnt_hi;
> +
> +	cnt_lo = val & 0xFFFFFFFF;
> +	cnt_hi = val >> 32;

	cnt_hi = upper_32_bits(val);
	cnt_lo = lower_32_bits(val);

(both are in <linux/kernel.h>)

> +
> +	/* v3 has 64-bit counter registers composed by 2 32-bit registers */
> +	xgene_pmu_write_counter32(pmu_dev, 2 * idx, val);
> +	xgene_pmu_write_counter32(pmu_dev, 2 * idx + 1, val >> 32);

Please use cnt_hi and cnt_lo for the writes.

[...]

> @@ -621,7 +989,7 @@ static void xgene_perf_event_set_period(struct perf_event *event)
>  	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
>  	struct hw_perf_event *hw = &event->hw;
>  	/*
> -	 * The X-Gene PMU counters have a period of 2^32. To account for the
> +	 * The X-Gene PMU counters have a period of 2^32 or more. To account for the
>  	 * possiblity of extreme interrupt latency we program for a period of
>  	 * half that. Hopefully we can handle the interrupt before another 2^31
>  	 * events occur and the counter overtakes its previous value.

This comment is now a little out-of-date, as we don't start 64-bit
counters at half their period.

I guess we don't expect a 64-bit counter to overflow, so can we state
that in the comment?

[...]

> -static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
> +static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
>  					 struct platform_device *pdev)
>  {
>  	void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
>  	struct resource *res;
>  	unsigned int reg;
> +	u32 mcb0routing;
> +	u32 mcb1routing;
>  
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>  	csw_csr = devm_ioremap_resource(&pdev->dev, res);
> @@ -899,41 +1309,72 @@ static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
>  		return PTR_ERR(csw_csr);
>  	}
>  
> -	res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
> -	mcba_csr = devm_ioremap_resource(&pdev->dev, res);
> -	if (IS_ERR(mcba_csr)) {
> -		dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
> -		return PTR_ERR(mcba_csr);
> -	}
> -
> -	res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
> -	mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
> -	if (IS_ERR(mcbb_csr)) {
> -		dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
> -		return PTR_ERR(mcbb_csr);
> -	}
> -
>  	reg = readl(csw_csr + CSW_CSWCR);
> -	if (reg & CSW_CSWCR_DUALMCB_MASK) {
> -		/* Dual MCB active */
> -		xgene_pmu->mcb_active_mask = 0x3;
> -		/* Probe all active MC(s) */
> -		reg = readl(mcbb_csr + CSW_CSWCR);
> -		xgene_pmu->mc_active_mask =
> -			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
> +	if (xgene_pmu->version == PCP_PMU_V3) {
> +		mcb0routing = CSW_CSWCR_MCB0_ROUTING(reg);
> +		mcb1routing = CSW_CSWCR_MCB0_ROUTING(reg);
> +		if (reg & CSW_CSWCR_DUALMCB_MASK) {
> +			/* Dual MCB active */
> +			xgene_pmu->mcb_active_mask = 0x3;
> +			/* Probe all active L3C(s), maximum is 8 */
> +			xgene_pmu->l3c_active_mask = 0xFF;
> +			/* Probe all active MC(s), maximum is 8 */
> +			if ((mcb0routing == 0x2) && (mcb1routing == 0x2))
> +				xgene_pmu->mc_active_mask = 0xFF;
> +			else if ((mcb0routing == 0x1) && (mcb1routing == 0x1))
> +				xgene_pmu->mc_active_mask =  0x33;
> +			else
> +				xgene_pmu->mc_active_mask =  0x11;
> +
> +		} else {
> +			/* Single MCB active */
> +			xgene_pmu->mcb_active_mask = 0x1;
> +			/* Probe all active L3C(s), maximum is 4 */
> +			xgene_pmu->l3c_active_mask = 0x0F;
> +			/* Probe all active MC(s), maximum is 4 */
> +			if ((mcb0routing == 0x2) && (mcb1routing == 0x0))
> +				xgene_pmu->mc_active_mask = 0x0F;
> +			else if ((mcb0routing == 0x1) && (mcb1routing == 0x0))
> +				xgene_pmu->mc_active_mask =  0x03;
> +			else
> +				xgene_pmu->mc_active_mask =  0x01;
> +		}
>  	} else {
> -		/* Single MCB active */
> -		xgene_pmu->mcb_active_mask = 0x1;
> -		/* Probe all active MC(s) */
> -		reg = readl(mcba_csr + CSW_CSWCR);
> -		xgene_pmu->mc_active_mask =
> -			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
> -	}
> +		res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
> +		mcba_csr = devm_ioremap_resource(&pdev->dev, res);
> +		if (IS_ERR(mcba_csr)) {
> +			dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
> +			return PTR_ERR(mcba_csr);
> +		}
> +
> +		res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
> +		mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
> +		if (IS_ERR(mcbb_csr)) {
> +			dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
> +			return PTR_ERR(mcbb_csr);
> +		}
>  
> +		xgene_pmu->l3c_active_mask = 0x1;
> +		if (reg & CSW_CSWCR_DUALMCB_MASK) {
> +			/* Dual MCB active */
> +			xgene_pmu->mcb_active_mask = 0x3;
> +			/* Probe all active MC(s) */
> +			reg = readl(mcbb_csr + CSW_CSWCR);
> +			xgene_pmu->mc_active_mask =
> +				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
> +		} else {
> +			/* Single MCB active */
> +			xgene_pmu->mcb_active_mask = 0x1;
> +			/* Probe all active MC(s) */
> +			reg = readl(mcba_csr + CSW_CSWCR);
> +			xgene_pmu->mc_active_mask =
> +				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
> +		}
> +	}
>  	return 0;
>  }

Please split these into separate functions. e.g. have
acpi_pmu_probe_active_mcb_mcu_v1_2() and
acpi_pmu_probe_active_mcb_mcu_v3().

>  static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
> @@ -997,6 +1439,8 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
>  		return devm_kasprintf(dev, GFP_KERNEL, "l3c%d", id);
>  	case PMU_TYPE_IOB:
>  		return devm_kasprintf(dev, GFP_KERNEL, "iob%d", id);
> +	case PMU_TYPE_IOB_SLOW:
> +		return devm_kasprintf(dev, GFP_KERNEL, "iob-slow%d", id);

What is IOB slow? How does it relate to the existing IOB?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-06-02 16:04     ` Mark Rutland
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2017-06-02 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Hoan,

Apologies for the delay in getting to this.

On Mon, Apr 03, 2017 at 09:47:57AM -0700, Hoan Tran wrote:
> This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
> Unit version 3.
> 
> It can support up to
>  - 2 IOB PMU instances
>  - 8 L3C PMU instances
>  - 2 MCB PMU instances
>  - 8 MCU PMU instances

Please elaborate on the differences compared with prior versions.

I see that counters are 64-bit. Are there any other important details to
be aware of?

[...]

> +static struct attribute *l3c_pmu_v3_events_attrs[] = {

> +	XGENE_PMU_EVENT_ATTR(read-hit,				0x01),
> +	XGENE_PMU_EVENT_ATTR(read-miss,				0x02),

> +	XGENE_PMU_EVENT_ATTR(reads,				0x08),
> +	XGENE_PMU_EVENT_ATTR(writes,				0x09),

> +};

Some of these are singular (e.g. "read-hit", "read-miss"), while others
are plural (e.g. "reads", "writes").

The existing driver use the singular form. Please remain consistent with
that. e.g. turn "reads" into "read".

> +	XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-all,		0x01),

Likewise, please consistently use lower case.

> +	XGENE_PMU_EVENT_ATTR(pmu-act-sent,			0x01),

Surely we don't need "pmu" in the event names?

[...]

>  static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
>  					   int idx)
>  {
>  	return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
>  }

I don't think the cast is necessary. Please remove it.

>  static inline void
> +xgene_pmu_write_counter64(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
> +{
> +	u32 cnt_lo, cnt_hi;
> +
> +	cnt_lo = val & 0xFFFFFFFF;
> +	cnt_hi = val >> 32;

	cnt_hi = upper_32_bits(val);
	cnt_lo = lower_32_bits(val);

(both are in <linux/kernel.h>)

> +
> +	/* v3 has 64-bit counter registers composed by 2 32-bit registers */
> +	xgene_pmu_write_counter32(pmu_dev, 2 * idx, val);
> +	xgene_pmu_write_counter32(pmu_dev, 2 * idx + 1, val >> 32);

Please use cnt_hi and cnt_lo for the writes.

[...]

> @@ -621,7 +989,7 @@ static void xgene_perf_event_set_period(struct perf_event *event)
>  	struct xgene_pmu *xgene_pmu = pmu_dev->parent;
>  	struct hw_perf_event *hw = &event->hw;
>  	/*
> -	 * The X-Gene PMU counters have a period of 2^32. To account for the
> +	 * The X-Gene PMU counters have a period of 2^32 or more. To account for the
>  	 * possiblity of extreme interrupt latency we program for a period of
>  	 * half that. Hopefully we can handle the interrupt before another 2^31
>  	 * events occur and the counter overtakes its previous value.

This comment is now a little out-of-date, as we don't start 64-bit
counters at half their period.

I guess we don't expect a 64-bit counter to overflow, so can we state
that in the comment?

[...]

> -static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
> +static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
>  					 struct platform_device *pdev)
>  {
>  	void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
>  	struct resource *res;
>  	unsigned int reg;
> +	u32 mcb0routing;
> +	u32 mcb1routing;
>  
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>  	csw_csr = devm_ioremap_resource(&pdev->dev, res);
> @@ -899,41 +1309,72 @@ static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
>  		return PTR_ERR(csw_csr);
>  	}
>  
> -	res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
> -	mcba_csr = devm_ioremap_resource(&pdev->dev, res);
> -	if (IS_ERR(mcba_csr)) {
> -		dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
> -		return PTR_ERR(mcba_csr);
> -	}
> -
> -	res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
> -	mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
> -	if (IS_ERR(mcbb_csr)) {
> -		dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
> -		return PTR_ERR(mcbb_csr);
> -	}
> -
>  	reg = readl(csw_csr + CSW_CSWCR);
> -	if (reg & CSW_CSWCR_DUALMCB_MASK) {
> -		/* Dual MCB active */
> -		xgene_pmu->mcb_active_mask = 0x3;
> -		/* Probe all active MC(s) */
> -		reg = readl(mcbb_csr + CSW_CSWCR);
> -		xgene_pmu->mc_active_mask =
> -			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
> +	if (xgene_pmu->version == PCP_PMU_V3) {
> +		mcb0routing = CSW_CSWCR_MCB0_ROUTING(reg);
> +		mcb1routing = CSW_CSWCR_MCB0_ROUTING(reg);
> +		if (reg & CSW_CSWCR_DUALMCB_MASK) {
> +			/* Dual MCB active */
> +			xgene_pmu->mcb_active_mask = 0x3;
> +			/* Probe all active L3C(s), maximum is 8 */
> +			xgene_pmu->l3c_active_mask = 0xFF;
> +			/* Probe all active MC(s), maximum is 8 */
> +			if ((mcb0routing == 0x2) && (mcb1routing == 0x2))
> +				xgene_pmu->mc_active_mask = 0xFF;
> +			else if ((mcb0routing == 0x1) && (mcb1routing == 0x1))
> +				xgene_pmu->mc_active_mask =  0x33;
> +			else
> +				xgene_pmu->mc_active_mask =  0x11;
> +
> +		} else {
> +			/* Single MCB active */
> +			xgene_pmu->mcb_active_mask = 0x1;
> +			/* Probe all active L3C(s), maximum is 4 */
> +			xgene_pmu->l3c_active_mask = 0x0F;
> +			/* Probe all active MC(s), maximum is 4 */
> +			if ((mcb0routing == 0x2) && (mcb1routing == 0x0))
> +				xgene_pmu->mc_active_mask = 0x0F;
> +			else if ((mcb0routing == 0x1) && (mcb1routing == 0x0))
> +				xgene_pmu->mc_active_mask =  0x03;
> +			else
> +				xgene_pmu->mc_active_mask =  0x01;
> +		}
>  	} else {
> -		/* Single MCB active */
> -		xgene_pmu->mcb_active_mask = 0x1;
> -		/* Probe all active MC(s) */
> -		reg = readl(mcba_csr + CSW_CSWCR);
> -		xgene_pmu->mc_active_mask =
> -			(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
> -	}
> +		res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
> +		mcba_csr = devm_ioremap_resource(&pdev->dev, res);
> +		if (IS_ERR(mcba_csr)) {
> +			dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
> +			return PTR_ERR(mcba_csr);
> +		}
> +
> +		res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
> +		mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
> +		if (IS_ERR(mcbb_csr)) {
> +			dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
> +			return PTR_ERR(mcbb_csr);
> +		}
>  
> +		xgene_pmu->l3c_active_mask = 0x1;
> +		if (reg & CSW_CSWCR_DUALMCB_MASK) {
> +			/* Dual MCB active */
> +			xgene_pmu->mcb_active_mask = 0x3;
> +			/* Probe all active MC(s) */
> +			reg = readl(mcbb_csr + CSW_CSWCR);
> +			xgene_pmu->mc_active_mask =
> +				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
> +		} else {
> +			/* Single MCB active */
> +			xgene_pmu->mcb_active_mask = 0x1;
> +			/* Probe all active MC(s) */
> +			reg = readl(mcba_csr + CSW_CSWCR);
> +			xgene_pmu->mc_active_mask =
> +				(reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
> +		}
> +	}
>  	return 0;
>  }

Please split these into separate functions. e.g. have
acpi_pmu_probe_active_mcb_mcu_v1_2() and
acpi_pmu_probe_active_mcb_mcu_v3().

>  static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
> @@ -997,6 +1439,8 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
>  		return devm_kasprintf(dev, GFP_KERNEL, "l3c%d", id);
>  	case PMU_TYPE_IOB:
>  		return devm_kasprintf(dev, GFP_KERNEL, "iob%d", id);
> +	case PMU_TYPE_IOB_SLOW:
> +		return devm_kasprintf(dev, GFP_KERNEL, "iob-slow%d", id);

What is IOB slow? How does it relate to the existing IOB?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
  2017-06-02 14:59     ` Mark Rutland
@ 2017-06-02 16:54       ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-06-02 16:54 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Jonathan Corbet, Tai Nguyen, linux-arm-kernel, lkml,
	linux-doc, Loc Ho

Hi Mark

On Fri, Jun 2, 2017 at 7:59 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Hoan,
>
> Apologies for the last reply.
>
> On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
>> +static const struct acpi_device_id xgene_pmu_acpi_type_match[] = {
>> +     {"APMC0D5D", PMU_TYPE_L3C},
>> +     {"APMC0D5E", PMU_TYPE_IOB},
>> +     {"APMC0D5F", PMU_TYPE_MCB},
>> +     {"APMC0D60", PMU_TYPE_MC},
>> +     {},
>> +};
>> +
>> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
>> +                                     const struct acpi_device_id *ids,
>> +                                     struct acpi_device *adev)
>> +{
>> +     const struct acpi_device_id *match_id = NULL;
>> +     const struct acpi_device_id *id;
>> +
>> +     for (id = ids; id->id[0] || id->cls; id++) {
>> +             if (!acpi_match_device_ids(adev, id))
>> +                     match_id = id;
>> +             else if (match_id)
>> +                     break;
>> +     }
>> +
>> +     return match_id;
>> +}
>
> I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
> already iterates over the id table it is given.

The acpi_match_device_ids() function just returns if a device ID is
available on the given list. It does not return the first matching ID.
That's the reason I created this function to find the first matching ID.

>
>> +
>>  static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>>                                   void *data, void **return_value)
>>  {
>> +     const struct acpi_device_id *acpi_id;
>>       struct xgene_pmu *xgene_pmu = data;
>>       struct xgene_pmu_dev_ctx *ctx;
>>       struct acpi_device *adev;
>> @@ -1059,17 +1085,11 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>>       if (acpi_bus_get_status(adev) || !adev->status.present)
>>               return AE_OK;
>>
>> -     if (!strcmp(acpi_device_hid(adev), "APMC0D5D"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_L3C);
>> -     else if (!strcmp(acpi_device_hid(adev), "APMC0D5E"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_IOB);
>> -     else if (!strcmp(acpi_device_hid(adev), "APMC0D5F"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MCB);
>> -     else if (!strcmp(acpi_device_hid(adev), "APMC0D60"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MC);
>> -     else
>> -             ctx = NULL;
>> +     acpi_id = xgene_pmu_acpi_match_type(xgene_pmu_acpi_type_match, adev);
>> +     if (!acpi_id)
>> +             return AE_OK;
>
> As above, and as I covered in my reply to v1, I think the above should
> be:
>
>         acpi_id = acpi_match_device_ids(adev, xgene_pmu_acpi_type_match);
>         if (!acpi_id)
>                 return AE_OK;
>
> ... or am I missing something?

The same above.

Thanks
Hoan

>
> With that change:
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
>
> Thanks,
> Mark.
>
>>
>> +     ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, (u32)acpi_id->driver_data);
>>       if (!ctx)
>>               return AE_OK;
>>
>> --
>> 1.9.1
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
@ 2017-06-02 16:54       ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-06-02 16:54 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark

On Fri, Jun 2, 2017 at 7:59 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Hoan,
>
> Apologies for the last reply.
>
> On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
>> +static const struct acpi_device_id xgene_pmu_acpi_type_match[] = {
>> +     {"APMC0D5D", PMU_TYPE_L3C},
>> +     {"APMC0D5E", PMU_TYPE_IOB},
>> +     {"APMC0D5F", PMU_TYPE_MCB},
>> +     {"APMC0D60", PMU_TYPE_MC},
>> +     {},
>> +};
>> +
>> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
>> +                                     const struct acpi_device_id *ids,
>> +                                     struct acpi_device *adev)
>> +{
>> +     const struct acpi_device_id *match_id = NULL;
>> +     const struct acpi_device_id *id;
>> +
>> +     for (id = ids; id->id[0] || id->cls; id++) {
>> +             if (!acpi_match_device_ids(adev, id))
>> +                     match_id = id;
>> +             else if (match_id)
>> +                     break;
>> +     }
>> +
>> +     return match_id;
>> +}
>
> I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
> already iterates over the id table it is given.

The acpi_match_device_ids() function just returns if a device ID is
available on the given list. It does not return the first matching ID.
That's the reason I created this function to find the first matching ID.

>
>> +
>>  static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>>                                   void *data, void **return_value)
>>  {
>> +     const struct acpi_device_id *acpi_id;
>>       struct xgene_pmu *xgene_pmu = data;
>>       struct xgene_pmu_dev_ctx *ctx;
>>       struct acpi_device *adev;
>> @@ -1059,17 +1085,11 @@ static acpi_status acpi_pmu_dev_add(acpi_handle handle, u32 level,
>>       if (acpi_bus_get_status(adev) || !adev->status.present)
>>               return AE_OK;
>>
>> -     if (!strcmp(acpi_device_hid(adev), "APMC0D5D"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_L3C);
>> -     else if (!strcmp(acpi_device_hid(adev), "APMC0D5E"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_IOB);
>> -     else if (!strcmp(acpi_device_hid(adev), "APMC0D5F"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MCB);
>> -     else if (!strcmp(acpi_device_hid(adev), "APMC0D60"))
>> -             ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, PMU_TYPE_MC);
>> -     else
>> -             ctx = NULL;
>> +     acpi_id = xgene_pmu_acpi_match_type(xgene_pmu_acpi_type_match, adev);
>> +     if (!acpi_id)
>> +             return AE_OK;
>
> As above, and as I covered in my reply to v1, I think the above should
> be:
>
>         acpi_id = acpi_match_device_ids(adev, xgene_pmu_acpi_type_match);
>         if (!acpi_id)
>                 return AE_OK;
>
> ... or am I missing something?

The same above.

Thanks
Hoan

>
> With that change:
>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
>
> Thanks,
> Mark.
>
>>
>> +     ctx = acpi_get_pmu_hw_inf(xgene_pmu, adev, (u32)acpi_id->driver_data);
>>       if (!ctx)
>>               return AE_OK;
>>
>> --
>> 1.9.1
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
  2017-06-02 16:54       ` Hoan Tran
@ 2017-06-02 17:23         ` Mark Rutland
  -1 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2017-06-02 17:23 UTC (permalink / raw)
  To: Hoan Tran
  Cc: Will Deacon, Jonathan Corbet, Tai Nguyen, linux-arm-kernel, lkml,
	linux-doc, Loc Ho

On Fri, Jun 02, 2017 at 09:54:32AM -0700, Hoan Tran wrote:
> On Fri, Jun 2, 2017 at 7:59 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
> >> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
> >> +                                     const struct acpi_device_id *ids,
> >> +                                     struct acpi_device *adev)
> >> +{
> >> +     const struct acpi_device_id *match_id = NULL;
> >> +     const struct acpi_device_id *id;
> >> +
> >> +     for (id = ids; id->id[0] || id->cls; id++) {
> >> +             if (!acpi_match_device_ids(adev, id))
> >> +                     match_id = id;
> >> +             else if (match_id)
> >> +                     break;
> >> +     }
> >> +
> >> +     return match_id;
> >> +}
> >
> > I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
> > already iterates over the id table it is given.
> 
> The acpi_match_device_ids() function just returns if a device ID is
> available on the given list. It does not return the first matching ID.
> That's the reason I created this function to find the first matching ID.

Ah, I see. Thanks for correcting me!

Can we use acpi_match_device(ids, &adev->dev), or is that the wrong dev?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
@ 2017-06-02 17:23         ` Mark Rutland
  0 siblings, 0 replies; 26+ messages in thread
From: Mark Rutland @ 2017-06-02 17:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 02, 2017 at 09:54:32AM -0700, Hoan Tran wrote:
> On Fri, Jun 2, 2017 at 7:59 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
> >> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
> >> +                                     const struct acpi_device_id *ids,
> >> +                                     struct acpi_device *adev)
> >> +{
> >> +     const struct acpi_device_id *match_id = NULL;
> >> +     const struct acpi_device_id *id;
> >> +
> >> +     for (id = ids; id->id[0] || id->cls; id++) {
> >> +             if (!acpi_match_device_ids(adev, id))
> >> +                     match_id = id;
> >> +             else if (match_id)
> >> +                     break;
> >> +     }
> >> +
> >> +     return match_id;
> >> +}
> >
> > I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
> > already iterates over the id table it is given.
> 
> The acpi_match_device_ids() function just returns if a device ID is
> available on the given list. It does not return the first matching ID.
> That's the reason I created this function to find the first matching ID.

Ah, I see. Thanks for correcting me!

Can we use acpi_match_device(ids, &adev->dev), or is that the wrong dev?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
  2017-06-02 17:23         ` Mark Rutland
@ 2017-06-02 20:41           ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-06-02 20:41 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Jonathan Corbet, Tai Nguyen, linux-arm-kernel, lkml,
	linux-doc, Loc Ho

Hi Mark,

On Fri, Jun 2, 2017 at 10:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Fri, Jun 02, 2017 at 09:54:32AM -0700, Hoan Tran wrote:
>> On Fri, Jun 2, 2017 at 7:59 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
>> >> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
>> >> +                                     const struct acpi_device_id *ids,
>> >> +                                     struct acpi_device *adev)
>> >> +{
>> >> +     const struct acpi_device_id *match_id = NULL;
>> >> +     const struct acpi_device_id *id;
>> >> +
>> >> +     for (id = ids; id->id[0] || id->cls; id++) {
>> >> +             if (!acpi_match_device_ids(adev, id))
>> >> +                     match_id = id;
>> >> +             else if (match_id)
>> >> +                     break;
>> >> +     }
>> >> +
>> >> +     return match_id;
>> >> +}
>> >
>> > I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
>> > already iterates over the id table it is given.
>>
>> The acpi_match_device_ids() function just returns if a device ID is
>> available on the given list. It does not return the first matching ID.
>> That's the reason I created this function to find the first matching ID.
>
> Ah, I see. Thanks for correcting me!
>
> Can we use acpi_match_device(ids, &adev->dev), or is that the wrong dev?

They are subnode device, so they don't have full dev. Because of that,
acpi_match_device doesn't work.

Thanks
Hoan

>
> Thanks,
> Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table
@ 2017-06-02 20:41           ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-06-02 20:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Fri, Jun 2, 2017 at 10:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Fri, Jun 02, 2017 at 09:54:32AM -0700, Hoan Tran wrote:
>> On Fri, Jun 2, 2017 at 7:59 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > On Mon, Apr 03, 2017 at 09:47:55AM -0700, Hoan Tran wrote:
>> >> +static const struct acpi_device_id *xgene_pmu_acpi_match_type(
>> >> +                                     const struct acpi_device_id *ids,
>> >> +                                     struct acpi_device *adev)
>> >> +{
>> >> +     const struct acpi_device_id *match_id = NULL;
>> >> +     const struct acpi_device_id *id;
>> >> +
>> >> +     for (id = ids; id->id[0] || id->cls; id++) {
>> >> +             if (!acpi_match_device_ids(adev, id))
>> >> +                     match_id = id;
>> >> +             else if (match_id)
>> >> +                     break;
>> >> +     }
>> >> +
>> >> +     return match_id;
>> >> +}
>> >
>> > I don't believe this look is necessary. AFAICT, acpi_match_device_ids()
>> > already iterates over the id table it is given.
>>
>> The acpi_match_device_ids() function just returns if a device ID is
>> available on the given list. It does not return the first matching ID.
>> That's the reason I created this function to find the first matching ID.
>
> Ah, I see. Thanks for correcting me!
>
> Can we use acpi_match_device(ids, &adev->dev), or is that the wrong dev?

They are subnode device, so they don't have full dev. Because of that,
acpi_match_device doesn't work.

Thanks
Hoan

>
> Thanks,
> Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3
  2017-06-02 16:04     ` Mark Rutland
@ 2017-06-02 21:02       ` Hoan Tran
  -1 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-06-02 21:02 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Jonathan Corbet, Tai Nguyen, linux-arm-kernel, lkml,
	linux-doc, Loc Ho

Hi Mark,

On Fri, Jun 2, 2017 at 9:04 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Hoan,
>
> Apologies for the delay in getting to this.
>
> On Mon, Apr 03, 2017 at 09:47:57AM -0700, Hoan Tran wrote:
>> This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
>> Unit version 3.
>>
>> It can support up to
>>  - 2 IOB PMU instances
>>  - 8 L3C PMU instances
>>  - 2 MCB PMU instances
>>  - 8 MCU PMU instances
>
> Please elaborate on the differences compared with prior versions.
>
> I see that counters are 64-bit. Are there any other important details to
> be aware of?

Yes, let me add 64 bit counter into the patch description. Beside of
that, I don't think anything else besides more PMU instances compared
with the previous version.

>
> [...]
>
>> +static struct attribute *l3c_pmu_v3_events_attrs[] = {
>
>> +     XGENE_PMU_EVENT_ATTR(read-hit,                          0x01),
>> +     XGENE_PMU_EVENT_ATTR(read-miss,                         0x02),
>
>> +     XGENE_PMU_EVENT_ATTR(reads,                             0x08),
>> +     XGENE_PMU_EVENT_ATTR(writes,                            0x09),
>
>> +};
>
> Some of these are singular (e.g. "read-hit", "read-miss"), while others
> are plural (e.g. "reads", "writes").
>
> The existing driver use the singular form. Please remain consistent with
> that. e.g. turn "reads" into "read".

Will fix it.

>
>> +     XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-all,              0x01),
>
> Likewise, please consistently use lower case.

Yes

>
>> +     XGENE_PMU_EVENT_ATTR(pmu-act-sent,                      0x01),
>
> Surely we don't need "pmu" in the event names?

Yes, will remove "pmu".

>
> [...]
>
>>  static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
>>                                          int idx)
>>  {
>>       return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
>>  }
>
> I don't think the cast is necessary. Please remove it.
>
>>  static inline void
>> +xgene_pmu_write_counter64(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
>> +{
>> +     u32 cnt_lo, cnt_hi;
>> +
>> +     cnt_lo = val & 0xFFFFFFFF;
>> +     cnt_hi = val >> 32;
>
>         cnt_hi = upper_32_bits(val);
>         cnt_lo = lower_32_bits(val);
>
> (both are in <linux/kernel.h>)
>
>> +
>> +     /* v3 has 64-bit counter registers composed by 2 32-bit registers */
>> +     xgene_pmu_write_counter32(pmu_dev, 2 * idx, val);
>> +     xgene_pmu_write_counter32(pmu_dev, 2 * idx + 1, val >> 32);
>
> Please use cnt_hi and cnt_lo for the writes.
>
> [...]
>
>> @@ -621,7 +989,7 @@ static void xgene_perf_event_set_period(struct perf_event *event)
>>       struct xgene_pmu *xgene_pmu = pmu_dev->parent;
>>       struct hw_perf_event *hw = &event->hw;
>>       /*
>> -      * The X-Gene PMU counters have a period of 2^32. To account for the
>> +      * The X-Gene PMU counters have a period of 2^32 or more. To account for the
>>        * possiblity of extreme interrupt latency we program for a period of
>>        * half that. Hopefully we can handle the interrupt before another 2^31
>>        * events occur and the counter overtakes its previous value.
>
> This comment is now a little out-of-date, as we don't start 64-bit
> counters at half their period.
>
> I guess we don't expect a 64-bit counter to overflow, so can we state
> that in the comment?

How about this one.

        /*
         * For 32 bit counter, it has a period of 2^32. To account for the
         * possibility of extreme interrupt latency we program for a period of
         * half that. Hopefully, we can handle the interrupt before another 2^31
         * events occur and the counter overtakes its previous value.
         * For 64 bit counter, we don't expect it overflow.
         */

>
> [...]
>
>> -static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
>> +static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
>>                                        struct platform_device *pdev)
>>  {
>>       void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
>>       struct resource *res;
>>       unsigned int reg;
>> +     u32 mcb0routing;
>> +     u32 mcb1routing;
>>
>>       res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>>       csw_csr = devm_ioremap_resource(&pdev->dev, res);
>> @@ -899,41 +1309,72 @@ static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
>>               return PTR_ERR(csw_csr);
>>       }
>>
>> -     res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
>> -     mcba_csr = devm_ioremap_resource(&pdev->dev, res);
>> -     if (IS_ERR(mcba_csr)) {
>> -             dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
>> -             return PTR_ERR(mcba_csr);
>> -     }
>> -
>> -     res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
>> -     mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
>> -     if (IS_ERR(mcbb_csr)) {
>> -             dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
>> -             return PTR_ERR(mcbb_csr);
>> -     }
>> -
>>       reg = readl(csw_csr + CSW_CSWCR);
>> -     if (reg & CSW_CSWCR_DUALMCB_MASK) {
>> -             /* Dual MCB active */
>> -             xgene_pmu->mcb_active_mask = 0x3;
>> -             /* Probe all active MC(s) */
>> -             reg = readl(mcbb_csr + CSW_CSWCR);
>> -             xgene_pmu->mc_active_mask =
>> -                     (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
>> +     if (xgene_pmu->version == PCP_PMU_V3) {
>> +             mcb0routing = CSW_CSWCR_MCB0_ROUTING(reg);
>> +             mcb1routing = CSW_CSWCR_MCB0_ROUTING(reg);
>> +             if (reg & CSW_CSWCR_DUALMCB_MASK) {
>> +                     /* Dual MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x3;
>> +                     /* Probe all active L3C(s), maximum is 8 */
>> +                     xgene_pmu->l3c_active_mask = 0xFF;
>> +                     /* Probe all active MC(s), maximum is 8 */
>> +                     if ((mcb0routing == 0x2) && (mcb1routing == 0x2))
>> +                             xgene_pmu->mc_active_mask = 0xFF;
>> +                     else if ((mcb0routing == 0x1) && (mcb1routing == 0x1))
>> +                             xgene_pmu->mc_active_mask =  0x33;
>> +                     else
>> +                             xgene_pmu->mc_active_mask =  0x11;
>> +
>> +             } else {
>> +                     /* Single MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x1;
>> +                     /* Probe all active L3C(s), maximum is 4 */
>> +                     xgene_pmu->l3c_active_mask = 0x0F;
>> +                     /* Probe all active MC(s), maximum is 4 */
>> +                     if ((mcb0routing == 0x2) && (mcb1routing == 0x0))
>> +                             xgene_pmu->mc_active_mask = 0x0F;
>> +                     else if ((mcb0routing == 0x1) && (mcb1routing == 0x0))
>> +                             xgene_pmu->mc_active_mask =  0x03;
>> +                     else
>> +                             xgene_pmu->mc_active_mask =  0x01;
>> +             }
>>       } else {
>> -             /* Single MCB active */
>> -             xgene_pmu->mcb_active_mask = 0x1;
>> -             /* Probe all active MC(s) */
>> -             reg = readl(mcba_csr + CSW_CSWCR);
>> -             xgene_pmu->mc_active_mask =
>> -                     (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
>> -     }
>> +             res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
>> +             mcba_csr = devm_ioremap_resource(&pdev->dev, res);
>> +             if (IS_ERR(mcba_csr)) {
>> +                     dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
>> +                     return PTR_ERR(mcba_csr);
>> +             }
>> +
>> +             res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
>> +             mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
>> +             if (IS_ERR(mcbb_csr)) {
>> +                     dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
>> +                     return PTR_ERR(mcbb_csr);
>> +             }
>>
>> +             xgene_pmu->l3c_active_mask = 0x1;
>> +             if (reg & CSW_CSWCR_DUALMCB_MASK) {
>> +                     /* Dual MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x3;
>> +                     /* Probe all active MC(s) */
>> +                     reg = readl(mcbb_csr + CSW_CSWCR);
>> +                     xgene_pmu->mc_active_mask =
>> +                             (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
>> +             } else {
>> +                     /* Single MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x1;
>> +                     /* Probe all active MC(s) */
>> +                     reg = readl(mcba_csr + CSW_CSWCR);
>> +                     xgene_pmu->mc_active_mask =
>> +                             (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
>> +             }
>> +     }
>>       return 0;
>>  }
>
> Please split these into separate functions. e.g. have
> acpi_pmu_probe_active_mcb_mcu_v1_2() and
> acpi_pmu_probe_active_mcb_mcu_v3().
>

Yes, I can do it.

>>  static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
>> @@ -997,6 +1439,8 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
>>               return devm_kasprintf(dev, GFP_KERNEL, "l3c%d", id);
>>       case PMU_TYPE_IOB:
>>               return devm_kasprintf(dev, GFP_KERNEL, "iob%d", id);
>> +     case PMU_TYPE_IOB_SLOW:
>> +             return devm_kasprintf(dev, GFP_KERNEL, "iob-slow%d", id);
>
> What is IOB slow? How does it relate to the existing IOB?

IOB SLOW PMU is another PMU, it's not related to IOB PMU. They are
sitting in 2 different clock domains.

Thanks
Hoan

>
> Thanks,
> Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3
@ 2017-06-02 21:02       ` Hoan Tran
  0 siblings, 0 replies; 26+ messages in thread
From: Hoan Tran @ 2017-06-02 21:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On Fri, Jun 2, 2017 at 9:04 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi Hoan,
>
> Apologies for the delay in getting to this.
>
> On Mon, Apr 03, 2017 at 09:47:57AM -0700, Hoan Tran wrote:
>> This patch adds support for SoC-wide (AKA uncore) Performance Monitoring
>> Unit version 3.
>>
>> It can support up to
>>  - 2 IOB PMU instances
>>  - 8 L3C PMU instances
>>  - 2 MCB PMU instances
>>  - 8 MCU PMU instances
>
> Please elaborate on the differences compared with prior versions.
>
> I see that counters are 64-bit. Are there any other important details to
> be aware of?

Yes, let me add 64 bit counter into the patch description. Beside of
that, I don't think anything else besides more PMU instances compared
with the previous version.

>
> [...]
>
>> +static struct attribute *l3c_pmu_v3_events_attrs[] = {
>
>> +     XGENE_PMU_EVENT_ATTR(read-hit,                          0x01),
>> +     XGENE_PMU_EVENT_ATTR(read-miss,                         0x02),
>
>> +     XGENE_PMU_EVENT_ATTR(reads,                             0x08),
>> +     XGENE_PMU_EVENT_ATTR(writes,                            0x09),
>
>> +};
>
> Some of these are singular (e.g. "read-hit", "read-miss"), while others
> are plural (e.g. "reads", "writes").
>
> The existing driver use the singular form. Please remain consistent with
> that. e.g. turn "reads" into "read".

Will fix it.

>
>> +     XGENE_PMU_EVENT_ATTR(PA-req-buf-alloc-all,              0x01),
>
> Likewise, please consistently use lower case.

Yes

>
>> +     XGENE_PMU_EVENT_ATTR(pmu-act-sent,                      0x01),
>
> Surely we don't need "pmu" in the event names?

Yes, will remove "pmu".

>
> [...]
>
>>  static inline u64 xgene_pmu_read_counter32(struct xgene_pmu_dev *pmu_dev,
>>                                          int idx)
>>  {
>>       return (u64)readl(pmu_dev->inf->csr + PMU_PMEVCNTR0 + (4 * idx));
>>  }
>
> I don't think the cast is necessary. Please remove it.
>
>>  static inline void
>> +xgene_pmu_write_counter64(struct xgene_pmu_dev *pmu_dev, int idx, u64 val)
>> +{
>> +     u32 cnt_lo, cnt_hi;
>> +
>> +     cnt_lo = val & 0xFFFFFFFF;
>> +     cnt_hi = val >> 32;
>
>         cnt_hi = upper_32_bits(val);
>         cnt_lo = lower_32_bits(val);
>
> (both are in <linux/kernel.h>)
>
>> +
>> +     /* v3 has 64-bit counter registers composed by 2 32-bit registers */
>> +     xgene_pmu_write_counter32(pmu_dev, 2 * idx, val);
>> +     xgene_pmu_write_counter32(pmu_dev, 2 * idx + 1, val >> 32);
>
> Please use cnt_hi and cnt_lo for the writes.
>
> [...]
>
>> @@ -621,7 +989,7 @@ static void xgene_perf_event_set_period(struct perf_event *event)
>>       struct xgene_pmu *xgene_pmu = pmu_dev->parent;
>>       struct hw_perf_event *hw = &event->hw;
>>       /*
>> -      * The X-Gene PMU counters have a period of 2^32. To account for the
>> +      * The X-Gene PMU counters have a period of 2^32 or more. To account for the
>>        * possiblity of extreme interrupt latency we program for a period of
>>        * half that. Hopefully we can handle the interrupt before another 2^31
>>        * events occur and the counter overtakes its previous value.
>
> This comment is now a little out-of-date, as we don't start 64-bit
> counters at half their period.
>
> I guess we don't expect a 64-bit counter to overflow, so can we state
> that in the comment?

How about this one.

        /*
         * For 32 bit counter, it has a period of 2^32. To account for the
         * possibility of extreme interrupt latency we program for a period of
         * half that. Hopefully, we can handle the interrupt before another 2^31
         * events occur and the counter overtakes its previous value.
         * For 64 bit counter, we don't expect it overflow.
         */

>
> [...]
>
>> -static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
>> +static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
>>                                        struct platform_device *pdev)
>>  {
>>       void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
>>       struct resource *res;
>>       unsigned int reg;
>> +     u32 mcb0routing;
>> +     u32 mcb1routing;
>>
>>       res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>>       csw_csr = devm_ioremap_resource(&pdev->dev, res);
>> @@ -899,41 +1309,72 @@ static int acpi_pmu_probe_active_mcb_mcu(struct xgene_pmu *xgene_pmu,
>>               return PTR_ERR(csw_csr);
>>       }
>>
>> -     res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
>> -     mcba_csr = devm_ioremap_resource(&pdev->dev, res);
>> -     if (IS_ERR(mcba_csr)) {
>> -             dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
>> -             return PTR_ERR(mcba_csr);
>> -     }
>> -
>> -     res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
>> -     mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
>> -     if (IS_ERR(mcbb_csr)) {
>> -             dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
>> -             return PTR_ERR(mcbb_csr);
>> -     }
>> -
>>       reg = readl(csw_csr + CSW_CSWCR);
>> -     if (reg & CSW_CSWCR_DUALMCB_MASK) {
>> -             /* Dual MCB active */
>> -             xgene_pmu->mcb_active_mask = 0x3;
>> -             /* Probe all active MC(s) */
>> -             reg = readl(mcbb_csr + CSW_CSWCR);
>> -             xgene_pmu->mc_active_mask =
>> -                     (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
>> +     if (xgene_pmu->version == PCP_PMU_V3) {
>> +             mcb0routing = CSW_CSWCR_MCB0_ROUTING(reg);
>> +             mcb1routing = CSW_CSWCR_MCB0_ROUTING(reg);
>> +             if (reg & CSW_CSWCR_DUALMCB_MASK) {
>> +                     /* Dual MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x3;
>> +                     /* Probe all active L3C(s), maximum is 8 */
>> +                     xgene_pmu->l3c_active_mask = 0xFF;
>> +                     /* Probe all active MC(s), maximum is 8 */
>> +                     if ((mcb0routing == 0x2) && (mcb1routing == 0x2))
>> +                             xgene_pmu->mc_active_mask = 0xFF;
>> +                     else if ((mcb0routing == 0x1) && (mcb1routing == 0x1))
>> +                             xgene_pmu->mc_active_mask =  0x33;
>> +                     else
>> +                             xgene_pmu->mc_active_mask =  0x11;
>> +
>> +             } else {
>> +                     /* Single MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x1;
>> +                     /* Probe all active L3C(s), maximum is 4 */
>> +                     xgene_pmu->l3c_active_mask = 0x0F;
>> +                     /* Probe all active MC(s), maximum is 4 */
>> +                     if ((mcb0routing == 0x2) && (mcb1routing == 0x0))
>> +                             xgene_pmu->mc_active_mask = 0x0F;
>> +                     else if ((mcb0routing == 0x1) && (mcb1routing == 0x0))
>> +                             xgene_pmu->mc_active_mask =  0x03;
>> +                     else
>> +                             xgene_pmu->mc_active_mask =  0x01;
>> +             }
>>       } else {
>> -             /* Single MCB active */
>> -             xgene_pmu->mcb_active_mask = 0x1;
>> -             /* Probe all active MC(s) */
>> -             reg = readl(mcba_csr + CSW_CSWCR);
>> -             xgene_pmu->mc_active_mask =
>> -                     (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
>> -     }
>> +             res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
>> +             mcba_csr = devm_ioremap_resource(&pdev->dev, res);
>> +             if (IS_ERR(mcba_csr)) {
>> +                     dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
>> +                     return PTR_ERR(mcba_csr);
>> +             }
>> +
>> +             res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
>> +             mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
>> +             if (IS_ERR(mcbb_csr)) {
>> +                     dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
>> +                     return PTR_ERR(mcbb_csr);
>> +             }
>>
>> +             xgene_pmu->l3c_active_mask = 0x1;
>> +             if (reg & CSW_CSWCR_DUALMCB_MASK) {
>> +                     /* Dual MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x3;
>> +                     /* Probe all active MC(s) */
>> +                     reg = readl(mcbb_csr + CSW_CSWCR);
>> +                     xgene_pmu->mc_active_mask =
>> +                             (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0xF : 0x5;
>> +             } else {
>> +                     /* Single MCB active */
>> +                     xgene_pmu->mcb_active_mask = 0x1;
>> +                     /* Probe all active MC(s) */
>> +                     reg = readl(mcba_csr + CSW_CSWCR);
>> +                     xgene_pmu->mc_active_mask =
>> +                             (reg & MCBADDRMR_DUALMCU_MODE_MASK) ? 0x3 : 0x1;
>> +             }
>> +     }
>>       return 0;
>>  }
>
> Please split these into separate functions. e.g. have
> acpi_pmu_probe_active_mcb_mcu_v1_2() and
> acpi_pmu_probe_active_mcb_mcu_v3().
>

Yes, I can do it.

>>  static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
>> @@ -997,6 +1439,8 @@ static char *xgene_pmu_dev_name(struct device *dev, u32 type, int id)
>>               return devm_kasprintf(dev, GFP_KERNEL, "l3c%d", id);
>>       case PMU_TYPE_IOB:
>>               return devm_kasprintf(dev, GFP_KERNEL, "iob%d", id);
>> +     case PMU_TYPE_IOB_SLOW:
>> +             return devm_kasprintf(dev, GFP_KERNEL, "iob-slow%d", id);
>
> What is IOB slow? How does it relate to the existing IOB?

IOB SLOW PMU is another PMU, it's not related to IOB PMU. They are
sitting in 2 different clock domains.

Thanks
Hoan

>
> Thanks,
> Mark.

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-06-02 21:03 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-03 16:47 [PATCH v2 0/3] perf: xgene: Add support for SoC PMU version 3 Hoan Tran
2017-04-03 16:47 ` Hoan Tran
2017-04-03 16:47 ` [PATCH v2 1/3] perf: xgene: Parse PMU subnode from the match table Hoan Tran
2017-04-03 16:47   ` Hoan Tran
2017-06-02 14:59   ` Mark Rutland
2017-06-02 14:59     ` Mark Rutland
2017-06-02 16:54     ` Hoan Tran
2017-06-02 16:54       ` Hoan Tran
2017-06-02 17:23       ` Mark Rutland
2017-06-02 17:23         ` Mark Rutland
2017-06-02 20:41         ` Hoan Tran
2017-06-02 20:41           ` Hoan Tran
2017-04-03 16:47 ` [PATCH v2 2/3] perf: xgene: Move PMU leaf functions into function pointer structure Hoan Tran
2017-04-03 16:47   ` Hoan Tran
2017-04-03 16:47 ` [PATCH v2 3/3] perf: xgene: Add support for SoC PMU version 3 Hoan Tran
2017-04-03 16:47   ` Hoan Tran
2017-06-02 16:04   ` Mark Rutland
2017-06-02 16:04     ` Mark Rutland
2017-06-02 21:02     ` Hoan Tran
2017-06-02 21:02       ` Hoan Tran
2017-04-17 16:36 ` [PATCH v2 0/3] " Hoan Tran
2017-04-17 16:36   ` Hoan Tran
     [not found] ` <CAFHUOYzJ1PoUV=BmcVxSHZ_8qENm6DW5_Hd3uuFHE9kdqhXCzA@mail.gmail.com>
2017-05-05 15:48   ` Hoan Tran
2017-05-05 15:48     ` Hoan Tran
2017-05-30 17:29     ` Hoan Tran
2017-05-30 17:29       ` Hoan Tran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.