All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-08 10:21 ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

Hi,

I'm trying to make the perf tool play better with PMUs in heterogeneous systems
(e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
of CPUs.

Currently perf-record doesn't work for these PMUs, unless forced to use
per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
failing to mmap, as the evlist->cpus map contains a different set of CPUs from
the evsel->cpus map populated from the cpumask. This is addressed by the
penultimate patch in this series.

Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
opening and reading events"), from version two of this series, perf-stat would
behave erroneously in the presence of a cpumask file, blocking forever after
the workload completed. While this is now fixed, existing binaries (e.g. those
shipped by distributions) would be broken by the addition of a cpumask file
kernel-side.

To cater for this, this series adds support for a new PMU sysfs file, named
'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
covers. As old binaries will not look for this, this can be safely added to the
kernel without risk of breakage.

I've included the kernel and userspace parts in this series as they've proven
difficult to review in isolation.

Thanks,
Mark.

Since v1 [1]:
* Avoid double cpu_map__idx() call in perf_evlist__mmap_per_evsel
* Look for a supported_cpumask file when a cpumask file is not present

Since v2 [2]:
* Drop patches which have been picked up from v2
* Rebase to v4.8-rc1
* Better describe the issue in the supported_cpumask patch

Since v3 [3]:
* Prepend patches exporting the cpus file for ARM PMUs
* s/supported_cpus/cpus/

[1] http://lkml.kernel.org/r/1467907474-3290-1-git-send-email-mark.rutland@arm.com
[2] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland@arm.com
[3] http://lkml.kernel.org/r/1470933366-1364-1-git-send-email-mark.rutland@arm.com

Mark Rutland (7):
  drivers/perf: arm_pmu: add common attr group fields
  arm64: perf: move to common attr_group fields
  arm: perf: move to common attr_group fields
  drivers/perf: arm_pmu: only use common attr_groups
  drivers/perf: arm_pmu: expose a cpumask in sysfs
  perf: util: only open events on CPUs an evsel permits
  perf: util: support alternative sysfs cpumask

 arch/arm/kernel/perf_event_v7.c | 47 ++++++++++++++++++++++++-----------------
 arch/arm64/kernel/perf_event.c  | 36 ++++++++++++++++++++-----------
 drivers/perf/arm_pmu.c          | 23 ++++++++++++++++++++
 include/linux/perf/arm_pmu.h    | 10 ++++++++-
 tools/perf/util/evlist.c        |  8 ++++++-
 tools/perf/util/pmu.c           | 15 ++++++++++---
 6 files changed, 103 insertions(+), 36 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-08 10:21 ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I'm trying to make the perf tool play better with PMUs in heterogeneous systems
(e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
of CPUs.

Currently perf-record doesn't work for these PMUs, unless forced to use
per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
failing to mmap, as the evlist->cpus map contains a different set of CPUs from
the evsel->cpus map populated from the cpumask. This is addressed by the
penultimate patch in this series.

Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
opening and reading events"), from version two of this series, perf-stat would
behave erroneously in the presence of a cpumask file, blocking forever after
the workload completed. While this is now fixed, existing binaries (e.g. those
shipped by distributions) would be broken by the addition of a cpumask file
kernel-side.

To cater for this, this series adds support for a new PMU sysfs file, named
'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
covers. As old binaries will not look for this, this can be safely added to the
kernel without risk of breakage.

I've included the kernel and userspace parts in this series as they've proven
difficult to review in isolation.

Thanks,
Mark.

Since v1 [1]:
* Avoid double cpu_map__idx() call in perf_evlist__mmap_per_evsel
* Look for a supported_cpumask file when a cpumask file is not present

Since v2 [2]:
* Drop patches which have been picked up from v2
* Rebase to v4.8-rc1
* Better describe the issue in the supported_cpumask patch

Since v3 [3]:
* Prepend patches exporting the cpus file for ARM PMUs
* s/supported_cpus/cpus/

[1] http://lkml.kernel.org/r/1467907474-3290-1-git-send-email-mark.rutland at arm.com
[2] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland at arm.com
[3] http://lkml.kernel.org/r/1470933366-1364-1-git-send-email-mark.rutland at arm.com

Mark Rutland (7):
  drivers/perf: arm_pmu: add common attr group fields
  arm64: perf: move to common attr_group fields
  arm: perf: move to common attr_group fields
  drivers/perf: arm_pmu: only use common attr_groups
  drivers/perf: arm_pmu: expose a cpumask in sysfs
  perf: util: only open events on CPUs an evsel permits
  perf: util: support alternative sysfs cpumask

 arch/arm/kernel/perf_event_v7.c | 47 ++++++++++++++++++++++++-----------------
 arch/arm64/kernel/perf_event.c  | 36 ++++++++++++++++++++-----------
 drivers/perf/arm_pmu.c          | 23 ++++++++++++++++++++
 include/linux/perf/arm_pmu.h    | 10 ++++++++-
 tools/perf/util/evlist.c        |  8 ++++++-
 tools/perf/util/pmu.c           | 15 ++++++++++---
 6 files changed, 103 insertions(+), 36 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

In preparation for adding common attribute groups, add an array of
attribute group pointers to arm_pmu, which will be used if the
backend hasn't already set pmu::attr_groups.

Subsequent patches will move backends over to using these, before adding
common fields.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c       | 3 +++
 include/linux/perf/arm_pmu.h | 9 ++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index f5e1008..145caf4 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -1039,6 +1039,9 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 		goto out_free;
 	}
 
+	if (!pmu->pmu.attr_groups)
+		pmu->pmu.attr_groups = pmu->attr_groups;
+
 	ret = cpu_pmu_init(pmu);
 	if (ret)
 		goto out_free;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index e188438..8030814 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -14,7 +14,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/perf_event.h>
-
+#include <linux/sysfs.h>
 #include <asm/cputype.h>
 
 /*
@@ -77,6 +77,12 @@ struct pmu_hw_events {
 	struct arm_pmu		*percpu_pmu;
 };
 
+enum armpmu_attr_groups {
+	ARMPMU_ATTR_GROUP_EVENTS,
+	ARMPMU_ATTR_GROUP_FORMATS,
+	ARMPMU_NR_ATTR_GROUPS
+};
+
 struct arm_pmu {
 	struct pmu	pmu;
 	cpumask_t	active_irqs;
@@ -111,6 +117,7 @@ struct arm_pmu {
 	struct pmu_hw_events	__percpu *hw_events;
 	struct list_head	entry;
 	struct notifier_block	cpu_pm_nb;
+	const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];
 };
 
 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

In preparation for adding common attribute groups, add an array of
attribute group pointers to arm_pmu, which will be used if the
backend hasn't already set pmu::attr_groups.

Subsequent patches will move backends over to using these, before adding
common fields.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c       | 3 +++
 include/linux/perf/arm_pmu.h | 9 ++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index f5e1008..145caf4 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -1039,6 +1039,9 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 		goto out_free;
 	}
 
+	if (!pmu->pmu.attr_groups)
+		pmu->pmu.attr_groups = pmu->attr_groups;
+
 	ret = cpu_pmu_init(pmu);
 	if (ret)
 		goto out_free;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index e188438..8030814 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -14,7 +14,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/perf_event.h>
-
+#include <linux/sysfs.h>
 #include <asm/cputype.h>
 
 /*
@@ -77,6 +77,12 @@ struct pmu_hw_events {
 	struct arm_pmu		*percpu_pmu;
 };
 
+enum armpmu_attr_groups {
+	ARMPMU_ATTR_GROUP_EVENTS,
+	ARMPMU_ATTR_GROUP_FORMATS,
+	ARMPMU_NR_ATTR_GROUPS
+};
+
 struct arm_pmu {
 	struct pmu	pmu;
 	cpumask_t	active_irqs;
@@ -111,6 +117,7 @@ struct arm_pmu {
 	struct pmu_hw_events	__percpu *hw_events;
 	struct list_head	entry;
 	struct notifier_block	cpu_pm_nb;
+	const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];
 };
 
 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 2/7] arm64: perf: move to common attr_group fields
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/perf_event.c | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 838ccf1..bcc8dff 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -523,12 +523,6 @@ static struct attribute_group armv8_pmuv3_format_attr_group = {
 	.attrs = armv8_pmuv3_format_attrs,
 };
 
-static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
-	&armv8_pmuv3_events_attr_group,
-	&armv8_pmuv3_format_attr_group,
-	NULL,
-};
-
 /*
  * Perf Events' indices
  */
@@ -985,7 +979,10 @@ static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_pmuv3";
 	cpu_pmu->map_event		= armv8_pmuv3_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -994,7 +991,10 @@ static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cortex_a53";
 	cpu_pmu->map_event		= armv8_a53_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1003,7 +1003,10 @@ static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cortex_a57";
 	cpu_pmu->map_event		= armv8_a57_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1012,7 +1015,10 @@ static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cortex_a72";
 	cpu_pmu->map_event		= armv8_a57_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1021,7 +1027,10 @@ static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cavium_thunder";
 	cpu_pmu->map_event		= armv8_thunder_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1030,7 +1039,10 @@ static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_brcm_vulcan";
 	cpu_pmu->map_event		= armv8_vulcan_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 2/7] arm64: perf: move to common attr_group fields
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/kernel/perf_event.c | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 838ccf1..bcc8dff 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -523,12 +523,6 @@ static struct attribute_group armv8_pmuv3_format_attr_group = {
 	.attrs = armv8_pmuv3_format_attrs,
 };
 
-static const struct attribute_group *armv8_pmuv3_attr_groups[] = {
-	&armv8_pmuv3_events_attr_group,
-	&armv8_pmuv3_format_attr_group,
-	NULL,
-};
-
 /*
  * Perf Events' indices
  */
@@ -985,7 +979,10 @@ static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_pmuv3";
 	cpu_pmu->map_event		= armv8_pmuv3_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -994,7 +991,10 @@ static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cortex_a53";
 	cpu_pmu->map_event		= armv8_a53_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1003,7 +1003,10 @@ static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cortex_a57";
 	cpu_pmu->map_event		= armv8_a57_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1012,7 +1015,10 @@ static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cortex_a72";
 	cpu_pmu->map_event		= armv8_a57_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1021,7 +1027,10 @@ static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_cavium_thunder";
 	cpu_pmu->map_event		= armv8_thunder_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
@@ -1030,7 +1039,10 @@ static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu)
 	armv8_pmu_init(cpu_pmu);
 	cpu_pmu->name			= "armv8_brcm_vulcan";
 	cpu_pmu->map_event		= armv8_vulcan_map_event;
-	cpu_pmu->pmu.attr_groups	= armv8_pmuv3_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv8_pmuv3_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv8_pmuv3_format_attr_group;
 	return armv8pmu_probe_pmu(cpu_pmu);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 3/7] arm: perf: move to common attr_group fields
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/perf_event_v7.c | 47 ++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 1506385..b942349 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -596,12 +596,6 @@ static struct attribute_group armv7_pmuv1_events_attr_group = {
 	.attrs = armv7_pmuv1_event_attrs,
 };
 
-static const struct attribute_group *armv7_pmuv1_attr_groups[] = {
-	&armv7_pmuv1_events_attr_group,
-	&armv7_pmu_format_attr_group,
-	NULL,
-};
-
 ARMV7_EVENT_ATTR(mem_access, ARMV7_PERFCTR_MEM_ACCESS);
 ARMV7_EVENT_ATTR(l1i_cache, ARMV7_PERFCTR_L1_ICACHE_ACCESS);
 ARMV7_EVENT_ATTR(l1d_cache_wb, ARMV7_PERFCTR_L1_DCACHE_WB);
@@ -653,12 +647,6 @@ static struct attribute_group armv7_pmuv2_events_attr_group = {
 	.attrs = armv7_pmuv2_event_attrs,
 };
 
-static const struct attribute_group *armv7_pmuv2_attr_groups[] = {
-	&armv7_pmuv2_events_attr_group,
-	&armv7_pmu_format_attr_group,
-	NULL,
-};
-
 /*
  * Perf Events' indices
  */
@@ -1208,7 +1196,10 @@ static int armv7_a8_pmu_init(struct arm_pmu *cpu_pmu)
 	armv7pmu_init(cpu_pmu);
 	cpu_pmu->name		= "armv7_cortex_a8";
 	cpu_pmu->map_event	= armv7_a8_map_event;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv1_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1217,7 +1208,10 @@ static int armv7_a9_pmu_init(struct arm_pmu *cpu_pmu)
 	armv7pmu_init(cpu_pmu);
 	cpu_pmu->name		= "armv7_cortex_a9";
 	cpu_pmu->map_event	= armv7_a9_map_event;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv1_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1226,7 +1220,10 @@ static int armv7_a5_pmu_init(struct arm_pmu *cpu_pmu)
 	armv7pmu_init(cpu_pmu);
 	cpu_pmu->name		= "armv7_cortex_a5";
 	cpu_pmu->map_event	= armv7_a5_map_event;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv1_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1236,7 +1233,10 @@ static int armv7_a15_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->name		= "armv7_cortex_a15";
 	cpu_pmu->map_event	= armv7_a15_map_event;
 	cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1246,7 +1246,10 @@ static int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->name		= "armv7_cortex_a7";
 	cpu_pmu->map_event	= armv7_a7_map_event;
 	cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1256,7 +1259,10 @@ static int armv7_a12_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->name		= "armv7_cortex_a12";
 	cpu_pmu->map_event	= armv7_a12_map_event;
 	cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1264,7 +1270,10 @@ static int armv7_a17_pmu_init(struct arm_pmu *cpu_pmu)
 {
 	int ret = armv7_a12_pmu_init(cpu_pmu);
 	cpu_pmu->name = "armv7_cortex_a17";
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return ret;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 3/7] arm: perf: move to common attr_group fields
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/perf_event_v7.c | 47 ++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 1506385..b942349 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -596,12 +596,6 @@ static struct attribute_group armv7_pmuv1_events_attr_group = {
 	.attrs = armv7_pmuv1_event_attrs,
 };
 
-static const struct attribute_group *armv7_pmuv1_attr_groups[] = {
-	&armv7_pmuv1_events_attr_group,
-	&armv7_pmu_format_attr_group,
-	NULL,
-};
-
 ARMV7_EVENT_ATTR(mem_access, ARMV7_PERFCTR_MEM_ACCESS);
 ARMV7_EVENT_ATTR(l1i_cache, ARMV7_PERFCTR_L1_ICACHE_ACCESS);
 ARMV7_EVENT_ATTR(l1d_cache_wb, ARMV7_PERFCTR_L1_DCACHE_WB);
@@ -653,12 +647,6 @@ static struct attribute_group armv7_pmuv2_events_attr_group = {
 	.attrs = armv7_pmuv2_event_attrs,
 };
 
-static const struct attribute_group *armv7_pmuv2_attr_groups[] = {
-	&armv7_pmuv2_events_attr_group,
-	&armv7_pmu_format_attr_group,
-	NULL,
-};
-
 /*
  * Perf Events' indices
  */
@@ -1208,7 +1196,10 @@ static int armv7_a8_pmu_init(struct arm_pmu *cpu_pmu)
 	armv7pmu_init(cpu_pmu);
 	cpu_pmu->name		= "armv7_cortex_a8";
 	cpu_pmu->map_event	= armv7_a8_map_event;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv1_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1217,7 +1208,10 @@ static int armv7_a9_pmu_init(struct arm_pmu *cpu_pmu)
 	armv7pmu_init(cpu_pmu);
 	cpu_pmu->name		= "armv7_cortex_a9";
 	cpu_pmu->map_event	= armv7_a9_map_event;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv1_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1226,7 +1220,10 @@ static int armv7_a5_pmu_init(struct arm_pmu *cpu_pmu)
 	armv7pmu_init(cpu_pmu);
 	cpu_pmu->name		= "armv7_cortex_a5";
 	cpu_pmu->map_event	= armv7_a5_map_event;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv1_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv1_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1236,7 +1233,10 @@ static int armv7_a15_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->name		= "armv7_cortex_a15";
 	cpu_pmu->map_event	= armv7_a15_map_event;
 	cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1246,7 +1246,10 @@ static int armv7_a7_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->name		= "armv7_cortex_a7";
 	cpu_pmu->map_event	= armv7_a7_map_event;
 	cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1256,7 +1259,10 @@ static int armv7_a12_pmu_init(struct arm_pmu *cpu_pmu)
 	cpu_pmu->name		= "armv7_cortex_a12";
 	cpu_pmu->map_event	= armv7_a12_map_event;
 	cpu_pmu->set_event_filter = armv7pmu_set_event_filter;
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return armv7_probe_num_events(cpu_pmu);
 }
 
@@ -1264,7 +1270,10 @@ static int armv7_a17_pmu_init(struct arm_pmu *cpu_pmu)
 {
 	int ret = armv7_a12_pmu_init(cpu_pmu);
 	cpu_pmu->name = "armv7_cortex_a17";
-	cpu_pmu->pmu.attr_groups = armv7_pmuv2_attr_groups;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] =
+		&armv7_pmuv2_events_attr_group;
+	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] =
+		&armv7_pmu_format_attr_group;
 	return ret;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 4/7] drivers/perf: arm_pmu: only use common attr_groups
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

Now that the 32-bit and 64-bit perf backends use the common groups
directly, remove the fallback and no longer allow the groups array to be
overridden.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 145caf4..ac83e1e 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -549,6 +549,7 @@ static void armpmu_init(struct arm_pmu *armpmu)
 		.stop		= armpmu_stop,
 		.read		= armpmu_read,
 		.filter_match	= armpmu_filter_match,
+		.attr_groups	= armpmu->attr_groups,
 	};
 }
 
@@ -1039,8 +1040,6 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 		goto out_free;
 	}
 
-	if (!pmu->pmu.attr_groups)
-		pmu->pmu.attr_groups = pmu->attr_groups;
 
 	ret = cpu_pmu_init(pmu);
 	if (ret)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 4/7] drivers/perf: arm_pmu: only use common attr_groups
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

Now that the 32-bit and 64-bit perf backends use the common groups
directly, remove the fallback and no longer allow the groups array to be
overridden.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 145caf4..ac83e1e 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -549,6 +549,7 @@ static void armpmu_init(struct arm_pmu *armpmu)
 		.stop		= armpmu_stop,
 		.read		= armpmu_read,
 		.filter_match	= armpmu_filter_match,
+		.attr_groups	= armpmu->attr_groups,
 	};
 }
 
@@ -1039,8 +1040,6 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 		goto out_free;
 	}
 
-	if (!pmu->pmu.attr_groups)
-		pmu->pmu.attr_groups = pmu->attr_groups;
 
 	ret = cpu_pmu_init(pmu);
 	if (ret)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
each of which covers a subset of CPUs in the system. In some cases
userspace needs to know which CPUs a given logical PMU covers, so we'd
like to expose a cpumask under sysfs, similar to what is done for uncore
PMUs.

Unfortunately, prior to commit 00e727bb389359c8 ("perf stat: Balance
opening and reading events"), perf stat only correctly handled a cpumask
holding a single CPU, and only when profiling in system-wide mode. In
other cases, the presence of a cpumask file could cause perf stat to
behave erratically.

Thus, exposing a cpumask file would break older perf binaries in cases
where they would otherwise work.

To avoid this issue while still providing userspace with the information
it needs, this patch exposes a differently-named file (cpus) under
sysfs. New tools can look for this and operate correctly, while older
tools will not be adversely affected by its presence.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c       | 21 +++++++++++++++++++++
 include/linux/perf/arm_pmu.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index ac83e1e..09e9944 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -534,6 +534,25 @@ static int armpmu_filter_match(struct perf_event *event)
 	return cpumask_test_cpu(cpu, &armpmu->supported_cpus);
 }
 
+static ssize_t armpmu_cpumask_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
+	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
+}
+
+static struct device_attribute armpmu_cpumask_attr =
+		__ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);
+
+static struct attribute *armpmu_common_attrs[] = {
+	&armpmu_cpumask_attr.attr,
+	NULL,
+};
+
+static struct attribute_group armpmu_common_attr_group = {
+	.attrs = armpmu_common_attrs,
+};
+
 static void armpmu_init(struct arm_pmu *armpmu)
 {
 	atomic_set(&armpmu->active_events, 0);
@@ -551,6 +570,8 @@ static void armpmu_init(struct arm_pmu *armpmu)
 		.filter_match	= armpmu_filter_match,
 		.attr_groups	= armpmu->attr_groups,
 	};
+	armpmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
+		&armpmu_common_attr_group;
 }
 
 /* Set at runtime when we know what CPU type we are. */
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 8030814..4040b90 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -78,6 +78,7 @@ struct pmu_hw_events {
 };
 
 enum armpmu_attr_groups {
+	ARMPMU_ATTR_GROUP_COMMON,
 	ARMPMU_ATTR_GROUP_EVENTS,
 	ARMPMU_ATTR_GROUP_FORMATS,
 	ARMPMU_NR_ATTR_GROUPS
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
each of which covers a subset of CPUs in the system. In some cases
userspace needs to know which CPUs a given logical PMU covers, so we'd
like to expose a cpumask under sysfs, similar to what is done for uncore
PMUs.

Unfortunately, prior to commit 00e727bb389359c8 ("perf stat: Balance
opening and reading events"), perf stat only correctly handled a cpumask
holding a single CPU, and only when profiling in system-wide mode. In
other cases, the presence of a cpumask file could cause perf stat to
behave erratically.

Thus, exposing a cpumask file would break older perf binaries in cases
where they would otherwise work.

To avoid this issue while still providing userspace with the information
it needs, this patch exposes a differently-named file (cpus) under
sysfs. New tools can look for this and operate correctly, while older
tools will not be adversely affected by its presence.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 drivers/perf/arm_pmu.c       | 21 +++++++++++++++++++++
 include/linux/perf/arm_pmu.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index ac83e1e..09e9944 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -534,6 +534,25 @@ static int armpmu_filter_match(struct perf_event *event)
 	return cpumask_test_cpu(cpu, &armpmu->supported_cpus);
 }
 
+static ssize_t armpmu_cpumask_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
+	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
+}
+
+static struct device_attribute armpmu_cpumask_attr =
+		__ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);
+
+static struct attribute *armpmu_common_attrs[] = {
+	&armpmu_cpumask_attr.attr,
+	NULL,
+};
+
+static struct attribute_group armpmu_common_attr_group = {
+	.attrs = armpmu_common_attrs,
+};
+
 static void armpmu_init(struct arm_pmu *armpmu)
 {
 	atomic_set(&armpmu->active_events, 0);
@@ -551,6 +570,8 @@ static void armpmu_init(struct arm_pmu *armpmu)
 		.filter_match	= armpmu_filter_match,
 		.attr_groups	= armpmu->attr_groups,
 	};
+	armpmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
+		&armpmu_common_attr_group;
 }
 
 /* Set at runtime when we know what CPU type we are. */
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 8030814..4040b90 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -78,6 +78,7 @@ struct pmu_hw_events {
 };
 
 enum armpmu_attr_groups {
+	ARMPMU_ATTR_GROUP_COMMON,
 	ARMPMU_ATTR_GROUP_EVENTS,
 	ARMPMU_ATTR_GROUP_FORMATS,
 	ARMPMU_NR_ATTR_GROUPS
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 6/7] perf: util: only open events on CPUs an evsel permits
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

In systems with heterogeneous CPU PMUs, it's possible for each evsel to
cover a distinct set of CPUs, and hence the cpu_map associated with each
evsel may have a distinct idx<->id mapping. Any of these may be distinct from
the evlist's cpu map.

Events can be tied to the same fd so long as they use the same per-cpu
ringbuffer (i.e. so long as they are on the same CPU). To acquire the
correct FDs, we must compare the Linux logical IDs rather than the evsel
or evlist indices.

This path adds logic to perf_evlist__mmap_per_evsel to handle this,
translating IDs as required. As PMUs may cover a subset of CPUs from the
evlist, we skip the CPUs a PMU cannot handle.

Without this patch, perf record may try to mmap erroneous FDs on
heterogeneous systems, and will bail out early rather than running the
workload.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/util/evlist.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 097b3ed..ea34c5a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1032,16 +1032,18 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
+				       struct mmap_params *mp, int cpu_idx,
 				       int thread, int *_output, int *_output_backward)
 {
 	struct perf_evsel *evsel;
 	int revent;
+	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each_entry(evlist, evsel) {
 		struct perf_mmap *maps = evlist->mmap;
 		int *output = _output;
 		int fd;
+		int cpu;
 
 		if (evsel->attr.write_backward) {
 			output = _output_backward;
@@ -1060,6 +1062,10 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		if (evsel->system_wide && thread)
 			continue;
 
+		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);
+		if (cpu == -1)
+			continue;
+
 		fd = FD(evsel, cpu, thread);
 
 		if (*output == -1) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 6/7] perf: util: only open events on CPUs an evsel permits
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

In systems with heterogeneous CPU PMUs, it's possible for each evsel to
cover a distinct set of CPUs, and hence the cpu_map associated with each
evsel may have a distinct idx<->id mapping. Any of these may be distinct from
the evlist's cpu map.

Events can be tied to the same fd so long as they use the same per-cpu
ringbuffer (i.e. so long as they are on the same CPU). To acquire the
correct FDs, we must compare the Linux logical IDs rather than the evsel
or evlist indices.

This path adds logic to perf_evlist__mmap_per_evsel to handle this,
translating IDs as required. As PMUs may cover a subset of CPUs from the
evlist, we skip the CPUs a PMU cannot handle.

Without this patch, perf record may try to mmap erroneous FDs on
heterogeneous systems, and will bail out early rather than running the
workload.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel at vger.kernel.org
---
 tools/perf/util/evlist.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 097b3ed..ea34c5a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1032,16 +1032,18 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
+				       struct mmap_params *mp, int cpu_idx,
 				       int thread, int *_output, int *_output_backward)
 {
 	struct perf_evsel *evsel;
 	int revent;
+	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each_entry(evlist, evsel) {
 		struct perf_mmap *maps = evlist->mmap;
 		int *output = _output;
 		int fd;
+		int cpu;
 
 		if (evsel->attr.write_backward) {
 			output = _output_backward;
@@ -1060,6 +1062,10 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		if (evsel->system_wide && thread)
 			continue;
 
+		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);
+		if (cpu == -1)
+			continue;
+
 		fd = FD(evsel, cpu, thread);
 
 		if (*output == -1) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 7/7] perf: util: support alternative sysfs cpumask
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:21   ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: acme, alexander.shishkin, jolsa, mark.rutland, mingo, peterz,
	will.deacon

The perf tools can read a cpumask file for a PMU, describing a subset of
CPUs which that PMU covers. So far this has only been used to cater for
uncore PMUs, which in practice happen to only have a single CPU
described in the mask.

Until recently, the perf tools only correctly handled cpumask containing
a single CPU, and only when monitoring in system-wide mode. For example,
prior to commit 00e727bb389359c8 ("perf stat: Balance opening and
reading events"), a mask with more than a single CPU could cause
perf stat to hang. When a CPU PMU covers a subset of CPUs, but lacks a
cpumask, perf record will fail to open events (on the cores the PMU does
not support), and gives up.

For systems with heterogeneous CPUs such as ARM big.LITTLE systems, this
presents a problem. We have a PMU for each microarchitecture (e.g. a big
PMU and a little PMU), and would like to expose a cpumask for each (so
as to allow perf record and other tools to do the right thing). However,
doing so kernel-side will cause old perf binaries to not function (e.g.
hitting the issue solved by 00e727bb389359c8), and thus commits the
cardinal sin of breaking (existing) userspace.

To address this chicken-and-egg problem, this patch adds support got a
new file, cpus, which is largely identical to the existing cpumask file.
A kernel can expose this file, knowing that new perf binaries will
correctly support it, while old perf binaries will not look for it (and
thus will not be broken).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/util/pmu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ddb0261..2babcdf 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -445,14 +445,23 @@ static struct cpu_map *pmu_cpumask(const char *name)
 	FILE *file;
 	struct cpu_map *cpus;
 	const char *sysfs = sysfs__mountpoint();
+	const char *templates[] = {
+		 "%s/bus/event_source/devices/%s/cpumask",
+		 "%s/bus/event_source/devices/%s/cpus",
+		 NULL
+	};
+	const char **template;
 
 	if (!sysfs)
 		return NULL;
 
-	snprintf(path, PATH_MAX,
-		 "%s/bus/event_source/devices/%s/cpumask", sysfs, name);
+	for (template = templates; *template; template++) {
+		snprintf(path, PATH_MAX, *template, sysfs, name);
+		if (stat(path, &st) == 0)
+			break;
+	}
 
-	if (stat(path, &st) < 0)
+	if (!*template)
 		return NULL;
 
 	file = fopen(path, "r");
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFCv4 7/7] perf: util: support alternative sysfs cpumask
@ 2016-09-08 10:21   ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 10:21 UTC (permalink / raw)
  To: linux-arm-kernel

The perf tools can read a cpumask file for a PMU, describing a subset of
CPUs which that PMU covers. So far this has only been used to cater for
uncore PMUs, which in practice happen to only have a single CPU
described in the mask.

Until recently, the perf tools only correctly handled cpumask containing
a single CPU, and only when monitoring in system-wide mode. For example,
prior to commit 00e727bb389359c8 ("perf stat: Balance opening and
reading events"), a mask with more than a single CPU could cause
perf stat to hang. When a CPU PMU covers a subset of CPUs, but lacks a
cpumask, perf record will fail to open events (on the cores the PMU does
not support), and gives up.

For systems with heterogeneous CPUs such as ARM big.LITTLE systems, this
presents a problem. We have a PMU for each microarchitecture (e.g. a big
PMU and a little PMU), and would like to expose a cpumask for each (so
as to allow perf record and other tools to do the right thing). However,
doing so kernel-side will cause old perf binaries to not function (e.g.
hitting the issue solved by 00e727bb389359c8), and thus commits the
cardinal sin of breaking (existing) userspace.

To address this chicken-and-egg problem, this patch adds support got a
new file, cpus, which is largely identical to the existing cpumask file.
A kernel can expose this file, knowing that new perf binaries will
correctly support it, while old perf binaries will not look for it (and
thus will not be broken).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel at vger.kernel.org
---
 tools/perf/util/pmu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ddb0261..2babcdf 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -445,14 +445,23 @@ static struct cpu_map *pmu_cpumask(const char *name)
 	FILE *file;
 	struct cpu_map *cpus;
 	const char *sysfs = sysfs__mountpoint();
+	const char *templates[] = {
+		 "%s/bus/event_source/devices/%s/cpumask",
+		 "%s/bus/event_source/devices/%s/cpus",
+		 NULL
+	};
+	const char **template;
 
 	if (!sysfs)
 		return NULL;
 
-	snprintf(path, PATH_MAX,
-		 "%s/bus/event_source/devices/%s/cpumask", sysfs, name);
+	for (template = templates; *template; template++) {
+		snprintf(path, PATH_MAX, *template, sysfs, name);
+		if (stat(path, &st) == 0)
+			break;
+	}
 
-	if (stat(path, &st) < 0)
+	if (!*template)
 		return NULL;
 
 	file = fopen(path, "r");
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 10:38   ` Will Deacon
  -1 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-08 10:38 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, acme, alexander.shishkin, jolsa,
	mingo, peterz

On Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland wrote:
> I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> of CPUs.
> 
> Currently perf-record doesn't work for these PMUs, unless forced to use
> per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
> not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
> failing to mmap, as the evlist->cpus map contains a different set of CPUs from
> the evsel->cpus map populated from the cpumask. This is addressed by the
> penultimate patch in this series.
> 
> Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
> opening and reading events"), from version two of this series, perf-stat would
> behave erroneously in the presence of a cpumask file, blocking forever after
> the workload completed. While this is now fixed, existing binaries (e.g. those
> shipped by distributions) would be broken by the addition of a cpumask file
> kernel-side.
> 
> To cater for this, this series adds support for a new PMU sysfs file, named
> 'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
> covers. As old binaries will not look for this, this can be safely added to the
> kernel without risk of breakage.

On my x86 laptop, I have the following under /sys/bus/event_source/devices:

power/cpumask 0
uncore_arb/cpumask 0
uncore_imc/cpumask 0
uncore_cbox_0/cpumask 0
uncore_cbox_1/cpumask 0
cstate_core/cpumask 0-1
cstate_pkg/cpumask 0

Are you saying that, prior to 00e727bb3893, perf stat blocks forever on
those PMUs? If so, wouldn't we need to rename all those files too?

I don't want to be the odd duck kernel-side, because of a bug that's
been fixed in userspace.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-08 10:38   ` Will Deacon
  0 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-08 10:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland wrote:
> I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> of CPUs.
> 
> Currently perf-record doesn't work for these PMUs, unless forced to use
> per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
> not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
> failing to mmap, as the evlist->cpus map contains a different set of CPUs from
> the evsel->cpus map populated from the cpumask. This is addressed by the
> penultimate patch in this series.
> 
> Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
> opening and reading events"), from version two of this series, perf-stat would
> behave erroneously in the presence of a cpumask file, blocking forever after
> the workload completed. While this is now fixed, existing binaries (e.g. those
> shipped by distributions) would be broken by the addition of a cpumask file
> kernel-side.
> 
> To cater for this, this series adds support for a new PMU sysfs file, named
> 'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
> covers. As old binaries will not look for this, this can be safely added to the
> kernel without risk of breakage.

On my x86 laptop, I have the following under /sys/bus/event_source/devices:

power/cpumask 0
uncore_arb/cpumask 0
uncore_imc/cpumask 0
uncore_cbox_0/cpumask 0
uncore_cbox_1/cpumask 0
cstate_core/cpumask 0-1
cstate_pkg/cpumask 0

Are you saying that, prior to 00e727bb3893, perf stat blocks forever on
those PMUs? If so, wouldn't we need to rename all those files too?

I don't want to be the odd duck kernel-side, because of a bug that's
been fixed in userspace.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
  2016-09-08 10:38   ` Will Deacon
@ 2016-09-08 11:29     ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 11:29 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, linux-kernel, acme, alexander.shishkin, jolsa,
	mingo, peterz

On Thu, Sep 08, 2016 at 11:38:49AM +0100, Will Deacon wrote:
> On Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland wrote:
> > Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
> > opening and reading events"), from version two of this series, perf-stat would
> > behave erroneously in the presence of a cpumask file, blocking forever after
> > the workload completed. While this is now fixed, existing binaries (e.g. those
> > shipped by distributions) would be broken by the addition of a cpumask file
> > kernel-side.
> > 
> > To cater for this, this series adds support for a new PMU sysfs file, named
> > 'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
> > covers. As old binaries will not look for this, this can be safely added to the
> > kernel without risk of breakage.
> 
> On my x86 laptop, I have the following under /sys/bus/event_source/devices:
> 
> power/cpumask 0
> uncore_arb/cpumask 0
> uncore_imc/cpumask 0
> uncore_cbox_0/cpumask 0
> uncore_cbox_1/cpumask 0
> cstate_core/cpumask 0-1
> cstate_pkg/cpumask 0
> 
> Are you saying that, prior to 00e727bb3893, perf stat blocks forever on
> those PMUs? If so, wouldn't we need to rename all those files too?

No. I should have better described the issue above.

The issue fixed by commit 00e727bb3893 only affects task-bound events,
when perf-stat opens a counter per-thread.

All of the above are uncore PMUs, and only support cpu-bound events.
Thus they are not affected.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-08 11:29     ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-08 11:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2016 at 11:38:49AM +0100, Will Deacon wrote:
> On Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland wrote:
> > Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
> > opening and reading events"), from version two of this series, perf-stat would
> > behave erroneously in the presence of a cpumask file, blocking forever after
> > the workload completed. While this is now fixed, existing binaries (e.g. those
> > shipped by distributions) would be broken by the addition of a cpumask file
> > kernel-side.
> > 
> > To cater for this, this series adds support for a new PMU sysfs file, named
> > 'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
> > covers. As old binaries will not look for this, this can be safely added to the
> > kernel without risk of breakage.
> 
> On my x86 laptop, I have the following under /sys/bus/event_source/devices:
> 
> power/cpumask 0
> uncore_arb/cpumask 0
> uncore_imc/cpumask 0
> uncore_cbox_0/cpumask 0
> uncore_cbox_1/cpumask 0
> cstate_core/cpumask 0-1
> cstate_pkg/cpumask 0
> 
> Are you saying that, prior to 00e727bb3893, perf stat blocks forever on
> those PMUs? If so, wouldn't we need to rename all those files too?

No. I should have better described the issue above.

The issue fixed by commit 00e727bb3893 only affects task-bound events,
when perf-stat opens a counter per-thread.

All of the above are uncore PMUs, and only support cpu-bound events.
Thus they are not affected.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
  2016-09-08 10:21 ` Mark Rutland
@ 2016-09-08 16:25   ` Arnaldo Carvalho de Melo
  -1 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-08 16:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, alexander.shishkin, jolsa, mingo,
	peterz, will.deacon

Em Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland escreveu:
> Hi,
> 
> I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> of CPUs.

So I added 6/7 and 7/7 to my local perf/core branch, I think they can go
before the others, Peter, do you want me to take the kernel parts as
well?

- Arnaldo
 
> Currently perf-record doesn't work for these PMUs, unless forced to use
> per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
> not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
> failing to mmap, as the evlist->cpus map contains a different set of CPUs from
> the evsel->cpus map populated from the cpumask. This is addressed by the
> penultimate patch in this series.
> 
> Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
> opening and reading events"), from version two of this series, perf-stat would
> behave erroneously in the presence of a cpumask file, blocking forever after
> the workload completed. While this is now fixed, existing binaries (e.g. those
> shipped by distributions) would be broken by the addition of a cpumask file
> kernel-side.
> 
> To cater for this, this series adds support for a new PMU sysfs file, named
> 'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
> covers. As old binaries will not look for this, this can be safely added to the
> kernel without risk of breakage.
> 
> I've included the kernel and userspace parts in this series as they've proven
> difficult to review in isolation.
> 
> Thanks,
> Mark.
> 
> Since v1 [1]:
> * Avoid double cpu_map__idx() call in perf_evlist__mmap_per_evsel
> * Look for a supported_cpumask file when a cpumask file is not present
> 
> Since v2 [2]:
> * Drop patches which have been picked up from v2
> * Rebase to v4.8-rc1
> * Better describe the issue in the supported_cpumask patch
> 
> Since v3 [3]:
> * Prepend patches exporting the cpus file for ARM PMUs
> * s/supported_cpus/cpus/
> 
> [1] http://lkml.kernel.org/r/1467907474-3290-1-git-send-email-mark.rutland@arm.com
> [2] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland@arm.com
> [3] http://lkml.kernel.org/r/1470933366-1364-1-git-send-email-mark.rutland@arm.com
> 
> Mark Rutland (7):
>   drivers/perf: arm_pmu: add common attr group fields
>   arm64: perf: move to common attr_group fields
>   arm: perf: move to common attr_group fields
>   drivers/perf: arm_pmu: only use common attr_groups
>   drivers/perf: arm_pmu: expose a cpumask in sysfs
>   perf: util: only open events on CPUs an evsel permits
>   perf: util: support alternative sysfs cpumask
> 
>  arch/arm/kernel/perf_event_v7.c | 47 ++++++++++++++++++++++++-----------------
>  arch/arm64/kernel/perf_event.c  | 36 ++++++++++++++++++++-----------
>  drivers/perf/arm_pmu.c          | 23 ++++++++++++++++++++
>  include/linux/perf/arm_pmu.h    | 10 ++++++++-
>  tools/perf/util/evlist.c        |  8 ++++++-
>  tools/perf/util/pmu.c           | 15 ++++++++++---
>  6 files changed, 103 insertions(+), 36 deletions(-)
> 
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-08 16:25   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-08 16:25 UTC (permalink / raw)
  To: linux-arm-kernel

Em Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland escreveu:
> Hi,
> 
> I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> of CPUs.

So I added 6/7 and 7/7 to my local perf/core branch, I think they can go
before the others, Peter, do you want me to take the kernel parts as
well?

- Arnaldo
 
> Currently perf-record doesn't work for these PMUs, unless forced to use
> per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
> not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
> failing to mmap, as the evlist->cpus map contains a different set of CPUs from
> the evsel->cpus map populated from the cpumask. This is addressed by the
> penultimate patch in this series.
> 
> Complicating matters, prior to commit 00e727bb389359c8 ("perf stat: Balance
> opening and reading events"), from version two of this series, perf-stat would
> behave erroneously in the presence of a cpumask file, blocking forever after
> the workload completed. While this is now fixed, existing binaries (e.g. those
> shipped by distributions) would be broken by the addition of a cpumask file
> kernel-side.
> 
> To cater for this, this series adds support for a new PMU sysfs file, named
> 'cpus' rather than 'cpumask', listing a number of CPUs that a logical PMU
> covers. As old binaries will not look for this, this can be safely added to the
> kernel without risk of breakage.
> 
> I've included the kernel and userspace parts in this series as they've proven
> difficult to review in isolation.
> 
> Thanks,
> Mark.
> 
> Since v1 [1]:
> * Avoid double cpu_map__idx() call in perf_evlist__mmap_per_evsel
> * Look for a supported_cpumask file when a cpumask file is not present
> 
> Since v2 [2]:
> * Drop patches which have been picked up from v2
> * Rebase to v4.8-rc1
> * Better describe the issue in the supported_cpumask patch
> 
> Since v3 [3]:
> * Prepend patches exporting the cpus file for ARM PMUs
> * s/supported_cpus/cpus/
> 
> [1] http://lkml.kernel.org/r/1467907474-3290-1-git-send-email-mark.rutland at arm.com
> [2] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland at arm.com
> [3] http://lkml.kernel.org/r/1470933366-1364-1-git-send-email-mark.rutland at arm.com
> 
> Mark Rutland (7):
>   drivers/perf: arm_pmu: add common attr group fields
>   arm64: perf: move to common attr_group fields
>   arm: perf: move to common attr_group fields
>   drivers/perf: arm_pmu: only use common attr_groups
>   drivers/perf: arm_pmu: expose a cpumask in sysfs
>   perf: util: only open events on CPUs an evsel permits
>   perf: util: support alternative sysfs cpumask
> 
>  arch/arm/kernel/perf_event_v7.c | 47 ++++++++++++++++++++++++-----------------
>  arch/arm64/kernel/perf_event.c  | 36 ++++++++++++++++++++-----------
>  drivers/perf/arm_pmu.c          | 23 ++++++++++++++++++++
>  include/linux/perf/arm_pmu.h    | 10 ++++++++-
>  tools/perf/util/evlist.c        |  8 ++++++-
>  tools/perf/util/pmu.c           | 15 ++++++++++---
>  6 files changed, 103 insertions(+), 36 deletions(-)
> 
> -- 
> 1.9.1

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
  2016-09-08 16:25   ` Arnaldo Carvalho de Melo
@ 2016-09-08 18:16     ` Peter Zijlstra
  -1 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2016-09-08 18:16 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Mark Rutland, linux-arm-kernel, linux-kernel, alexander.shishkin,
	jolsa, mingo, will.deacon

On Thu, Sep 08, 2016 at 01:25:02PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland escreveu:
> > Hi,
> > 
> > I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> > (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> > of CPUs.
> 
> So I added 6/7 and 7/7 to my local perf/core branch, I think they can go
> before the others, Peter, do you want me to take the kernel parts as
> well?

arm pmu stuff usually goes through the arm tree. Up to Mark I suppose.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-08 18:16     ` Peter Zijlstra
  0 siblings, 0 replies; 36+ messages in thread
From: Peter Zijlstra @ 2016-09-08 18:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2016 at 01:25:02PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland escreveu:
> > Hi,
> > 
> > I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> > (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> > of CPUs.
> 
> So I added 6/7 and 7/7 to my local perf/core branch, I think they can go
> before the others, Peter, do you want me to take the kernel parts as
> well?

arm pmu stuff usually goes through the arm tree. Up to Mark I suppose.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [tip:perf/core] perf evlist: Only open events on CPUs an evsel permits
  2016-09-08 10:21   ` Mark Rutland
  (?)
@ 2016-09-09  5:53   ` tip-bot for Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: tip-bot for Mark Rutland @ 2016-09-09  5:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, mark.rutland, jolsa, acme, will.deacon,
	alexander.shishkin, hpa, tglx, mingo, linux-kernel

Commit-ID:  9f21b815be863218192f42f9f5bf78b75f8738e0
Gitweb:     http://git.kernel.org/tip/9f21b815be863218192f42f9f5bf78b75f8738e0
Author:     Mark Rutland <mark.rutland@arm.com>
AuthorDate: Thu, 8 Sep 2016 11:21:51 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 8 Sep 2016 13:44:06 -0300

perf evlist: Only open events on CPUs an evsel permits

In systems with heterogeneous CPU PMUs, it's possible for each evsel to
cover a distinct set of CPUs, and hence the cpu_map associated with each
evsel may have a distinct idx<->id mapping. Any of these may be distinct
from the evlist's cpu map.

Events can be tied to the same fd so long as they use the same per-cpu
ringbuffer (i.e. so long as they are on the same CPU). To acquire the
correct FDs, we must compare the Linux logical IDs rather than the evsel
or evlist indices.

This path adds logic to perf_evlist__mmap_per_evsel to handle this,
translating IDs as required. As PMUs may cover a subset of CPUs from the
evlist, we skip the CPUs a PMU cannot handle.

Without this patch, perf record may try to mmap erroneous FDs on
heterogeneous systems, and will bail out early rather than running the
workload.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1473330112-28528-7-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 097b3ed..ea34c5a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1032,16 +1032,18 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
+				       struct mmap_params *mp, int cpu_idx,
 				       int thread, int *_output, int *_output_backward)
 {
 	struct perf_evsel *evsel;
 	int revent;
+	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each_entry(evlist, evsel) {
 		struct perf_mmap *maps = evlist->mmap;
 		int *output = _output;
 		int fd;
+		int cpu;
 
 		if (evsel->attr.write_backward) {
 			output = _output_backward;
@@ -1060,6 +1062,10 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		if (evsel->system_wide && thread)
 			continue;
 
+		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);
+		if (cpu == -1)
+			continue;
+
 		fd = FD(evsel, cpu, thread);
 
 		if (*output == -1) {

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [tip:perf/core] perf pmu: Support alternative sysfs cpumask
  2016-09-08 10:21   ` Mark Rutland
  (?)
@ 2016-09-09  5:54   ` tip-bot for Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: tip-bot for Mark Rutland @ 2016-09-09  5:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, mingo, tglx, hpa, jolsa, alexander.shishkin, acme,
	will.deacon, linux-kernel, mark.rutland

Commit-ID:  7e3fcffe955440101493cd8f32f75840ddf87b6f
Gitweb:     http://git.kernel.org/tip/7e3fcffe955440101493cd8f32f75840ddf87b6f
Author:     Mark Rutland <mark.rutland@arm.com>
AuthorDate: Thu, 8 Sep 2016 11:21:52 +0100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 8 Sep 2016 13:44:06 -0300

perf pmu: Support alternative sysfs cpumask

The perf tools can read a cpumask file for a PMU, describing a subset of
CPUs which that PMU covers. So far this has only been used to cater for
uncore PMUs, which in practice happen to only have a single CPU
described in the mask.

Until recently, the perf tools only correctly handled cpumask containing
a single CPU, and only when monitoring in system-wide mode. For example,
prior to commit 00e727bb389359c8 ("perf stat: Balance opening and
reading events"), a mask with more than a single CPU could cause perf
stat to hang. When a CPU PMU covers a subset of CPUs, but lacks a
cpumask, perf record will fail to open events (on the cores the PMU does
not support), and gives up.

For systems with heterogeneous CPUs such as ARM big.LITTLE systems, this
presents a problem. We have a PMU for each microarchitecture (e.g. a big
PMU and a little PMU), and would like to expose a cpumask for each (so
as to allow perf record and other tools to do the right thing). However,
doing so kernel-side will cause old perf binaries to not function (e.g.
hitting the issue solved by 00e727bb389359c8), and thus commits the
cardinal sin of breaking (existing) userspace.

To address this chicken-and-egg problem, this patch adds support got a
new file, cpus, which is largely identical to the existing cpumask file.
A kernel can expose this file, knowing that new perf binaries will
correctly support it, while old perf binaries will not look for it (and
thus will not be broken).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1473330112-28528-8-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/pmu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ddb0261..2babcdf 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -445,14 +445,23 @@ static struct cpu_map *pmu_cpumask(const char *name)
 	FILE *file;
 	struct cpu_map *cpus;
 	const char *sysfs = sysfs__mountpoint();
+	const char *templates[] = {
+		 "%s/bus/event_source/devices/%s/cpumask",
+		 "%s/bus/event_source/devices/%s/cpus",
+		 NULL
+	};
+	const char **template;
 
 	if (!sysfs)
 		return NULL;
 
-	snprintf(path, PATH_MAX,
-		 "%s/bus/event_source/devices/%s/cpumask", sysfs, name);
+	for (template = templates; *template; template++) {
+		snprintf(path, PATH_MAX, *template, sysfs, name);
+		if (stat(path, &st) == 0)
+			break;
+	}
 
-	if (stat(path, &st) < 0)
+	if (!*template)
 		return NULL;
 
 	file = fopen(path, "r");

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
  2016-09-08 18:16     ` Peter Zijlstra
@ 2016-09-09  9:31       ` Will Deacon
  -1 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-09  9:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arnaldo Carvalho de Melo, Mark Rutland, linux-arm-kernel,
	linux-kernel, alexander.shishkin, jolsa, mingo

On Thu, Sep 08, 2016 at 08:16:57PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 08, 2016 at 01:25:02PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland escreveu:
> > > Hi,
> > > 
> > > I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> > > (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> > > of CPUs.
> > 
> > So I added 6/7 and 7/7 to my local perf/core branch, I think they can go
> > before the others, Peter, do you want me to take the kernel parts as
> > well?
> 
> arm pmu stuff usually goes through the arm tree. Up to Mark I suppose.

I can queue the kernel bits once I've reviewed them (will try to take a
look today).

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 0/7]  arm_pmu/perf tools: play nicely with CPU PMU cpumasks
@ 2016-09-09  9:31       ` Will Deacon
  0 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-09  9:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2016 at 08:16:57PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 08, 2016 at 01:25:02PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Sep 08, 2016 at 11:21:45AM +0100, Mark Rutland escreveu:
> > > Hi,
> > > 
> > > I'm trying to make the perf tool play better with PMUs in heterogeneous systems
> > > (e.g. big.LITTLE), where there are several logical PMUs, each covering a subset
> > > of CPUs.
> > 
> > So I added 6/7 and 7/7 to my local perf/core branch, I think they can go
> > before the others, Peter, do you want me to take the kernel parts as
> > well?
> 
> arm pmu stuff usually goes through the arm tree. Up to Mark I suppose.

I can queue the kernel bits once I've reviewed them (will try to take a
look today).

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs
  2016-09-08 10:21   ` Mark Rutland
@ 2016-09-09 10:24     ` Will Deacon
  -1 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-09 10:24 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, acme, alexander.shishkin, jolsa,
	mingo, peterz

On Thu, Sep 08, 2016 at 11:21:50AM +0100, Mark Rutland wrote:
> In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
> each of which covers a subset of CPUs in the system. In some cases
> userspace needs to know which CPUs a given logical PMU covers, so we'd
> like to expose a cpumask under sysfs, similar to what is done for uncore
> PMUs.
> 
> Unfortunately, prior to commit 00e727bb389359c8 ("perf stat: Balance
> opening and reading events"), perf stat only correctly handled a cpumask
> holding a single CPU, and only when profiling in system-wide mode. In
> other cases, the presence of a cpumask file could cause perf stat to
> behave erratically.
> 
> Thus, exposing a cpumask file would break older perf binaries in cases
> where they would otherwise work.
> 
> To avoid this issue while still providing userspace with the information
> it needs, this patch exposes a differently-named file (cpus) under
> sysfs. New tools can look for this and operate correctly, while older
> tools will not be adversely affected by its presence.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/perf/arm_pmu.c       | 21 +++++++++++++++++++++
>  include/linux/perf/arm_pmu.h |  1 +
>  2 files changed, 22 insertions(+)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index ac83e1e..09e9944 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -534,6 +534,25 @@ static int armpmu_filter_match(struct perf_event *event)
>  	return cpumask_test_cpu(cpu, &armpmu->supported_cpus);
>  }
>  
> +static ssize_t armpmu_cpumask_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
> +	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> +}
> +
> +static struct device_attribute armpmu_cpumask_attr =
> +		__ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);

You can use the DEVICE_ATTR macro for this.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs
@ 2016-09-09 10:24     ` Will Deacon
  0 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-09 10:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2016 at 11:21:50AM +0100, Mark Rutland wrote:
> In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
> each of which covers a subset of CPUs in the system. In some cases
> userspace needs to know which CPUs a given logical PMU covers, so we'd
> like to expose a cpumask under sysfs, similar to what is done for uncore
> PMUs.
> 
> Unfortunately, prior to commit 00e727bb389359c8 ("perf stat: Balance
> opening and reading events"), perf stat only correctly handled a cpumask
> holding a single CPU, and only when profiling in system-wide mode. In
> other cases, the presence of a cpumask file could cause perf stat to
> behave erratically.
> 
> Thus, exposing a cpumask file would break older perf binaries in cases
> where they would otherwise work.
> 
> To avoid this issue while still providing userspace with the information
> it needs, this patch exposes a differently-named file (cpus) under
> sysfs. New tools can look for this and operate correctly, while older
> tools will not be adversely affected by its presence.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/perf/arm_pmu.c       | 21 +++++++++++++++++++++
>  include/linux/perf/arm_pmu.h |  1 +
>  2 files changed, 22 insertions(+)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index ac83e1e..09e9944 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -534,6 +534,25 @@ static int armpmu_filter_match(struct perf_event *event)
>  	return cpumask_test_cpu(cpu, &armpmu->supported_cpus);
>  }
>  
> +static ssize_t armpmu_cpumask_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
> +	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> +}
> +
> +static struct device_attribute armpmu_cpumask_attr =
> +		__ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);

You can use the DEVICE_ATTR macro for this.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields
  2016-09-08 10:21   ` Mark Rutland
@ 2016-09-09 10:25     ` Will Deacon
  -1 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-09 10:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, acme, alexander.shishkin, jolsa,
	mingo, peterz

On Thu, Sep 08, 2016 at 11:21:46AM +0100, Mark Rutland wrote:
> In preparation for adding common attribute groups, add an array of
> attribute group pointers to arm_pmu, which will be used if the
> backend hasn't already set pmu::attr_groups.
> 
> Subsequent patches will move backends over to using these, before adding
> common fields.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/perf/arm_pmu.c       | 3 +++
>  include/linux/perf/arm_pmu.h | 9 ++++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index f5e1008..145caf4 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -1039,6 +1039,9 @@ int arm_pmu_device_probe(struct platform_device *pdev,
>  		goto out_free;
>  	}
>  
> +	if (!pmu->pmu.attr_groups)
> +		pmu->pmu.attr_groups = pmu->attr_groups;
> +
>  	ret = cpu_pmu_init(pmu);
>  	if (ret)
>  		goto out_free;
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index e188438..8030814 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -14,7 +14,7 @@
>  
>  #include <linux/interrupt.h>
>  #include <linux/perf_event.h>
> -
> +#include <linux/sysfs.h>
>  #include <asm/cputype.h>
>  
>  /*
> @@ -77,6 +77,12 @@ struct pmu_hw_events {
>  	struct arm_pmu		*percpu_pmu;
>  };
>  
> +enum armpmu_attr_groups {
> +	ARMPMU_ATTR_GROUP_EVENTS,
> +	ARMPMU_ATTR_GROUP_FORMATS,
> +	ARMPMU_NR_ATTR_GROUPS
> +};
> +
>  struct arm_pmu {
>  	struct pmu	pmu;
>  	cpumask_t	active_irqs;
> @@ -111,6 +117,7 @@ struct arm_pmu {
>  	struct pmu_hw_events	__percpu *hw_events;
>  	struct list_head	entry;
>  	struct notifier_block	cpu_pm_nb;
> +	const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];

Is the '+ 1' because the array has to be NULL terminated? Probably worth
a comment to disuade people from sending "obvious" fixes.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields
@ 2016-09-09 10:25     ` Will Deacon
  0 siblings, 0 replies; 36+ messages in thread
From: Will Deacon @ 2016-09-09 10:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2016 at 11:21:46AM +0100, Mark Rutland wrote:
> In preparation for adding common attribute groups, add an array of
> attribute group pointers to arm_pmu, which will be used if the
> backend hasn't already set pmu::attr_groups.
> 
> Subsequent patches will move backends over to using these, before adding
> common fields.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  drivers/perf/arm_pmu.c       | 3 +++
>  include/linux/perf/arm_pmu.h | 9 ++++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index f5e1008..145caf4 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -1039,6 +1039,9 @@ int arm_pmu_device_probe(struct platform_device *pdev,
>  		goto out_free;
>  	}
>  
> +	if (!pmu->pmu.attr_groups)
> +		pmu->pmu.attr_groups = pmu->attr_groups;
> +
>  	ret = cpu_pmu_init(pmu);
>  	if (ret)
>  		goto out_free;
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index e188438..8030814 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -14,7 +14,7 @@
>  
>  #include <linux/interrupt.h>
>  #include <linux/perf_event.h>
> -
> +#include <linux/sysfs.h>
>  #include <asm/cputype.h>
>  
>  /*
> @@ -77,6 +77,12 @@ struct pmu_hw_events {
>  	struct arm_pmu		*percpu_pmu;
>  };
>  
> +enum armpmu_attr_groups {
> +	ARMPMU_ATTR_GROUP_EVENTS,
> +	ARMPMU_ATTR_GROUP_FORMATS,
> +	ARMPMU_NR_ATTR_GROUPS
> +};
> +
>  struct arm_pmu {
>  	struct pmu	pmu;
>  	cpumask_t	active_irqs;
> @@ -111,6 +117,7 @@ struct arm_pmu {
>  	struct pmu_hw_events	__percpu *hw_events;
>  	struct list_head	entry;
>  	struct notifier_block	cpu_pm_nb;
> +	const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];

Is the '+ 1' because the array has to be NULL terminated? Probably worth
a comment to disuade people from sending "obvious" fixes.

Will

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs
  2016-09-09 10:24     ` Will Deacon
@ 2016-09-09 11:04       ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-09 11:04 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, linux-kernel, acme, alexander.shishkin, jolsa,
	mingo, peterz

On Fri, Sep 09, 2016 at 11:24:43AM +0100, Will Deacon wrote:
> On Thu, Sep 08, 2016 at 11:21:50AM +0100, Mark Rutland wrote:
> > +static ssize_t armpmu_cpumask_show(struct device *dev,
> > +				   struct device_attribute *attr, char *buf)
> > +{
> > +	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
> > +	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> > +}
> > +
> > +static struct device_attribute armpmu_cpumask_attr =
> > +		__ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);
> 
> You can use the DEVICE_ATTR macro for this.

Ok. I've made use of this locally.

I'll send an updated version of the patches shortly, once I've given
this a spin on HW 

Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs
@ 2016-09-09 11:04       ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-09 11:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 09, 2016 at 11:24:43AM +0100, Will Deacon wrote:
> On Thu, Sep 08, 2016 at 11:21:50AM +0100, Mark Rutland wrote:
> > +static ssize_t armpmu_cpumask_show(struct device *dev,
> > +				   struct device_attribute *attr, char *buf)
> > +{
> > +	struct arm_pmu *armpmu = to_arm_pmu(dev_get_drvdata(dev));
> > +	return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
> > +}
> > +
> > +static struct device_attribute armpmu_cpumask_attr =
> > +		__ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);
> 
> You can use the DEVICE_ATTR macro for this.

Ok. I've made use of this locally.

I'll send an updated version of the patches shortly, once I've given
this a spin on HW 

Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields
  2016-09-09 10:25     ` Will Deacon
@ 2016-09-09 11:05       ` Mark Rutland
  -1 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-09 11:05 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, linux-kernel, acme, alexander.shishkin, jolsa,
	mingo, peterz

On Fri, Sep 09, 2016 at 11:25:54AM +0100, Will Deacon wrote:
> On Thu, Sep 08, 2016 at 11:21:46AM +0100, Mark Rutland wrote:
> > +enum armpmu_attr_groups {
> > +	ARMPMU_ATTR_GROUP_EVENTS,
> > +	ARMPMU_ATTR_GROUP_FORMATS,
> > +	ARMPMU_NR_ATTR_GROUPS
> > +};
> > +
> >  struct arm_pmu {
> >  	struct pmu	pmu;
> >  	cpumask_t	active_irqs;
> > @@ -111,6 +117,7 @@ struct arm_pmu {
> >  	struct pmu_hw_events	__percpu *hw_events;
> >  	struct list_head	entry;
> >  	struct notifier_block	cpu_pm_nb;
> > +	const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];
> 
> Is the '+ 1' because the array has to be NULL terminated?

Yes.

> Probably worth a comment to disuade people from sending "obvious"
> fixes.

Sure. I had meant to do this already, but forgot. :/

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields
@ 2016-09-09 11:05       ` Mark Rutland
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2016-09-09 11:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 09, 2016 at 11:25:54AM +0100, Will Deacon wrote:
> On Thu, Sep 08, 2016 at 11:21:46AM +0100, Mark Rutland wrote:
> > +enum armpmu_attr_groups {
> > +	ARMPMU_ATTR_GROUP_EVENTS,
> > +	ARMPMU_ATTR_GROUP_FORMATS,
> > +	ARMPMU_NR_ATTR_GROUPS
> > +};
> > +
> >  struct arm_pmu {
> >  	struct pmu	pmu;
> >  	cpumask_t	active_irqs;
> > @@ -111,6 +117,7 @@ struct arm_pmu {
> >  	struct pmu_hw_events	__percpu *hw_events;
> >  	struct list_head	entry;
> >  	struct notifier_block	cpu_pm_nb;
> > +	const struct attribute_group *attr_groups[ARMPMU_NR_ATTR_GROUPS + 1];
> 
> Is the '+ 1' because the array has to be NULL terminated?

Yes.

> Probably worth a comment to disuade people from sending "obvious"
> fixes.

Sure. I had meant to do this already, but forgot. :/

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2016-09-09 11:06 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-08 10:21 [RFCv4 0/7] arm_pmu/perf tools: play nicely with CPU PMU cpumasks Mark Rutland
2016-09-08 10:21 ` Mark Rutland
2016-09-08 10:21 ` [RFCv4 1/7] drivers/perf: arm_pmu: add common attr group fields Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-09 10:25   ` Will Deacon
2016-09-09 10:25     ` Will Deacon
2016-09-09 11:05     ` Mark Rutland
2016-09-09 11:05       ` Mark Rutland
2016-09-08 10:21 ` [RFCv4 2/7] arm64: perf: move to common attr_group fields Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-08 10:21 ` [RFCv4 3/7] arm: " Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-08 10:21 ` [RFCv4 4/7] drivers/perf: arm_pmu: only use common attr_groups Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-08 10:21 ` [RFCv4 5/7] drivers/perf: arm_pmu: expose a cpumask in sysfs Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-09 10:24   ` Will Deacon
2016-09-09 10:24     ` Will Deacon
2016-09-09 11:04     ` Mark Rutland
2016-09-09 11:04       ` Mark Rutland
2016-09-08 10:21 ` [RFCv4 6/7] perf: util: only open events on CPUs an evsel permits Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-09  5:53   ` [tip:perf/core] perf evlist: Only " tip-bot for Mark Rutland
2016-09-08 10:21 ` [RFCv4 7/7] perf: util: support alternative sysfs cpumask Mark Rutland
2016-09-08 10:21   ` Mark Rutland
2016-09-09  5:54   ` [tip:perf/core] perf pmu: Support " tip-bot for Mark Rutland
2016-09-08 10:38 ` [RFCv4 0/7] arm_pmu/perf tools: play nicely with CPU PMU cpumasks Will Deacon
2016-09-08 10:38   ` Will Deacon
2016-09-08 11:29   ` Mark Rutland
2016-09-08 11:29     ` Mark Rutland
2016-09-08 16:25 ` Arnaldo Carvalho de Melo
2016-09-08 16:25   ` Arnaldo Carvalho de Melo
2016-09-08 18:16   ` Peter Zijlstra
2016-09-08 18:16     ` Peter Zijlstra
2016-09-09  9:31     ` Will Deacon
2016-09-09  9:31       ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.