All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFCv3 0/2] perf tools: play nicely with CPU PMU cpumasks
@ 2016-08-11 16:36 Mark Rutland
  2016-08-11 16:36 ` [RFCv3 1/2] perf: util: only open events on CPUs an evsel permits Mark Rutland
  2016-08-11 16:36 ` [RFCv3 2/2] perf: util: support sysfs supported_cpumask file Mark Rutland
  0 siblings, 2 replies; 5+ messages in thread
From: Mark Rutland @ 2016-08-11 16:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	mark.rutland, mingo, peterz, wangnan0

Hi,

I'm trying to make the perf tool play better with PMUs in heterogeneous systems
(e.g. big.LITTLE). These patches fix some brokenness that exists today, but
they require the addition of a cpumask file to each CPU PMU sysfs directory,
and this happens to break existing perf-stat binaries. Due to this, I have not
yet added a cpumask attribute to the ARM PMU code.

In these system we have separate logical PMUs for discrete sets of CPUs. For
example, on an ARM Juno system we have a logical PMU for all Cortex-A53 CPUs,
and a logical PMU for all the Cortex-A57 CPUs. The logical PMUs allow
task-bound events, but reject CPU-bound events for CPUs they do not cover.

Currently perf-record doesn't work for these PMUs, unless forced to use
per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
failing to mmap, as the evlist->cpus map contains a different set of CPUs from
the evsel->cpus map populated from the cpumask.

Prior to commit 00e727bb389359c8 ("perf stat: Balance opening and reading
events"), from version two of this series, perf-stat would behave erroneously
in the presence of a cpumask file, blocking forever after the workload
completed. While this is now fixed, existing binaries (e.g. those shipped by
distributions) would be broken by the addition of a cpumask file kernel-side.

To cater for this, this series adds support for a new PMU sysfs file,
supported_cpus, listing a number of CPUs that a logical PMU covers. As old
binaries will not look for this, this can be safely added to the kernel without
risk of breakage.

Does using a sysfs cpumask to handle (heterogeneous) CPU PMUs feel like the
right approach? Is there perhaps a better way to do this without using a
cpumask?

Does it make sense to have a differently-named cpumask file that only new tools
will look at?

Thanks,
Mark.

Since v1 [1]:
* Avoid double cpu_map__idx() call in perf_evlist__mmap_per_evsel
* Look for a supported_cpumask file when a cpumask file is not present

Since v2 [2]:
* Drop patches which have been picked up from v2
* Rebase to v4.8-rc1
* Better describe the issue in the supported_cpumask patch

[1] http://lkml.kernel.org/r/1467907474-3290-1-git-send-email-mark.rutland@arm.com
[2] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland@arm.com

Mark Rutland (2):
  perf: util: only open events on CPUs an evsel permits
  perf: util: support sysfs supported_cpumask file

 tools/perf/util/evlist.c |  8 +++++++-
 tools/perf/util/pmu.c    | 15 ++++++++++++---
 2 files changed, 19 insertions(+), 4 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFCv3 1/2] perf: util: only open events on CPUs an evsel permits
  2016-08-11 16:36 [RFCv3 0/2] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
@ 2016-08-11 16:36 ` Mark Rutland
  2016-08-11 16:36 ` [RFCv3 2/2] perf: util: support sysfs supported_cpumask file Mark Rutland
  1 sibling, 0 replies; 5+ messages in thread
From: Mark Rutland @ 2016-08-11 16:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	mark.rutland, mingo, peterz, wangnan0

In systems with heterogeneous CPU PMUs, it's possible for each evsel to
cover a distinct set of CPUs, and hence the cpu_map associated with each
evsel may have a distinct idx<->id mapping. Any of these may be distinct from
the evlist's cpu map.

Events can be tied to the same fd so long as they use the same per-cpu
ringbuffer (i.e. so long as they are on the same CPU). To acquire the
correct FDs, we must compare the Linux logical IDs rather than the evsel
or evlist indices.

This path adds logic to perf_evlist__mmap_per_evsel to handle this,
translating IDs as required. As PMUs may cover a subset of CPUs from the
evlist, we skip the CPUs a PMU cannot handle.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/util/evlist.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 097b3ed..ea34c5a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1032,16 +1032,18 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
+				       struct mmap_params *mp, int cpu_idx,
 				       int thread, int *_output, int *_output_backward)
 {
 	struct perf_evsel *evsel;
 	int revent;
+	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each_entry(evlist, evsel) {
 		struct perf_mmap *maps = evlist->mmap;
 		int *output = _output;
 		int fd;
+		int cpu;
 
 		if (evsel->attr.write_backward) {
 			output = _output_backward;
@@ -1060,6 +1062,10 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		if (evsel->system_wide && thread)
 			continue;
 
+		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);
+		if (cpu == -1)
+			continue;
+
 		fd = FD(evsel, cpu, thread);
 
 		if (*output == -1) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFCv3 2/2] perf: util: support sysfs supported_cpumask file
  2016-08-11 16:36 [RFCv3 0/2] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
  2016-08-11 16:36 ` [RFCv3 1/2] perf: util: only open events on CPUs an evsel permits Mark Rutland
@ 2016-08-11 16:36 ` Mark Rutland
  2016-08-12 10:01   ` Jiri Olsa
  1 sibling, 1 reply; 5+ messages in thread
From: Mark Rutland @ 2016-08-11 16:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	mark.rutland, mingo, peterz, wangnan0

The perf tools can read a cpumask file for a PMU, describing a subset of
CPUs which that PMU covers. So far this has only been used to cater for
uncore PMUs, which in practice happen to only have a single CPU
described in the mask.

Until recently, the perf tools only correctly handled cpumask containing
a single CPU, and only when monitoring in system-wide mode. For example,
prior to commit 00e727bb389359c8 ("perf stat: Balance opening and
reading events"), a mask with more than a single CPU could cause
perf stat to hang. When a CPU PMU covers a subset of CPUs, but lacks a
cpumask, perf record will fail to open events (on the cores the PMU does
not support), and gives up.

For systems with heterogeneous CPUs such as ARM big.LITTLE systems, this
presents a problem. We have a PMU for each microarchitecture (e.g. a big
PMU and a little PMU), and would like to expose a cpumask for each (so
as to allow perf record and other tools to do the right thing). However,
doing so kernel-side will cause old perf binaries to not function (e.g.
hitting the issue solved by 00e727bb389359c8), and thus commits the
cardinal sin of breaking (existing) userspace.

To address this chicken-and-egg problem, this patch adds support got a
new file, supported_cpumask, which is largely identical to the existing
cpumask file. A kernel can expose this file, knowing that new perf
binaries will correctly support it, while old perf binaries will not
look for it (and thus will not be broken).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 tools/perf/util/pmu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ddb0261..ded0cb2 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -445,14 +445,23 @@ static struct cpu_map *pmu_cpumask(const char *name)
 	FILE *file;
 	struct cpu_map *cpus;
 	const char *sysfs = sysfs__mountpoint();
+	const char *templates[] = {
+		 "%s/bus/event_source/devices/%s/cpumask",
+		 "%s/bus/event_source/devices/%s/supported_cpumask",
+		 NULL
+	};
+	const char **template;
 
 	if (!sysfs)
 		return NULL;
 
-	snprintf(path, PATH_MAX,
-		 "%s/bus/event_source/devices/%s/cpumask", sysfs, name);
+	for (template = templates; *template; template++) {
+		snprintf(path, PATH_MAX, *template, sysfs, name);
+		if (stat(path, &st) == 0)
+			break;
+	}
 
-	if (stat(path, &st) < 0)
+	if (!*template)
 		return NULL;
 
 	file = fopen(path, "r");
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFCv3 2/2] perf: util: support sysfs supported_cpumask file
  2016-08-11 16:36 ` [RFCv3 2/2] perf: util: support sysfs supported_cpumask file Mark Rutland
@ 2016-08-12 10:01   ` Jiri Olsa
  2016-08-31 10:46     ` Mark Rutland
  0 siblings, 1 reply; 5+ messages in thread
From: Jiri Olsa @ 2016-08-12 10:01 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, acme, adrian.hunter, alexander.shishkin, hekuang,
	jolsa, mingo, peterz, wangnan0

On Thu, Aug 11, 2016 at 05:36:06PM +0100, Mark Rutland wrote:
> The perf tools can read a cpumask file for a PMU, describing a subset of
> CPUs which that PMU covers. So far this has only been used to cater for
> uncore PMUs, which in practice happen to only have a single CPU
> described in the mask.
> 
> Until recently, the perf tools only correctly handled cpumask containing
> a single CPU, and only when monitoring in system-wide mode. For example,
> prior to commit 00e727bb389359c8 ("perf stat: Balance opening and
> reading events"), a mask with more than a single CPU could cause
> perf stat to hang. When a CPU PMU covers a subset of CPUs, but lacks a
> cpumask, perf record will fail to open events (on the cores the PMU does
> not support), and gives up.
> 
> For systems with heterogeneous CPUs such as ARM big.LITTLE systems, this
> presents a problem. We have a PMU for each microarchitecture (e.g. a big
> PMU and a little PMU), and would like to expose a cpumask for each (so
> as to allow perf record and other tools to do the right thing). However,
> doing so kernel-side will cause old perf binaries to not function (e.g.
> hitting the issue solved by 00e727bb389359c8), and thus commits the
> cardinal sin of breaking (existing) userspace.
> 
> To address this chicken-and-egg problem, this patch adds support got a
> new file, supported_cpumask, which is largely identical to the existing
> cpumask file. A kernel can expose this file, knowing that new perf
> binaries will correctly support it, while old perf binaries will not
> look for it (and thus will not be broken).

I might have asked before, but what's the kernel side state of this?

thanks,
jirka

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFCv3 2/2] perf: util: support sysfs supported_cpumask file
  2016-08-12 10:01   ` Jiri Olsa
@ 2016-08-31 10:46     ` Mark Rutland
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Rutland @ 2016-08-31 10:46 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, acme, adrian.hunter, alexander.shishkin, hekuang,
	jolsa, mingo, peterz, wangnan0

Hi,

Apologies for the delay in replying.

On Fri, Aug 12, 2016 at 12:01:23PM +0200, Jiri Olsa wrote:
> On Thu, Aug 11, 2016 at 05:36:06PM +0100, Mark Rutland wrote:
> > The perf tools can read a cpumask file for a PMU, describing a subset of
> > CPUs which that PMU covers. So far this has only been used to cater for
> > uncore PMUs, which in practice happen to only have a single CPU
> > described in the mask.
> > 
> > Until recently, the perf tools only correctly handled cpumask containing
> > a single CPU, and only when monitoring in system-wide mode. For example,
> > prior to commit 00e727bb389359c8 ("perf stat: Balance opening and
> > reading events"), a mask with more than a single CPU could cause
> > perf stat to hang. When a CPU PMU covers a subset of CPUs, but lacks a
> > cpumask, perf record will fail to open events (on the cores the PMU does
> > not support), and gives up.
> > 
> > For systems with heterogeneous CPUs such as ARM big.LITTLE systems, this
> > presents a problem. We have a PMU for each microarchitecture (e.g. a big
> > PMU and a little PMU), and would like to expose a cpumask for each (so
> > as to allow perf record and other tools to do the right thing). However,
> > doing so kernel-side will cause old perf binaries to not function (e.g.
> > hitting the issue solved by 00e727bb389359c8), and thus commits the
> > cardinal sin of breaking (existing) userspace.
> > 
> > To address this chicken-and-egg problem, this patch adds support got a
> > new file, supported_cpumask, which is largely identical to the existing
> > cpumask file. A kernel can expose this file, knowing that new perf
> > binaries will correctly support it, while old perf binaries will not
> > look for it (and thus will not be broken).
> 
> I might have asked before, but what's the kernel side state of this?

Kernel-side, we do not currently expose a cpumask, and I do not have a
current patch series to do so. I wanted to figure out if this was the
right direction or whether I was going off into the weeds.

Clearly that's jsut confusing, so I guess I should respin this long with
the kernel-side patches?

Implementation wise, it's fairly trivial to add (e.g. [1]).

Thanks,
Mark.

[1] http://lkml.kernel.org/r/1466529109-21715-9-git-send-email-jeremy.linton@arm.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-31 10:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-11 16:36 [RFCv3 0/2] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
2016-08-11 16:36 ` [RFCv3 1/2] perf: util: only open events on CPUs an evsel permits Mark Rutland
2016-08-11 16:36 ` [RFCv3 2/2] perf: util: support sysfs supported_cpumask file Mark Rutland
2016-08-12 10:01   ` Jiri Olsa
2016-08-31 10:46     ` Mark Rutland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.