linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] perf tools: play nicely with CPU PMU cpumasks
@ 2016-07-07 16:04 Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 1/3] perf stat: balance opening and reading events Mark Rutland
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Mark Rutland @ 2016-07-07 16:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	kan.liang, mark.rutland, mingo, peterz, wangnan0

Hi,

I'm trying to make the perf tool play better with PMUs in heterogeneous systems
(e.g. big.LITTLE). These patches fix some brokenness that exists today, but
they require the addition of a cpumask file to each CPU PMU sysfs directory,
and this happens to break prior versions of perf-stat. Due to this, I have not
yet added a cpumask attribute to the ARM PMU code.

In these system we have separate logical PMUs for discrete sets of CPUs. For
example, on an ARM Juno system we have a logical PMU for all Cortex-A53 CPUs,
and a logical PMU for all the Cortex-A57 CPUs. The logical PMUs allow
task-bound events, but reject CPU-bound events for CPUs they do not cover.

Currently perf-record doesn't work for these PMUs, unless forced to use
per-thread mmaps. In the absence of a cpumask, it tries to open events on CPUs
not supported by a PMU, and gives up. In the presence of a cpumask, it ends up
failing to mmap, as the evlist->cpus map contains a different set of cpus from
the evsel->cpus map populated from the cpumask.

Today's perf-stat can profile a task in the absence of a cpumask file, but in
the presence of one ends up blocking after the profiled task exits. Due to an
inconsistency between __perf_evsel__open and read_counter, it ends up treating
some uninitialised memory as a file descriptor, and typically ends up blocked
on stdin. That can avoided as in patch 1, but existing binaries would be broken
by the addition of the cpumask kernel-side.

I understand that we need a sysfs cpumask for the tools to work with uncore
PMUs in system-wide mode, but it's not clear to me whether we expect/want a
cpumask for the heterogeneous CPU PMU case, given the issue with perf-stat.

Does using a sysfs cpumask to handle (heterogeneous) CPU PMUs feel like the
right approach?

If using a cpumask is the right approach, how can I avoid breaking existing
perf-stat? Does it make sense to have a differently-named cpumask file that
only new tools will look at?

Thanks,
Mark.

Mark Rutland (3):
  perf stat: balance opening and reading events
  perf: util: Add more cpu_map helpers
  perf: util: only open events on CPUs an evsel permits

 tools/perf/builtin-stat.c |  8 ++++++--
 tools/perf/util/cpumap.c  | 14 ++++++++++++--
 tools/perf/util/cpumap.h  |  2 ++
 tools/perf/util/evlist.c  |  9 ++++++++-
 4 files changed, 28 insertions(+), 5 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [RFC PATCH 1/3] perf stat: balance opening and reading events
  2016-07-07 16:04 [RFC PATCH 0/3] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
@ 2016-07-07 16:04 ` Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 2/3] perf: util: Add more cpu_map helpers Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits Mark Rutland
  2 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2016-07-07 16:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	kan.liang, mark.rutland, mingo, peterz, wangnan0

In create_perf_stat_counter, when a target CPU has not been provided, we
call __perf_evsel__open with empty_cpu_map, and open a single FD per
thread. However, in read_counter we assume that we opened events for
the product of threads and CPUs described in the evsel's cpu_map.

Thus, if an evsel has a cpu_map with more than one entry, we will
attempt to access FDs that we didn't open. This could result in a number
of problems (e.g. blocking while reading from STDIN if the fd memory
happened to be initialised to zero).

This is problematic for systems were a logical CPU PMU covers some
arbitrary subset of CPUs. The cpu_map of any evsel for that PMU will be
initialised based on the cpumask exposed through sysfs, even if the user
requests per-thread events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/builtin-stat.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index ee7ada7..f3e21a2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -276,8 +276,12 @@ perf_evsel__write_stat_event(struct perf_evsel *counter, u32 cpu, u32 thread,
 static int read_counter(struct perf_evsel *counter)
 {
 	int nthreads = thread_map__nr(evsel_list->threads);
-	int ncpus = perf_evsel__nr_cpus(counter);
-	int cpu, thread;
+	int ncpus, cpu, thread;
+
+	if (target__has_cpu(&target))
+		ncpus = perf_evsel__nr_cpus(counter);
+	else
+		ncpus = 1;
 
 	if (!counter->supported)
 		return -ENOENT;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RFC PATCH 2/3] perf: util: Add more cpu_map helpers
  2016-07-07 16:04 [RFC PATCH 0/3] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 1/3] perf stat: balance opening and reading events Mark Rutland
@ 2016-07-07 16:04 ` Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits Mark Rutland
  2 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2016-07-07 16:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	kan.liang, mark.rutland, mingo, peterz, wangnan0

In some cases it's necessry to figure out the map-local index of a given
Linux logical CPU ID. Add a new helper, cpu_map__idx, to acquire this.
As the logic is largely the same as the existing cpu_map__has, this is
rewritten in terms of the new helper.

At the same time, add the inverse operation, cpu_map__cpu, which yields
the logical CPU id for a map-local index. While this can be performed
manually, wrapping this in a helper can make code more legible.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/util/cpumap.c | 14 ++++++++++++--
 tools/perf/util/cpumap.h |  2 ++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 02d8016..efc88fe 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -590,12 +590,22 @@ int cpu__setup_cpunode_map(void)
 
 bool cpu_map__has(struct cpu_map *cpus, int cpu)
 {
+	return cpu_map__idx(cpus, cpu) != -1;
+}
+
+int cpu_map__idx(struct cpu_map *cpus, int cpu)
+{
 	int i;
 
 	for (i = 0; i < cpus->nr; ++i) {
 		if (cpus->map[i] == cpu)
-			return true;
+			return i;
 	}
 
-	return false;
+	return -1;
+}
+
+int cpu_map__cpu(struct cpu_map *cpus, int idx)
+{
+	return cpus->map[idx];
 }
diff --git a/tools/perf/util/cpumap.h b/tools/perf/util/cpumap.h
index 1a0a350..46f7642 100644
--- a/tools/perf/util/cpumap.h
+++ b/tools/perf/util/cpumap.h
@@ -67,5 +67,7 @@ int cpu_map__build_map(struct cpu_map *cpus, struct cpu_map **res,
 		       int (*f)(struct cpu_map *map, int cpu, void *data),
 		       void *data);
 
+int cpu_map__cpu(struct cpu_map *cpus, int idx);
 bool cpu_map__has(struct cpu_map *cpus, int cpu);
+int cpu_map__idx(struct cpu_map *cpus, int cpu);
 #endif /* __PERF_CPUMAP_H */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits
  2016-07-07 16:04 [RFC PATCH 0/3] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 1/3] perf stat: balance opening and reading events Mark Rutland
  2016-07-07 16:04 ` [RFC PATCH 2/3] perf: util: Add more cpu_map helpers Mark Rutland
@ 2016-07-07 16:04 ` Mark Rutland
  2016-07-08  7:55   ` Jiri Olsa
  2 siblings, 1 reply; 6+ messages in thread
From: Mark Rutland @ 2016-07-07 16:04 UTC (permalink / raw)
  To: linux-kernel
  Cc: acme, adrian.hunter, alexander.shishkin, hekuang, jolsa,
	kan.liang, mark.rutland, mingo, peterz, wangnan0

In systems with heterogeneous CPU PMUs, it's possible for each evsel to
cover a distinct set of CPUs, and hence the cpu_map associated with each
evsel may have a distinct idx<->id mapping. Any of these may be distinct from
the evlist's cpu map.

Events can be tied to the same fd so long as they use the same per-cpu
ringbuffer (i.e. so long as they are on the same CPU). To acquire the
correct FDs, we must compare the Linux logical IDs rather than the evsel
or evlist indices.

This path adds logic to perf_evlist__mmap_per_evsel to handle this,
translating IDs as required. As PMUs may cover a subset of CPUs from the
evlist, we skip the CPUs a PMU cannot handle.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/util/evlist.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e82ba90..0b5b1be 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -984,17 +984,24 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 }
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
+				       struct mmap_params *mp, int cpu_idx,
 				       int thread, int *output)
 {
 	struct perf_evsel *evsel;
+	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each(evlist, evsel) {
 		int fd;
+		int cpu;
 
 		if (evsel->system_wide && thread)
 			continue;
 
+		if (!cpu_map__has(evsel->cpus, evlist_cpu))
+			continue;
+
+		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);
+
 		fd = FD(evsel, cpu, thread);
 
 		if (*output == -1) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits
  2016-07-07 16:04 ` [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits Mark Rutland
@ 2016-07-08  7:55   ` Jiri Olsa
  2016-07-11 10:50     ` Mark Rutland
  0 siblings, 1 reply; 6+ messages in thread
From: Jiri Olsa @ 2016-07-08  7:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-kernel, acme, adrian.hunter, alexander.shishkin, hekuang,
	jolsa, kan.liang, mingo, peterz, wangnan0

On Thu, Jul 07, 2016 at 05:04:34PM +0100, Mark Rutland wrote:
> In systems with heterogeneous CPU PMUs, it's possible for each evsel to
> cover a distinct set of CPUs, and hence the cpu_map associated with each
> evsel may have a distinct idx<->id mapping. Any of these may be distinct from
> the evlist's cpu map.
> 
> Events can be tied to the same fd so long as they use the same per-cpu
> ringbuffer (i.e. so long as they are on the same CPU). To acquire the
> correct FDs, we must compare the Linux logical IDs rather than the evsel
> or evlist indices.
> 
> This path adds logic to perf_evlist__mmap_per_evsel to handle this,
> translating IDs as required. As PMUs may cover a subset of CPUs from the
> evlist, we skip the CPUs a PMU cannot handle.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: He Kuang <hekuang@huawei.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Wang Nan <wangnan0@huawei.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>  tools/perf/util/evlist.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index e82ba90..0b5b1be 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -984,17 +984,24 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
>  }
>  
>  static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
> -				       struct mmap_params *mp, int cpu,
> +				       struct mmap_params *mp, int cpu_idx,
>  				       int thread, int *output)
>  {
>  	struct perf_evsel *evsel;
> +	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
>  
>  	evlist__for_each(evlist, evsel) {
>  		int fd;
> +		int cpu;
>  
>  		if (evsel->system_wide && thread)
>  			continue;
>  
> +		if (!cpu_map__has(evsel->cpus, evlist_cpu))
> +			continue;
> +
> +		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);

you basicaly call cpu_map__idx twice in here,
I think it might be better call it just once
and check the cpu for -1

jirka

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits
  2016-07-08  7:55   ` Jiri Olsa
@ 2016-07-11 10:50     ` Mark Rutland
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2016-07-11 10:50 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, acme, adrian.hunter, alexander.shishkin, hekuang,
	jolsa, kan.liang, mingo, peterz, wangnan0

On Fri, Jul 08, 2016 at 09:55:14AM +0200, Jiri Olsa wrote:
> On Thu, Jul 07, 2016 at 05:04:34PM +0100, Mark Rutland wrote:
> > +		if (!cpu_map__has(evsel->cpus, evlist_cpu))
> > +			continue;
> > +
> > +		cpu = cpu_map__idx(evsel->cpus, evlist_cpu);
> 
> you basicaly call cpu_map__idx twice in here,
> I think it might be better call it just once
> and check the cpu for -1

Sure, I can change the patch to do that.

Mark.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-07-11 10:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-07 16:04 [RFC PATCH 0/3] perf tools: play nicely with CPU PMU cpumasks Mark Rutland
2016-07-07 16:04 ` [RFC PATCH 1/3] perf stat: balance opening and reading events Mark Rutland
2016-07-07 16:04 ` [RFC PATCH 2/3] perf: util: Add more cpu_map helpers Mark Rutland
2016-07-07 16:04 ` [RFC PATCH 3/3] perf: util: only open events on CPUs an evsel permits Mark Rutland
2016-07-08  7:55   ` Jiri Olsa
2016-07-11 10:50     ` Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).