* [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups
@ 2015-03-03  8:54 kan.liang
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

A pmu marked as perf_invalid_context has no state to switch on context
switch; everything is global, so its events can safely be part of sw/hw
groups.
In sched_out/sched_in, del/add must still be called, so that the
perf_invalid_context event is disabled/enabled accordingly during a
context switch. The event count can only be read when the event is
already sched_in.

However, group read does not work with mixed events. For example:

perf record -e '{cycles,uncore_imc_0/cas_count_read/}:S' -a sleep 1

always fails with EINVAL.

This patch set fixes the issue. With the series applied:

perf record -e '{cycles,uncore_imc_0/cas_count_read/}:S' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data (12 samples) ]

This patch special-cases invalid context events and allows them to be
part of sw/hw groups.
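
For reference, a minimal user-space sketch (not part of this patch) of
the grouped open that used to fail. The uncore type/config values here
are hypothetical placeholders; real ones come from
/sys/bus/event_source/devices/:

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
			   int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Open {cycles, uncore event} as one group on a cpu. */
static int open_mixed_group(int cpu)
{
	struct perf_event_attr hw = { 0 }, uncore = { 0 };
	int leader;

	hw.type = PERF_TYPE_HARDWARE;
	hw.size = sizeof(hw);
	hw.config = PERF_COUNT_HW_CPU_CYCLES;

	uncore.type = 15;	/* hypothetical dynamic pmu type id */
	uncore.size = sizeof(uncore);
	uncore.config = 0x304;	/* hypothetical cas_count_read encoding */

	leader = perf_event_open(&hw, -1, cpu, -1, 0);
	if (leader < 0)
		return -1;
	/* Without this patch the kernel rejects the sibling with EINVAL. */
	return perf_event_open(&uncore, -1, cpu, leader, 0);
}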

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 include/linux/perf_event.h |  8 ++++++
 kernel/events/core.c       | 72 ++++++++++++++++++++++++++++++++++------------
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b8f69d3..6775e6c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -711,6 +711,14 @@ static inline bool is_sampling_event(struct perf_event *event)
 /*
+ * Return 1 for an invalid context event, 0 otherwise
+ */
+static inline int is_invalid_context_event(struct perf_event *event)
+{
+	return event->pmu->task_ctx_nr == perf_invalid_context;
+}
+
+/*
  * Return 1 for a software event, 0 for a hardware event
  */
 static inline int is_software_event(struct perf_event *event)
 {
 	return event->pmu->task_ctx_nr == perf_sw_context;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 89f0f16..9a709ab 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1344,7 +1344,7 @@ static void perf_group_attach(struct perf_event *event)
 	WARN_ON_ONCE(group_leader->ctx != event->ctx);
 
 	if (group_leader->group_flags & PERF_GROUP_SOFTWARE &&
-			!is_software_event(event))
+			!is_software_event(event) && !is_invalid_context_event(event))
 		group_leader->group_flags &= ~PERF_GROUP_SOFTWARE;
 
 	list_add_tail(&event->group_entry, &group_leader->sibling_list);
@@ -7549,31 +7549,65 @@ SYSCALL_DEFINE5(perf_event_open,
 	account_event(event);
 
 	/*
-	 * Special case software events and allow them to be part of
-	 * any hardware group.
+	 * Special case for software events and invalid context events.
+	 * Allow software events to be part of any hardware group.
+	 * An invalid context event can only lead a group of pure
+	 * invalid context events, but it may be a member of any
+	 * software/hardware group.
 	 */
 	pmu = event->pmu;
 
 	if (group_leader &&
-	    (is_software_event(event) != is_software_event(group_leader))) {
-		if (is_software_event(event)) {
+	   (group_leader->pmu->task_ctx_nr != event->pmu->task_ctx_nr)) {
+		if (is_invalid_context_event(group_leader)) {
+			err = -EINVAL;
+			goto err_alloc;
+		} else if (is_software_event(group_leader)) {
+			if (is_invalid_context_event(event)) {
+				if (group_leader->group_flags & PERF_GROUP_SOFTWARE) {
+					/*
+					 * If the group leader is a software
+					 * event and the new event is an
+					 * invalid context event, allow
+					 * adding it to the software group.
+					 */
+					pmu = group_leader->pmu;
+				} else {
+					/*
+					 * The group leader is a software
+					 * event but the group is not pure
+					 * software. Find the hardware event
+					 * in the group and use its pmu.
+					 */
+					struct perf_event *tmp;
+
+					list_for_each_entry(tmp, &group_leader->sibling_list, group_entry) {
+						if (tmp->pmu->task_ctx_nr == perf_hw_context) {
+							pmu = tmp->pmu;
+							break;
+						}
+					}
+					if (pmu == event->pmu)
+						goto err_alloc;
+				}
+			} else {
+				if (group_leader->group_flags & PERF_GROUP_SOFTWARE) {
+					/*
+					 * In case the group is a pure software
+					 * group and we try to add a hardware
+					 * event, move the group to hardware context.
+					 */
+					move_group = 1;
+				}
+			}
+		} else {
 			/*
-			 * If event and group_leader are not both a software
-			 * event, and event is, then group leader is not.
-			 *
-			 * Allow the addition of software events to !software
-			 * groups, this is safe because software events never
-			 * fail to schedule.
+			 * If group_leader is a hardware event and event is
+			 * not, allow the addition of !hardware events to
+			 * hardware groups. This is safe because software and
+			 * invalid context events never fail to schedule.
 			 */
 			pmu = group_leader->pmu;
-		} else if (is_software_event(group_leader) &&
-			   (group_leader->group_flags & PERF_GROUP_SOFTWARE)) {
-			/*
-			 * In case the group is a pure software group, and we
-			 * try to add a hardware event, move the whole group to
-			 * the hardware context.
-			 */
-			move_group = 1;
 		}
 	}
 
-- 
1.8.3.1



* [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
@ 2015-03-03  8:54 ` kan.liang
  2015-03-03 16:09   ` Arnaldo Carvalho de Melo
  2015-03-03  8:54 ` [PATCH 3/5] perf,tools: change perf stat to use event's cpu map kan.liang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

With patch 1/5, it is possible to group-read events from different
pmus. "-C" can be used to set the cpu list, and that list may be
incompatible with a pmu's cpumask.
This patch checks each event's cpu map and discards the incompatible
cpus.
An event's cpu map is saved in evsel->cpus during option parsing; the
evlist's cpu map is then created in perf_evlist__create_maps, so the
cpu maps can be checked and re-organized there.
Only the cpu_list case needs this check.
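
The core of the check is a merge-style intersection of two sorted cpu
lists. A standalone sketch of the idea, assuming both lists are sorted
ascending (the helper name is illustrative, not from the patch):

/* Keep in map[] only the cpus that also appear in list[]; returns the
 * number of cpus kept.  Mirrors what perf_evlist__check_evsel_cpus()
 * does in place on evsel->cpus->map.
 */
static int intersect_cpus(const int *list, int nlist, int *map, int nmap)
{
	int i = 0, j = 0, kept = 0;

	while (i < nmap && j < nlist) {
		if (map[i] == list[j]) {
			map[kept++] = map[i++];	/* cpu usable, keep it */
			j++;
		} else if (map[i] < list[j]) {
			i++;			/* not in the -C list, drop */
		} else {
			j++;			/* -C cpu not in pmu cpumask */
		}
	}
	return kept;	/* 0 means cpumask and -C list are disjoint */
}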

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-top.c |  6 ++--
 tools/perf/util/evlist.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5fb8723..f40d1d6 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1218,9 +1218,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (target__none(target))
 		target->system_wide = true;
 
-	if (perf_evlist__create_maps(top.evlist, target) < 0)
-		usage_with_options(top_usage, options);
-
 	if (!top.evlist->nr_entries &&
 	    perf_evlist__add_default(top.evlist) < 0) {
 		ui__error("Not enough memory for event selector list\n");
@@ -1229,6 +1226,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	symbol_conf.nr_events = top.evlist->nr_entries;
 
+	if (perf_evlist__create_maps(top.evlist, target) < 0)
+		usage_with_options(top_usage, options);
+
 	if (top.delay_secs < 1)
 		top.delay_secs = 1;
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8d0b623..3c6115c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1026,6 +1026,74 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_per_cpu(evlist, &mp);
 }
 
+static int cmp_ids(const void *a, const void *b)
+{
+	return *(int *)a - *(int *)b;
+}
+
+/*
+ * Check each evsel cpu map against the pmu cpumask and the input cpu list.
+ * Only available cpus are kept in evsel->cpus->map.
+ */
+static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist)
+{
+	const struct cpu_map *cpus = evlist->cpus;
+	const int ncpus = cpu_map__nr(evlist->cpus);
+	struct perf_evsel *evsel;
+	int i, j, cpu_nr, tmp;
+
+	/* ensure we process id in increasing order */
+	qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
+
+	evlist__for_each(evlist, evsel) {
+		if (!evsel->cpus)
+			continue;
+
+		cpu_nr = 0;
+		j = 0;
+		for (i = 0; i < cpu_map__nr(evsel->cpus);)  {
+
+			if (j >= ncpus) {
+				evsel->cpus->map[i++] = -1;
+				continue;
+			}
+			for (; j < ncpus; j++) {
+				if (cpus->map[j] < evsel->cpus->map[i])
+					continue;
+				if (cpus->map[j] == evsel->cpus->map[i]) {
+					cpu_nr++;
+					j++;
+					i++;
+				} else
+					evsel->cpus->map[i++] = -1;
+				break;
+			}
+		}
+
+		if (cpu_nr == cpu_map__nr(evsel->cpus))
+			continue;
+		if (cpu_nr == 0) {
+			perror("failed to create CPUs map, please check cpumask");
+			return -1;
+		}
+
+		tmp = 0;
+		for (i = 0; i < cpu_nr; i++) {
+			if (evsel->cpus->map[i] == -1) {
+				while (evsel->cpus->map[tmp] == -1) {
+					tmp++;
+					BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
+				}
+				evsel->cpus->map[i] = evsel->cpus->map[tmp];
+				evsel->cpus->map[tmp] = -1;
+			}
+			tmp++;
+		}
+		evsel->cpus->nr = cpu_nr;
+	}
+	return 0;
+}
+
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
 	evlist->threads = thread_map__new_str(target->pid, target->tid,
@@ -1042,6 +1110,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	if (evlist->cpus == NULL)
 		goto out_delete_threads;
 
+	if (target->cpu_list &&
+	   (perf_evlist__check_evsel_cpus(evlist) < 0))
+		goto out_delete_threads;
+
 	return 0;
 
 out_delete_threads:
-- 
1.8.3.1



* [PATCH 3/5] perf,tools: change perf stat to use event's cpu map
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
@ 2015-03-03  8:54 ` kan.liang
  2015-03-03  8:54 ` [PATCH 4/5] perf,tools: open/mmap event according to event's cpu map not evlist's kan.liang
  2015-03-03  8:54 ` [PATCH 5/5] perf/x86/intel/uncore: do not implicitly set uncore event cpu kan.liang
  3 siblings, 0 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

With a cpu_list, perf stat should use the event's cpu map.

Different events may have different cpu maps, so the same index in two
cpu maps does not necessarily point to the same CPU id.
perf_evsel__get_cpumap_index is introduced to find the index of a given
CPU id in a given cpu map.
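
For illustration, a standalone example (not from the patch; the maps
are made up) of the mismatch the new helper resolves:

#include <stdio.h>

/* The same positional index names different cpus in different maps. */
static int cpumap_index(const int *map, int nr, int cpu)
{
	int i;

	for (i = 0; i < nr; i++)
		if (map[i] == cpu)
			return i;
	return -1;
}

int main(void)
{
	int evlist_cpus[] = { 0, 4, 5, 18 };
	int evsel_cpus[] = { 0, 18 };

	/* cpu 18 is index 1 in the event map, index 3 in the evlist map */
	printf("evsel idx %d -> evlist idx %d\n",
	       cpumap_index(evsel_cpus, 2, 18),
	       cpumap_index(evlist_cpus, 4, 18));
	return 0;
}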

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/builtin-stat.c |  5 +++--
 tools/perf/util/evsel.h   | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e598e4e..5ae8686 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -165,7 +165,7 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
 
 static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel)
 {
-	return (evsel->cpus && !target.cpu_list) ? evsel->cpus : evsel_list->cpus;
+	return evsel->cpus ? evsel->cpus : evsel_list->cpus;
 }
 
 static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel)
@@ -302,7 +302,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
 	attr->inherit = !no_inherit;
 
 	if (target__has_cpu(&target))
-		return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
+		return perf_evsel__open_per_cpu(evsel, evsel_list->cpus);
 
 	if (!target__has_task(&target) && perf_evsel__is_group_leader(evsel)) {
 		attr->disabled = 1;
@@ -1218,6 +1218,7 @@ static void print_aggr(char *prefix)
 			nr = 0;
 			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
 				cpu2 = perf_evsel__cpus(counter)->map[cpu];
+				cpu2 = perf_evsel__get_cpumap_index(NULL, cpu2, evsel_list->cpus);
 				s2 = aggr_get_id(evsel_list->cpus, cpu2);
 				if (s2 != id)
 					continue;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dcf202a..a633320 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -8,6 +8,7 @@
 #include <linux/types.h>
 #include "xyarray.h"
 #include "symbol.h"
+#include "cpumap.h"
 
 struct perf_counts_values {
 	union {
@@ -359,4 +360,27 @@ static inline bool has_branch_callstack(struct perf_evsel *evsel)
 {
 	return evsel->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
 }
+
+static inline int perf_evsel__get_cpumap_index(struct perf_evsel *evsel,
+					       int cpu,
+					       struct cpu_map *evlist_cpus)
+{
+	struct cpu_map *cpus;
+	int i;
+
+	if (evlist_cpus == NULL)
+		return -1;
+
+	if (evsel && evsel->cpus)
+		cpus = evsel->cpus;
+	else
+		cpus = evlist_cpus;
+
+	for (i = 0; i < cpus->nr; i++) {
+		if (cpu == cpus->map[i])
+			return i;
+	}
+	return -1;
+}
+
 #endif /* __PERF_EVSEL_H */
-- 
1.8.3.1



* [PATCH 4/5] perf,tools: open/mmap event according to event's cpu map not evlist's
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
  2015-03-03  8:54 ` [PATCH 3/5] perf,tools: change perf stat to use event's cpu map kan.liang
@ 2015-03-03  8:54 ` kan.liang
  2015-03-03  8:54 ` [PATCH 5/5] perf/x86/intel/uncore: do not implicitly set uncore event cpu kan.liang
  3 siblings, 0 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

The perf tool can open/mmap events per cpu, but the cpu list comes from
the evlist's cpu map, which means all events share the same cpu map.
However, some events, like uncore events, have their own cpu mask, so
the global cpu map does not work well with mixed pmu events.

With this patch, the event's cpu map is used to provide the available
cpu list for perf open; if it is unavailable, the evlist's map is used.
Since the cpu lists of different events can vary, we cannot rely on the
index to get the group leader's fd, so get_group_fd now uses the cpu id
to find the correct fd.
perf_evlist__mmap also needs to change, since the leader's fd has to be
mmaped before the members' fds, so evlist__for_each must be the
outermost loop.
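
The ordering constraint is visible from user space, too. A minimal
sketch (not from the patch; the helper name and page count are
illustrative) of why the leader must be mmaped before a member can
redirect its output:

#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Redirect member_fd's samples into leader_fd's ring buffer.  The
 * ioctl can only succeed once the leader's buffer exists, which is
 * why the patch makes evlist__for_each the outermost loop.
 */
static int redirect_to_leader(int leader_fd, int member_fd, int pages)
{
	size_t len = (size_t)(pages + 1) * sysconf(_SC_PAGESIZE);
	void *ring = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED, leader_fd, 0);

	if (ring == MAP_FAILED)
		return -1;
	return ioctl(member_fd, PERF_EVENT_IOC_SET_OUTPUT, leader_fd);
}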

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/util/evlist.c | 110 ++++++++++++++++++++++++++++-------------------
 tools/perf/util/evsel.c  |  30 +++++++++----
 2 files changed, 87 insertions(+), 53 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3c6115c..85c2ae0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -792,51 +792,48 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
 	return 0;
 }
 
-static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
-				       struct mmap_params *mp, int cpu,
-				       int thread, int *output)
+static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist,
+				       struct perf_evsel *evsel,
+				       int idx, struct mmap_params *mp,
+				       int cpu, int thread, int *output)
 {
-	struct perf_evsel *evsel;
+	int fd;
 
-	evlist__for_each(evlist, evsel) {
-		int fd;
+	if (evsel->system_wide && thread)
+		return 0;
 
-		if (evsel->system_wide && thread)
-			continue;
+	fd = FD(evsel, cpu, thread);
 
-		fd = FD(evsel, cpu, thread);
+	if (*output == -1) {
+		*output = fd;
+		if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
+			return -1;
+	} else {
+		if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
+			return -1;
 
-		if (*output == -1) {
-			*output = fd;
-			if (__perf_evlist__mmap(evlist, idx, mp, *output) < 0)
-				return -1;
-		} else {
-			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
-				return -1;
+		perf_evlist__mmap_get(evlist, idx);
+	}
 
-			perf_evlist__mmap_get(evlist, idx);
-		}
+	/*
+	 * The system_wide flag causes a selected event to be opened
+	 * always without a pid.  Consequently it will never get a
+	 * POLLHUP, but it is used for tracking in combination with
+	 * other events, so it should not need to be polled anyway.
+	 * Therefore don't add it for polling.
+	 */
+	if (!evsel->system_wide &&
+	    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
+		perf_evlist__mmap_put(evlist, idx);
+		return -1;
+	}
 
-		/*
-		 * The system_wide flag causes a selected event to be opened
-		 * always without a pid.  Consequently it will never get a
-		 * POLLHUP, but it is used for tracking in combination with
-		 * other events, so it should not need to be polled anyway.
-		 * Therefore don't add it for polling.
-		 */
-		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, idx) < 0) {
-			perf_evlist__mmap_put(evlist, idx);
+	if (evsel->attr.read_format & PERF_FORMAT_ID) {
+		if (perf_evlist__id_add_fd(evlist, evsel, cpu, thread,
+					   fd) < 0)
 			return -1;
-		}
-
-		if (evsel->attr.read_format & PERF_FORMAT_ID) {
-			if (perf_evlist__id_add_fd(evlist, evsel, cpu, thread,
-						   fd) < 0)
-				return -1;
-			perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
-						 thread);
-		}
+		perf_evlist__set_sid_idx(evlist, evsel, idx, cpu,
+					 thread);
 	}
 
 	return 0;
@@ -848,23 +845,43 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	int cpu, thread;
 	int nr_cpus = cpu_map__nr(evlist->cpus);
 	int nr_threads = thread_map__nr(evlist->threads);
+	int *output = malloc(nr_cpus * sizeof(int));
+	struct cpu_map *map;
+	int evlist_cpu;
+	struct perf_evsel *evsel;
 
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
-	for (cpu = 0; cpu < nr_cpus; cpu++) {
-		int output = -1;
 
-		for (thread = 0; thread < nr_threads; thread++) {
-			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output))
+	for (cpu = 0; cpu < nr_cpus; cpu++)
+		output[cpu] = -1;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->cpus)
+			map = evsel->cpus;
+		else
+			map = evlist->cpus;
+
+		for (cpu = 0; cpu < map->nr; cpu++) {
+			evlist_cpu = perf_evsel__get_cpumap_index(NULL, map->map[cpu], evlist->cpus);
+			if (evlist_cpu < 0)
 				goto out_unmap;
+
+			for (thread = 0; thread < nr_threads; thread++) {
+				if (perf_evlist__mmap_per_evsel(evlist, evsel, evlist_cpu,
+								mp, cpu, thread,
+								&output[evlist_cpu]))
+					goto out_unmap;
+			}
 		}
 	}
 
+	free(output);
 	return 0;
 
 out_unmap:
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		__perf_evlist__munmap(evlist, cpu);
+	free(output);
 	return -1;
 }
 
@@ -873,14 +890,17 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 {
 	int thread;
 	int nr_threads = thread_map__nr(evlist->threads);
+	struct perf_evsel *evsel;
 
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
 
-		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output))
-			goto out_unmap;
+		evlist__for_each(evlist, evsel) {
+			if (perf_evlist__mmap_per_evsel(evlist, evsel, thread,
+							mp, 0, thread, &output))
+				goto out_unmap;
+		}
 	}
 
 	return 0;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index bb4eff2..6077a83 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -981,10 +981,11 @@ int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
 	return 0;
 }
 
-static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
+static int get_group_fd(struct perf_evsel *evsel, int cpu,
+			int thread, struct cpu_map *cpus)
 {
 	struct perf_evsel *leader = evsel->leader;
-	int fd;
+	int fd, leader_cpu;
 
 	if (perf_evsel__is_group_leader(evsel))
 		return -1;
@@ -995,9 +996,16 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
 	 */
 	BUG_ON(!leader->fd);
 
-	fd = FD(leader, cpu, thread);
+	if (cpu < 0)
+		fd = FD(leader, 0, thread);
+	else {
+		leader_cpu = perf_evsel__get_cpumap_index(leader, cpu, cpus);
+		if (leader_cpu >= 0)
+			fd = FD(leader, leader_cpu, thread);
+		else
+			return -1;
+	}
 	BUG_ON(fd == -1);
-
 	return fd;
 }
 
@@ -1068,6 +1076,7 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	int cpu, thread, nthreads;
 	unsigned long flags = PERF_FLAG_FD_CLOEXEC;
 	int pid = -1, err;
+	struct cpu_map *cpumap;
 	enum { NO_CHANGE, SET_TO_MAX, INCREASED_MAX } set_rlimit = NO_CHANGE;
 
 	if (evsel->system_wide)
@@ -1084,6 +1093,11 @@ static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		pid = evsel->cgrp->fd;
 	}
 
+	if (evsel->cpus)
+		cpumap = evsel->cpus;
+	else
+		cpumap = cpus;
+
 fallback_missing_features:
 	if (perf_missing_features.cloexec)
 		flags &= ~(unsigned long)PERF_FLAG_FD_CLOEXEC;
@@ -1098,7 +1112,7 @@ retry_sample_id:
 	if (verbose >= 2)
 		perf_event_attr__fprintf(&evsel->attr, stderr);
 
-	for (cpu = 0; cpu < cpus->nr; cpu++) {
+	for (cpu = 0; cpu < cpumap->nr; cpu++) {
 
 		for (thread = 0; thread < nthreads; thread++) {
 			int group_fd;
@@ -1106,14 +1120,14 @@ retry_sample_id:
 			if (!evsel->cgrp && !evsel->system_wide)
 				pid = threads->map[thread];
 
-			group_fd = get_group_fd(evsel, cpu, thread);
+			group_fd = get_group_fd(evsel, cpumap->map[cpu], thread, cpus);
 retry_open:
 			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx\n",
-				  pid, cpus->map[cpu], group_fd, flags);
+				  pid, cpumap->map[cpu], group_fd, flags);
 
 			FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
 								     pid,
-								     cpus->map[cpu],
+								     cpumap->map[cpu],
 								     group_fd, flags);
 			if (FD(evsel, cpu, thread) < 0) {
 				err = -errno;
-- 
1.8.3.1



* [PATCH 5/5] perf/x86/intel/uncore: do not implicitly set uncore event cpu
  2015-03-03  8:54 [PATCH 1/5] perf,core: allow invalid context events to be part of sw/hw groups kan.liang
                   ` (2 preceding siblings ...)
  2015-03-03  8:54 ` [PATCH 4/5] perf,tools: open/mmap event according to event's cpu map not evlist's kan.liang
@ 2015-03-03  8:54 ` kan.liang
  3 siblings, 0 replies; 10+ messages in thread
From: kan.liang @ 2015-03-03  8:54 UTC (permalink / raw)
  To: a.p.zijlstra, acme, linux-kernel; +Cc: ak, Kan Liang

From: Kan Liang <kan.liang@intel.com>

A cpumask is exposed in the uncore pmu sysfs directory. The user should
set the cpu according to that cpumask; the kernel should not implicitly
change event->cpu.
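
As an illustration, the mask can be read from sysfs. A minimal sketch,
assuming the conventional sysfs path for an Intel uncore IMC pmu:

#include <stdio.h>

/* Read the cpumask the uncore pmu exports, e.g. "0,18\n". */
static int read_uncore_cpumask(char *buf, size_t len)
{
	FILE *f = fopen("/sys/bus/event_source/devices/uncore_imc_0/cpumask", "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}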

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_uncore.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index c635b8b..cd80731 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -621,9 +621,8 @@ static int uncore_pmu_event_init(struct perf_event *event)
 	if (event->cpu < 0)
 		return -EINVAL;
 	box = uncore_pmu_to_box(pmu, event->cpu);
-	if (!box || box->cpu < 0)
+	if (!box || box->cpu < 0 || (box->cpu != event->cpu))
 		return -EINVAL;
-	event->cpu = box->cpu;
 
 	event->hw.idx = -1;
 	event->hw.last_tag = ~0ULL;
-- 
1.8.3.1



* Re: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03  8:54 ` [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps kan.liang
@ 2015-03-03 16:09   ` Arnaldo Carvalho de Melo
  2015-03-03 16:11     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-03 16:09 UTC (permalink / raw)
  To: kan.liang; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian

Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> From: Kan Liang <kan.liang@intel.com>
> 
> With the patch 1/5, it's possible to group read events from different
> pmus. "-C" can be used to set cpu list. The cpu list may be incompatible
> with pmu's cpumask.
> This patch checks the event's cpu maps, and discard the incompatible cpu
> maps.
> event's cpu maps is saved in evsel->cpus during option parse. Then the
> evlist's cpu maps is created in perf_evlist__create_maps. So the cpu
> maps can be check and re-organized in perf_evlist__create_maps.
> Only cpu_list need to check the cpu maps.

Humm, I had something done in this area...

Stephane complained about the confusion about which cpumap to use with
pmus, so I wrote a patch and sent an RFC, which I think got no
comments, lemme dig it...

- Arnaldo
 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  tools/perf/builtin-top.c |  6 ++--
>  tools/perf/util/evlist.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 5fb8723..f40d1d6 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1218,9 +1218,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>  	if (target__none(target))
>  		target->system_wide = true;
>  
> -	if (perf_evlist__create_maps(top.evlist, target) < 0)
> -		usage_with_options(top_usage, options);
> -
>  	if (!top.evlist->nr_entries &&
>  	    perf_evlist__add_default(top.evlist) < 0) {
>  		ui__error("Not enough memory for event selector list\n");
> @@ -1229,6 +1226,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
>  
>  	symbol_conf.nr_events = top.evlist->nr_entries;
>  
> +	if (perf_evlist__create_maps(top.evlist, target) < 0)
> +		usage_with_options(top_usage, options);
> +
>  	if (top.delay_secs < 1)
>  		top.delay_secs = 1;
>  
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 8d0b623..3c6115c 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1026,6 +1026,74 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
>  	return perf_evlist__mmap_per_cpu(evlist, &mp);
>  }
>  
> +static int cmp_ids(const void *a, const void *b)
> +{
> +	return *(int *)a - *(int *)b;
> +}
> +
> +/*
> + * Check evsel cpu map according to pmu cpumask and input
> + * Only available cpu can be stored in evsel->cpus->map.
> + */
> +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist)
> +{
> +	const struct cpu_map *cpus = evlist->cpus;
> +	const int ncpus = cpu_map__nr(evlist->cpus);
> +	struct perf_evsel *evsel;
> +	int i, j, cpu_nr, tmp;
> +
> +	/* ensure we process id in increasing order */
> +	qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> +
> +	evlist__for_each(evlist, evsel) {
> +		if (!evsel->cpus)
> +			continue;
> +
> +		cpu_nr = 0;
> +		j = 0;
> +		for (i = 0; i < cpu_map__nr(evsel->cpus);)  {
> +
> +			if (j >= ncpus) {
> +				evsel->cpus->map[i++] = -1;
> +				continue;
> +			}
> +			for (; j < ncpus; j++) {
> +				if (cpus->map[j] < evsel->cpus->map[i])
> +					continue;
> +				if (cpus->map[j] == evsel->cpus->map[i]) {
> +					cpu_nr++;
> +					j++;
> +					i++;
> +				} else
> +					evsel->cpus->map[i++] = -1;
> +				break;
> +			}
> +		}
> +
> +		if (cpu_nr == cpu_map__nr(evsel->cpus))
> +			continue;
> +		if (cpu_nr == 0) {
> +			perror("failed to create CPUs map, please check cpumask");
> +			return -1;
> +		}
> +
> +		tmp = 0;
> +		for (i = 0; i < cpu_nr; i++) {
> +			if (evsel->cpus->map[i] == -1) {
> +				while (evsel->cpus->map[tmp] == -1) {
> +					tmp++;
> +					BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
> +				}
> +				evsel->cpus->map[i] = evsel->cpus->map[tmp];
> +				evsel->cpus->map[tmp] = -1;
> +			}
> +			tmp++;
> +		}
> +		evsel->cpus->nr = cpu_nr;
> +	}
> +	return 0;
> +}
> +
>  int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
>  {
>  	evlist->threads = thread_map__new_str(target->pid, target->tid,
> @@ -1042,6 +1110,10 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
>  	if (evlist->cpus == NULL)
>  		goto out_delete_threads;
>  
> +	if (target->cpu_list &&
> +	   (perf_evlist__check_evsel_cpus(evlist) < 0))
> +		goto out_delete_threads;
> +
>  	return 0;
>  
>  out_delete_threads:
> -- 
> 1.8.3.1


* Re: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03 16:09   ` Arnaldo Carvalho de Melo
@ 2015-03-03 16:11     ` Arnaldo Carvalho de Melo
  2015-03-03 17:09       ` Liang, Kan
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-03 16:11 UTC (permalink / raw)
  To: kan.liang; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian

Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> > From: Kan Liang <kan.liang@intel.com>
> > 
> > With the patch 1/5, it's possible to group read events from different
> > pmus. "-C" can be used to set cpu list. The cpu list may be incompatible
> > with pmu's cpumask.
> > This patch checks the event's cpu maps, and discard the incompatible cpu
> > maps.
> > event's cpu maps is saved in evsel->cpus during option parse. Then the
> > evlist's cpu maps is created in perf_evlist__create_maps. So the cpu
> > maps can be check and re-organized in perf_evlist__create_maps.
> > Only cpu_list need to check the cpu maps.
> 
> Humm, I had something done in this area...
> 
> Stephane complained about the confusion about which cpumap to use with
> pmus, so I wrote a patch and sent an RFC, which I think I got no
> comments, lemme dig it...

Here it is, can you take a look? Stephane?

- Arnaldo

commit 9ecdd9b9bf0a7fd5645957ba4e6a98b6ee526109
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Jan 26 13:43:42 2015 -0300

    perf evsel: Set evsel->cpus to the evlist->cpus when not constrained
    
    So that we don't need to know about the evlist all the time and can cope
    with cases where evsel->cpus were set because it was for an event on a
    PMU with a cpumask.
    
    Reported-by: Stephane Eranian <eranian@google.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Don Zickus <dzickus@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Yan, Zheng <zheng.z.yan@intel.com>
    echo Link: http://lkml.kernel.org/n/tip-`ranpwd -l 24`@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e598e4e98170..ddf41bede0b8 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -163,16 +163,6 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
 	}
 }
 
-static inline struct cpu_map *perf_evsel__cpus(struct perf_evsel *evsel)
-{
-	return (evsel->cpus && !target.cpu_list) ? evsel->cpus : evsel_list->cpus;
-}
-
-static inline int perf_evsel__nr_cpus(struct perf_evsel *evsel)
-{
-	return perf_evsel__cpus(evsel)->nr;
-}
-
 static void perf_evsel__reset_stat_priv(struct perf_evsel *evsel)
 {
 	int i;
@@ -202,7 +192,7 @@ static int perf_evsel__alloc_prev_raw_counts(struct perf_evsel *evsel)
 	size_t sz;
 
 	sz = sizeof(*evsel->counts) +
-	     (perf_evsel__nr_cpus(evsel) * sizeof(struct perf_counts_values));
+	     (cpu_map__nr(evsel->cpus) * sizeof(struct perf_counts_values));
 
 	addr = zalloc(sz);
 	if (!addr)
@@ -235,7 +225,7 @@ static int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw)
 
 	evlist__for_each(evlist, evsel) {
 		if (perf_evsel__alloc_stat_priv(evsel) < 0 ||
-		    perf_evsel__alloc_counts(evsel, perf_evsel__nr_cpus(evsel)) < 0 ||
+		    perf_evsel__alloc_counts(evsel, cpu_map__nr(evsel->cpus)) < 0 ||
 		    (alloc_raw && perf_evsel__alloc_prev_raw_counts(evsel) < 0))
 			goto out_free;
 	}
@@ -269,7 +259,7 @@ static void perf_stat__reset_stats(struct perf_evlist *evlist)
 
 	evlist__for_each(evlist, evsel) {
 		perf_evsel__reset_stat_priv(evsel);
-		perf_evsel__reset_counts(evsel, perf_evsel__nr_cpus(evsel));
+		perf_evsel__reset_counts(evsel, cpu_map__nr(evsel->cpus));
 	}
 
 	memset(runtime_nsecs_stats, 0, sizeof(runtime_nsecs_stats));
@@ -302,7 +292,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
 	attr->inherit = !no_inherit;
 
 	if (target__has_cpu(&target))
-		return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
+		return perf_evsel__open_per_cpu(evsel, evsel->cpus);
 
 	if (!target__has_task(&target) && perf_evsel__is_group_leader(evsel)) {
 		attr->disabled = 1;
@@ -397,7 +387,7 @@ static void zero_per_pkg(struct perf_evsel *counter)
 static int check_per_pkg(struct perf_evsel *counter, int cpu, bool *skip)
 {
 	unsigned long *mask = counter->per_pkg_mask;
-	struct cpu_map *cpus = perf_evsel__cpus(counter);
+	struct cpu_map *cpus = counter->cpus;
 	int s;
 
 	*skip = false;
@@ -507,7 +497,7 @@ static int read_counter_aggr(struct perf_evsel *counter)
 static int read_counter(struct perf_evsel *counter)
 {
 	int nthreads = thread_map__nr(evsel_list->threads);
-	int ncpus = perf_evsel__nr_cpus(counter);
+	int ncpus = cpu_map__nr(counter->cpus);
 	int cpu, thread;
 
 	if (counter->system_wide)
@@ -727,13 +717,13 @@ static int __run_perf_stat(int argc, const char **argv)
 	if (aggr_mode == AGGR_GLOBAL) {
 		evlist__for_each(evsel_list, counter) {
 			read_counter_aggr(counter);
-			perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter),
+			perf_evsel__close_fd(counter, cpu_map__nr(counter->cpus),
 					     thread_map__nr(evsel_list->threads));
 		}
 	} else {
 		evlist__for_each(evsel_list, counter) {
 			read_counter(counter);
-			perf_evsel__close_fd(counter, perf_evsel__nr_cpus(counter), 1);
+			perf_evsel__close_fd(counter, cpu_map__nr(counter->cpus), 1);
 		}
 	}
 
@@ -812,7 +802,7 @@ static void aggr_printout(struct perf_evsel *evsel, int id, int nr)
 	case AGGR_NONE:
 		fprintf(output, "CPU%*d%s",
 			csv_output ? 0 : -4,
-			perf_evsel__cpus(evsel)->map[id], csv_sep);
+			evsel->cpus->map[id], csv_sep);
 		break;
 	case AGGR_GLOBAL:
 	default:
@@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
 		evlist__for_each(evsel_list, counter) {
 			val = ena = run = 0;
 			nr = 0;
-			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
-				cpu2 = perf_evsel__cpus(counter)->map[cpu];
+			for (cpu = 0; cpu < cpu_map__nr(counter->cpus); cpu++) {
+				cpu2 = counter->cpus->map[cpu];
 				s2 = aggr_get_id(evsel_list->cpus, cpu2);
 				if (s2 != id)
 					continue;
@@ -1339,7 +1329,7 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 	double uval;
 	int cpu;
 
-	for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+	for (cpu = 0; cpu < cpu_map__nr(counter->cpus); cpu++) {
 		val = counter->counts->cpu[cpu].val;
 		ena = counter->counts->cpu[cpu].ena;
 		run = counter->counts->cpu[cpu].run;
@@ -1350,7 +1340,7 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 		if (run == 0 || ena == 0) {
 			fprintf(output, "CPU%*d%s%*s%s",
 				csv_output ? 0 : -4,
-				perf_evsel__cpus(counter)->map[cpu], csv_sep,
+				counter->cpus->map[cpu], csv_sep,
 				csv_output ? 0 : 18,
 				counter->supported ? CNTR_NOT_COUNTED : CNTR_NOT_SUPPORTED,
 				csv_sep);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 28b8ce86bf12..202d1e9842e7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -111,13 +111,45 @@ void perf_evlist__exit(struct perf_evlist *evlist)
 	fdarray__exit(&evlist->pollfd);
 }
 
+static void perf_evlist__reset_cpus(struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+
+	evlist__for_each(evlist, evsel) {
+		if (evsel->cpus == evlist->cpus)
+			evsel->cpus = NULL;
+	}
+
+	evlist->cpus = NULL;
+}
+
+static void perf_evlist__set_cpus(struct perf_evlist *evlist, struct cpu_map *cpus)
+{
+	struct perf_evsel *evsel;
+
+	if (evlist->cpus != NULL)
+		perf_evlist__reset_cpus(evlist);
+	/*
+	 * If, when parsing events, the evsel->cpus wasn't constrained to a
+	 * cpulist, say, because it is on a PMU that has a cpumask, then set it
+	 * to the evlist cpu_map, so that we can access evsel->cpus and get the
+	 * cpu_map this evsel works with.
+	 */
+	evlist__for_each(evlist, evsel) {
+		if (evsel->cpus == NULL)
+			evsel->cpus = cpus;
+	}
+
+	evlist->cpus = cpus;
+}
+
 void perf_evlist__delete(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap(evlist);
 	perf_evlist__close(evlist);
 	cpu_map__delete(evlist->cpus);
+	perf_evlist__reset_cpus(evlist);
 	thread_map__delete(evlist->threads);
-	evlist->cpus = NULL;
 	evlist->threads = NULL;
 	perf_evlist__purge(evlist);
 	perf_evlist__exit(evlist);
@@ -129,6 +161,14 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
 	list_add_tail(&entry->node, &evlist->entries);
 	entry->idx = evlist->nr_entries;
 	entry->tracking = !entry->idx;
+	/*
+	 * If, when parsing events, the evsel->cpus wasn't constrained to a
+	 * cpulist, say, because it is on a PMU that has a cpumask, then set it
+	 * to the evlist cpu_map, so that we can access evsel->cpus and get the
+	 * cpu_map this evsel works with.
+	 */
+	if (entry->cpus == NULL)
+		entry->cpus = evlist->cpus;
 
 	if (!evlist->nr_entries++)
 		perf_evlist__set_id_pos(evlist);
@@ -1029,6 +1069,8 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
+	struct cpu_map *cpus;
+
 	evlist->threads = thread_map__new_str(target->pid, target->tid,
 					      target->uid);
 
@@ -1036,13 +1078,15 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 		return -1;
 
 	if (target__uses_dummy_map(target))
-		evlist->cpus = cpu_map__dummy_new();
+		cpus = cpu_map__dummy_new();
 	else
-		evlist->cpus = cpu_map__new(target->cpu_list);
+		cpus = cpu_map__new(target->cpu_list);
 
-	if (evlist->cpus == NULL)
+	if (cpus == NULL)
 		goto out_delete_threads;
 
+	perf_evlist__set_cpus(evlist, cpus);
+
 	return 0;
 
 out_delete_threads:
@@ -1222,6 +1266,7 @@ void perf_evlist__close(struct perf_evlist *evlist)
 
 static int perf_evlist__create_syswide_maps(struct perf_evlist *evlist)
 {
+	struct cpu_map *cpus;
 	int err = -ENOMEM;
 
 	/*
@@ -1233,20 +1278,20 @@ static int perf_evlist__create_syswide_maps(struct perf_evlist *evlist)
 	 * error, and we may not want to do that fallback to a
 	 * default cpu identity map :-\
 	 */
-	evlist->cpus = cpu_map__new(NULL);
-	if (evlist->cpus == NULL)
+	cpus = cpu_map__new(NULL);
+	if (cpus == NULL)
 		goto out;
 
 	evlist->threads = thread_map__new_dummy();
 	if (evlist->threads == NULL)
 		goto out_free_cpus;
 
+	perf_evlist__set_cpus(evlist, cpus);
 	err = 0;
 out:
 	return err;
 out_free_cpus:
-	cpu_map__delete(evlist->cpus);
-	evlist->cpus = NULL;
+	cpu_map__delete(cpus);
 	goto out;
 }
 


* RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03 16:11     ` Arnaldo Carvalho de Melo
@ 2015-03-03 17:09       ` Liang, Kan
  2015-03-04  0:15         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Liang, Kan @ 2015-03-03 17:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian



> Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
> escreveu:
> > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> > > From: Kan Liang <kan.liang@intel.com>
> > >
> > > With the patch 1/5, it's possible to group read events from
> > > different pmus. "-C" can be used to set cpu list. The cpu list may
> > > be incompatible with pmu's cpumask.
> > > This patch checks the event's cpu maps, and discard the incompatible
> > > cpu maps.
> > > event's cpu maps is saved in evsel->cpus during option parse. Then
> > > the evlist's cpu maps is created in perf_evlist__create_maps. So the
> > > cpu maps can be check and re-organized in perf_evlist__create_maps.
> > > Only cpu_list need to check the cpu maps.
> >
> > Humm, I had something done in this area...
> >
> > Stephane complained about the confusion about which cpumap to use
> with
> > pmus, so I wrote a patch and sent an RFC, which I think I got no
> > comments, lemme dig it...
> 
> Here it is, can you take a look? Stephane?
> 

Your patch is more like my 3/5 patch. The difference is that your patch
forces evsel->cpus = evlist->cpus if evsel->cpus == NULL, while my
patch handles the evsel->cpus == NULL case at the point of use.

> @@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
>  		evlist__for_each(evsel_list, counter) {
>  			val = ena = run = 0;
>  			nr = 0;
> -			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> cpu++) {
> -				cpu2 = perf_evsel__cpus(counter)-
> >map[cpu];
> +			for (cpu = 0; cpu < cpu_map__nr(counter->cpus);
> cpu++) {
> +				cpu2 = counter->cpus->map[cpu];
>  				s2 = aggr_get_id(evsel_list->cpus, cpu2);
>  				if (s2 != id)
>  					continue;

print_aggr also needs special handling. In the past, all events used
the evlist's cpu map, so it used the index to find the real cpu id.
Now the events' cpu maps can differ, so s2 could be wrong.
For example, the evlist's cpu map is 0,4,5,18 and an event's cpu map is
0,18. When cpu == 1, the return of aggr_get_id must be wrong, since it
still uses the index to find s2.
My 3/5 patch introduces perf_evsel__get_cpumap_index to handle this.

Your patch alone is not enough; we still need 2/5 and 4/5.
2/5 checks whether the event's cpu maps are compatible with the
evlist's cpu map. For example, if the evlist's cpu map is 1,2,17 and an
event's cpu map is 0,18, we can error out earlier.
4/5 specially handles open and mmap, where we need to do the same thing
as in print_aggr.

Thanks,
Kan

 


* Re: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-03 17:09       ` Liang, Kan
@ 2015-03-04  0:15         ` Arnaldo Carvalho de Melo
  2015-03-18 12:31           ` Liang, Kan
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-03-04  0:15 UTC (permalink / raw)
  To: Liang, Kan; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian

Em Tue, Mar 03, 2015 at 05:09:29PM +0000, Liang, Kan escreveu:
> 
> 
> > Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
> > escreveu:
> > > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com escreveu:
> > > > From: Kan Liang <kan.liang@intel.com>
> > > >
> > > > With the patch 1/5, it's possible to group read events from
> > > > different pmus. "-C" can be used to set cpu list. The cpu list may
> > > > be incompatible with pmu's cpumask.
> > > > This patch checks the event's cpu maps, and discard the incompatible
> > > > cpu maps.
> > > > event's cpu maps is saved in evsel->cpus during option parse. Then
> > > > the evlist's cpu maps is created in perf_evlist__create_maps. So the
> > > > cpu maps can be check and re-organized in perf_evlist__create_maps.
> > > > Only cpu_list need to check the cpu maps.
> > >
> > > Humm, I had something done in this area...
> > >
> > > Stephane complained about the confusion about which cpumap to use
> > with
> > > pmus, so I wrote a patch and sent an RFC, which I think I got no
> > > comments, lemme dig it...
> > 
> > Here it is, can you take a look? Stephane?
> > 
> 
> Your patch is more like my 3/5 patch. The difference is your patch force
> the evsel->cpus = evlist->cpus, if evsel->cpus == NULL. 
> My patch handle the evsel->cpus == NULL case when using it.

The idea is to always use evsel->cpus, not having to special-case it
and fall back to evlist->cpus, so that we don't have to pass the evlist
around that often.
 
> > @@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
> >  		evlist__for_each(evsel_list, counter) {
> >  			val = ena = run = 0;
> >  			nr = 0;
> > -			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> > cpu++) {
> > -				cpu2 = perf_evsel__cpus(counter)-
> > >map[cpu];
> > +			for (cpu = 0; cpu < cpu_map__nr(counter->cpus);
> > cpu++) {
> > +				cpu2 = counter->cpus->map[cpu];
> >  				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> >  				if (s2 != id)
> >  					continue;
> 
> print_aggr also need to be special handled. In the past, all events use
> evlist's cpu map,so it uses index to find the real cpu id.
> Now, event's cpu map are different. The s2 could be wrong.
> For example, evlist's cpu map is 0,4,5,18. Event's cpu map could be 0,18.
> When cpu == 1, the return of aggr_get_id must be wrong, since it
> still use index to find s2.
> My 3/5 patch introduce a function perf_evsel__get_cpumap_index
> to handle it.
> 
> Only your patch is not enough, we still need 2/5 and 4/5.
> 2/5 is used to check if the event's cpu maps are compatible as evlist's
> cpu map. For example, evlist's cpu map is 1,2,17. Event's cpu map
> could be 0,18. We can error out earlier.
> 4/5 is used to special handle the open and mmap. We need to do
> the same thing as what we did in print_aggr.

I'll try to go thru this tomorrow, thanks for checking.

- Arnaldo


* RE: [PATCH 2/5] perf,tools: check and re-organize evsel cpu maps
  2015-03-04  0:15         ` Arnaldo Carvalho de Melo
@ 2015-03-18 12:31           ` Liang, Kan
  0 siblings, 0 replies; 10+ messages in thread
From: Liang, Kan @ 2015-03-18 12:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: a.p.zijlstra, linux-kernel, ak, Stephane Eranian



> 
> Em Tue, Mar 03, 2015 at 05:09:29PM +0000, Liang, Kan escreveu:
> >
> >
> > > Em Tue, Mar 03, 2015 at 01:09:29PM -0300, Arnaldo Carvalho de Melo
> > > escreveu:
> > > > Em Tue, Mar 03, 2015 at 03:54:43AM -0500, kan.liang@intel.com
> escreveu:
> > > > > From: Kan Liang <kan.liang@intel.com>
> > > > >
> > > > > With the patch 1/5, it's possible to group read events from
> > > > > different pmus. "-C" can be used to set cpu list. The cpu list
> > > > > may be incompatible with pmu's cpumask.
> > > > > This patch checks the event's cpu maps, and discard the
> > > > > incompatible cpu maps.
> > > > > event's cpu maps is saved in evsel->cpus during option parse.
> > > > > Then the evlist's cpu maps is created in
> > > > > perf_evlist__create_maps. So the cpu maps can be check and re-
> organized in perf_evlist__create_maps.
> > > > > Only cpu_list need to check the cpu maps.
> > > >
> > > > Humm, I had something done in this area...
> > > >
> > > > Stephane complained about the confusion about which cpumap to
> use
> > > with
> > > > pmus, so I wrote a patch and sent an RFC, which I think I got no
> > > > comments, lemme dig it...
> > >
> > > Here it is, can you take a look? Stephane?
> > >
> >
> > Your patch is more like my 3/5 patch. The difference is your patch
> > force the evsel->cpus = evlist->cpus, if evsel->cpus == NULL.
> > My patch handle the evsel->cpus == NULL case when using it.
> 
> Idea is to use evsel->cpus always, not having to special case it and fallback
> to evlist->cpus, so that we don't have to pass evlist around that often.
> 
> > > @@ -1216,8 +1206,8 @@ static void print_aggr(char *prefix)
> > >  		evlist__for_each(evsel_list, counter) {
> > >  			val = ena = run = 0;
> > >  			nr = 0;
> > > -			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> > > cpu++) {
> > > -				cpu2 = perf_evsel__cpus(counter)-
> > > >map[cpu];
> > > +			for (cpu = 0; cpu < cpu_map__nr(counter->cpus);
> > > cpu++) {
> > > +				cpu2 = counter->cpus->map[cpu];
> > >  				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> > >  				if (s2 != id)
> > >  					continue;
> >
> > print_aggr also need to be special handled. In the past, all events
> > use evlist's cpu map,so it uses index to find the real cpu id.
> > Now, event's cpu map are different. The s2 could be wrong.
> > For example, evlist's cpu map is 0,4,5,18. Event's cpu map could be 0,18.
> > When cpu == 1, the return of aggr_get_id must be wrong, since it still
> > use index to find s2.
> > My 3/5 patch introduce a function perf_evsel__get_cpumap_index to
> > handle it.
> >
> > Only your patch is not enough, we still need 2/5 and 4/5.
> > 2/5 is used to check if the event's cpu maps are compatible as
> > evlist's cpu map. For example, evlist's cpu map is 1,2,17. Event's cpu
> > map could be 0,18. We can error out earlier.
> > 4/5 is used to special handle the open and mmap. We need to do the
> > same thing as what we did in print_aggr.
> 
> I'll try to go thru this tomorrow, thanks for checking.

Hi Arnaldo,

Have you had a chance to check the patch?

Thanks,
Kan



> 
> - Arnaldo
