linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
@ 2015-07-02  7:08 kan.liang
  2015-07-02  7:08 ` [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps kan.liang
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: kan.liang @ 2015-07-02  7:08 UTC (permalink / raw)
  To: acme, jolsa
  Cc: ak, namhyung, eranian, adrian.hunter, dsahern, a.p.zijlstra,
	mingo, linux-kernel, Kan Liang

From: Kan Liang <kan.liang@intel.com>

print_aggr fails to print per-core/per-socket statistics after commit
582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
if events have differnt cpus. Because in print_aggr, aggr_get_id needs
index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be
used to get aggregated id.

Here is an example.
Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
cpumask 0,18)

"perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core
sleep 2"

Without this patch, it failes to get CPU 18 result.
 Performance counter stats for 'CPU(s) 0,18':

S0-C0           1            7526851      cycles
S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
S1-C0           0      <not counted>      cycles
S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/

With this patch, it can get both CPU0 and CPU18 result.
 Performance counter stats for 'CPU(s) 0,18':

S0-C0           1            6327768      cycles
S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
S1-C0           1             330228      cycles
S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1
 - Modify changelog
 - Remove perf_evsel__get_cpumap_index 
 
 tools/perf/builtin-stat.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 37e301a..47c3c1f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -694,7 +694,7 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 static void print_aggr(char *prefix)
 {
 	struct perf_evsel *counter;
-	int cpu, cpu2, s, s2, id, nr;
+	int cpu, s, s2, id, nr;
 	double uval;
 	u64 ena, run, val;
 
@@ -707,8 +707,7 @@ static void print_aggr(char *prefix)
 			val = ena = run = 0;
 			nr = 0;
 			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
-				cpu2 = perf_evsel__cpus(counter)->map[cpu];
-				s2 = aggr_get_id(evsel_list->cpus, cpu2);
+				s2 = aggr_get_id(perf_evsel__cpus(counter), cpu);
 				if (s2 != id)
 					continue;
 				val += perf_counts(counter->counts, cpu, 0)->val;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps
  2015-07-02  7:08 [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr kan.liang
@ 2015-07-02  7:08 ` kan.liang
  2015-07-15 21:14   ` Liang, Kan
  2015-07-02  7:08 ` [PATCH V2 3/3] perf,tests: Add cpu_map tests kan.liang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: kan.liang @ 2015-07-02  7:08 UTC (permalink / raw)
  To: acme, jolsa
  Cc: ak, namhyung, eranian, adrian.hunter, dsahern, a.p.zijlstra,
	mingo, linux-kernel, Kan Liang

From: Kan Liang <kan.liang@intel.com>

Some PMU events have cpumask, e.g uncore events. The cpu list set by
user may be incompatible with event's cpumask.
This patch will check the user defined cpu list. If the incompatible cpu
is found, it will warn the user and discard the incompatible cpu. Only
available cpu can be stored in evsel->cpus->map. If there is no cpu from
cpu list compatible with event's cpumask. It will error out.

Here is an example.
According to cpumask, uncore should be default on CPU0 and CPU18.
So the S0-C1 for uncore should show "<not counted>".

Without this patch
 $ sudo ./perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18
--per-core sleep 2

 Performance counter stats for 'CPU(s) 0,1,18':

S0-C0           1            6749638      cycles
S0-C0           1               0.83 MiB  uncore_imc_0/cas_count_read/
(100.00%)
S0-C1           1             232421      cycles
S0-C1           1               0.83 MiB  uncore_imc_0/cas_count_read/
S1-C0           1             236997      cycles
S1-C0           1               0.35 MiB  uncore_imc_0/cas_count_read/

       2.001094019 seconds time elapsed

With this patch
 $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18 --per-core
sleep 2
event uncore_imc_0/cas_count_read/ can only be monitored on CPU 0 18.
Other CPUs will be discard.

 Performance counter stats for 'CPU(s) 0,1,18':

S0-C0           1            5557406      cycles
S0-C0           1               0.21 MiB  uncore_imc_0/cas_count_read/
S0-C1           1            1012534      cycles
S0-C1           0      <not counted> MiB  uncore_imc_0/cas_count_read/
S1-C0           1             916130      cycles
S1-C0           1               0.08 MiB  uncore_imc_0/cas_count_read/

       2.001110843 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@intel.com>
---

Changes since V1
 - Re-implement the way to find the common cpus
 - Add test case

 tools/perf/util/evlist.c | 72 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 67 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6cfdee6..d67414f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1101,6 +1101,66 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
 }
 
+static int cmp_ids(const void *a, const void *b)
+{
+	return *(int *)a - *(int *)b;
+}
+
+static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist, struct perf_evsel *evsel)
+{
+	const int ncpus = cpu_map__nr(evlist->cpus);
+	int i = 0, j = 0, cpu_nr = 0, tmp;
+
+	/* ensure we process id in increasing order */
+	qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
+
+	/* find the common cpus between evsel and evlist. */
+	while ((i < cpu_map__nr(evlist->cpus)) &&
+	       (j < cpu_map__nr(evsel->cpus))) {
+
+		if (evlist->cpus->map[i] == evsel->cpus->map[j]) {
+			cpu_nr++;
+			i++;
+			j++;
+		} else if (evlist->cpus->map[i] > evsel->cpus->map[j]) {
+			evsel->cpus->map[j++] = -1;
+		} else
+			i++;
+
+	}
+	while (j < cpu_map__nr(evsel->cpus))
+		evsel->cpus->map[j++] = -1;
+
+	if (cpu_nr == 0) {
+		pr_warning("event %s cannot be monitored on the given cpus."
+			   "Please check cpumask\n", evsel->name);
+		return -1;
+	}
+
+	if (ncpus > cpu_nr)
+		pr_warning("event %s can only be monitored on CPU", evsel->name);
+
+	/* order evsel cpus */
+	for (i = 0, tmp = 0; i < cpu_nr; i++) {
+		if (evsel->cpus->map[i] == -1) {
+			while (evsel->cpus->map[tmp] == -1) {
+				tmp++;
+				BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
+			}
+			evsel->cpus->map[i] = evsel->cpus->map[tmp];
+			evsel->cpus->map[tmp] = -1;
+		}
+		if (ncpus > cpu_nr)
+			pr_warning(" %d", evsel->cpus->map[i]);
+		tmp++;
+	}
+	evsel->cpus->nr = cpu_nr;
+	if (ncpus > cpu_nr)
+		pr_warning(". Other CPUs will be discard.\n");
+
+	return 0;
+}
+
 static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
 				       struct target *target)
 {
@@ -1108,13 +1168,15 @@ static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
 
 	evlist__for_each(evlist, evsel) {
 		/*
-		 * We already have cpus for evsel (via PMU sysfs) so
-		 * keep it, if there's no target cpu list defined.
+		 * We already have cpus for evsel (via PMU sysfs)
+		 * and target cpu list defined, check if they are
+		 * compatible. If not, discard incompatible cpus.
 		 */
-		if (evsel->cpus && target->cpu_list)
-			cpu_map__put(evsel->cpus);
+		if (evsel->cpus && target->cpu_list &&
+		    perf_evlist__check_evsel_cpus(evlist, evsel))
+			return -EINVAL;
 
-		if (!evsel->cpus || target->cpu_list)
+		if (!evsel->cpus)
 			evsel->cpus = cpu_map__get(evlist->cpus);
 
 		evsel->threads = thread_map__get(evlist->threads);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V2 3/3] perf,tests: Add cpu_map tests
  2015-07-02  7:08 [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr kan.liang
  2015-07-02  7:08 ` [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps kan.liang
@ 2015-07-02  7:08 ` kan.liang
  2015-07-02 16:12   ` Jiri Olsa
  2015-07-02 16:13 ` [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr Jiri Olsa
  2015-08-31  8:27 ` [tip:perf/urgent] perf stat: Get " tip-bot for Kan Liang
  3 siblings, 1 reply; 14+ messages in thread
From: kan.liang @ 2015-07-02  7:08 UTC (permalink / raw)
  To: acme, jolsa
  Cc: ak, namhyung, eranian, adrian.hunter, dsahern, a.p.zijlstra,
	mingo, linux-kernel, Kan Liang

From: Kan Liang <kan.liang@intel.com>

Adding cpu_map tests to check evlist and evsel cpu map

Signed-off-by: Kan Liang <kan.liang@intel.com>
---
 tools/perf/tests/Build          |   1 +
 tools/perf/tests/builtin-test.c |   4 ++
 tools/perf/tests/cpu-map.c      | 113 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |   1 +
 4 files changed, 119 insertions(+)
 create mode 100644 tools/perf/tests/cpu-map.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index d20d6e6..abc848e 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -32,6 +32,7 @@ perf-y += sample-parsing.o
 perf-y += parse-no-sample-id-all.o
 perf-y += kmod-path.o
 perf-y += thread-map.o
+perf-y += cpu-map.o
 
 perf-$(CONFIG_X86) += perf-time-to-tsc.o
 
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index c1dde73..5f279e1 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -175,6 +175,10 @@ static struct test {
 		.func = test__thread_map,
 	},
 	{
+		.desc = "Test cpu map",
+		.func = test__cpu_map,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/cpu-map.c b/tools/perf/tests/cpu-map.c
new file mode 100644
index 0000000..fa69aca
--- /dev/null
+++ b/tools/perf/tests/cpu-map.c
@@ -0,0 +1,113 @@
+#include <sys/types.h>
+#include <unistd.h>
+#include "evsel.h"
+#include "evlist.h"
+#include "tests.h"
+#include "debug.h"
+
+static int do_cpumap(struct target *target,
+		     struct perf_evlist *evlist,
+		      const char *cpu_list)
+{
+	struct perf_evsel *evsel;
+
+	evsel = zalloc(sizeof(struct perf_evsel));
+	if (evsel == NULL) {
+		pr_debug("failed to alloc evsel\n");
+		return -1;
+	}
+	perf_evlist__add(evlist, evsel);
+	if (cpu_list != NULL)
+		evsel->cpus = cpu_map__new(cpu_list);
+	else
+		evsel->cpus = NULL;
+
+	if (perf_evlist__create_maps(evlist, target) < 0)
+		return -1;
+
+	return 0;
+}
+
+int test__cpu_map(void)
+{
+	struct perf_evlist *evlist;
+	struct perf_evsel *evsel;
+	struct target target = {
+		.uid = UINT_MAX,
+	};
+	int i, err;
+
+	/* system wide */
+	evlist = perf_evlist__new();
+	target.system_wide = true;
+	err = do_cpumap(&target, evlist, NULL);
+	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
+
+	evsel = perf_evlist__first(evlist);
+	TEST_ASSERT_VAL("wrong number of cpu", cpu_map__nr(evlist->cpus) == cpu_map__nr(evsel->cpus));
+	for (i = 0; i < cpu_map__nr(evsel->cpus); i++)
+		TEST_ASSERT_VAL("wrong cpu map", evlist->cpus->map[i] == evsel->cpus->map[i]);
+	perf_evlist__delete(evlist);
+
+
+	/* system wide + evsel cpumask */
+	evlist = perf_evlist__new();
+	target.system_wide = true;
+	err = do_cpumap(&target, evlist, "0");
+
+	evsel = perf_evlist__first(evlist);
+	TEST_ASSERT_VAL("wrong number of cpu", 1 == cpu_map__nr(evsel->cpus));
+	TEST_ASSERT_VAL("wrong cpu map", 0 == evsel->cpus->map[0]);
+	perf_evlist__delete(evlist);
+
+
+	/* defined cpu list */
+	evlist = perf_evlist__new();
+	target.cpu_list = "0,1,18,26";
+	err = do_cpumap(&target, evlist, NULL);
+	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
+
+	evsel = perf_evlist__first(evlist);
+	TEST_ASSERT_VAL("wrong number of cpu", 4 == cpu_map__nr(evlist->cpus));
+	TEST_ASSERT_VAL("wrong number of cpu", cpu_map__nr(evlist->cpus) == cpu_map__nr(evsel->cpus));
+	for (i = 0; i < cpu_map__nr(evsel->cpus); i++)
+		TEST_ASSERT_VAL("wrong cpu map", evlist->cpus->map[i] == evsel->cpus->map[i]);
+	perf_evlist__delete(evlist);
+
+
+	/* defined cpu list + evsel cpumask (all match) */
+	evlist = perf_evlist__new();
+	target.cpu_list = "0,1,18,26";
+	err = do_cpumap(&target, evlist, "1,18");
+	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
+
+	evsel = perf_evlist__first(evlist);
+	TEST_ASSERT_VAL("wrong number of cpu", 4 == cpu_map__nr(evlist->cpus));
+	TEST_ASSERT_VAL("wrong number of cpu", 2 == cpu_map__nr(evsel->cpus));
+	TEST_ASSERT_VAL("wrong cpu map", 1 == evsel->cpus->map[0]);
+	TEST_ASSERT_VAL("wrong cpu map", 18 == evsel->cpus->map[1]);
+	perf_evlist__delete(evlist);
+
+
+	/* defined cpu list + evsel cpumask (partial match) */
+	evlist = perf_evlist__new();
+	target.cpu_list = "0,1,18,26";
+	err = do_cpumap(&target, evlist, "2,18,30");
+	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
+
+	evsel = perf_evlist__first(evlist);
+	TEST_ASSERT_VAL("wrong number of cpu", 4 == cpu_map__nr(evlist->cpus));
+	TEST_ASSERT_VAL("wrong number of cpu", 1 == cpu_map__nr(evsel->cpus));
+	TEST_ASSERT_VAL("wrong cpu map", 18 == evsel->cpus->map[0]);
+	perf_evlist__delete(evlist);
+
+
+	/* defined cpu list + evsel cpumask (not match) */
+	evlist = perf_evlist__new();
+	target.cpu_list = "0,1,18,26";
+	err = do_cpumap(&target, evlist, "2,9,30");
+	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 != err);
+	perf_evlist__delete(evlist);
+
+	return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index ebb47d9..cb6b994 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -62,6 +62,7 @@ int test__fdarray__filter(void);
 int test__fdarray__add(void);
 int test__kmod_path__parse(void);
 int test__thread_map(void);
+int test__cpu_map(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 3/3] perf,tests: Add cpu_map tests
  2015-07-02  7:08 ` [PATCH V2 3/3] perf,tests: Add cpu_map tests kan.liang
@ 2015-07-02 16:12   ` Jiri Olsa
  2015-07-02 16:58     ` Liang, Kan
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2015-07-02 16:12 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, jolsa, ak, namhyung, eranian, adrian.hunter, dsahern,
	a.p.zijlstra, mingo, linux-kernel

On Thu, Jul 02, 2015 at 03:08:45AM -0400, kan.liang@intel.com wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> Adding cpu_map tests to check evlist and evsel cpu map


the test needs to be quiet (unless -v is specified)..

38: Test cpu map                                             :event (null) can only be monitored on CPU 1 18. Other CPUs will be discard.
event (null) can only be monitored on CPU 18. Other CPUs will be discard.
event (null) cannot be monitored on the given cpus.Please check cpumask
 Ok

jirka

> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
>  tools/perf/tests/Build          |   1 +
>  tools/perf/tests/builtin-test.c |   4 ++
>  tools/perf/tests/cpu-map.c      | 113 ++++++++++++++++++++++++++++++++++++++++
>  tools/perf/tests/tests.h        |   1 +
>  4 files changed, 119 insertions(+)
>  create mode 100644 tools/perf/tests/cpu-map.c
> 
> diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
> index d20d6e6..abc848e 100644
> --- a/tools/perf/tests/Build
> +++ b/tools/perf/tests/Build
> @@ -32,6 +32,7 @@ perf-y += sample-parsing.o
>  perf-y += parse-no-sample-id-all.o
>  perf-y += kmod-path.o
>  perf-y += thread-map.o
> +perf-y += cpu-map.o
>  
>  perf-$(CONFIG_X86) += perf-time-to-tsc.o
>  
> diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
> index c1dde73..5f279e1 100644
> --- a/tools/perf/tests/builtin-test.c
> +++ b/tools/perf/tests/builtin-test.c
> @@ -175,6 +175,10 @@ static struct test {
>  		.func = test__thread_map,
>  	},
>  	{
> +		.desc = "Test cpu map",
> +		.func = test__cpu_map,
> +	},
> +	{
>  		.func = NULL,
>  	},
>  };
> diff --git a/tools/perf/tests/cpu-map.c b/tools/perf/tests/cpu-map.c
> new file mode 100644
> index 0000000..fa69aca
> --- /dev/null
> +++ b/tools/perf/tests/cpu-map.c
> @@ -0,0 +1,113 @@
> +#include <sys/types.h>
> +#include <unistd.h>
> +#include "evsel.h"
> +#include "evlist.h"
> +#include "tests.h"
> +#include "debug.h"
> +
> +static int do_cpumap(struct target *target,
> +		     struct perf_evlist *evlist,
> +		      const char *cpu_list)
> +{
> +	struct perf_evsel *evsel;
> +
> +	evsel = zalloc(sizeof(struct perf_evsel));
> +	if (evsel == NULL) {
> +		pr_debug("failed to alloc evsel\n");
> +		return -1;
> +	}
> +	perf_evlist__add(evlist, evsel);
> +	if (cpu_list != NULL)
> +		evsel->cpus = cpu_map__new(cpu_list);
> +	else
> +		evsel->cpus = NULL;
> +
> +	if (perf_evlist__create_maps(evlist, target) < 0)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +int test__cpu_map(void)
> +{
> +	struct perf_evlist *evlist;
> +	struct perf_evsel *evsel;
> +	struct target target = {
> +		.uid = UINT_MAX,
> +	};
> +	int i, err;
> +
> +	/* system wide */
> +	evlist = perf_evlist__new();
> +	target.system_wide = true;
> +	err = do_cpumap(&target, evlist, NULL);
> +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> +
> +	evsel = perf_evlist__first(evlist);
> +	TEST_ASSERT_VAL("wrong number of cpu", cpu_map__nr(evlist->cpus) == cpu_map__nr(evsel->cpus));
> +	for (i = 0; i < cpu_map__nr(evsel->cpus); i++)
> +		TEST_ASSERT_VAL("wrong cpu map", evlist->cpus->map[i] == evsel->cpus->map[i]);
> +	perf_evlist__delete(evlist);
> +
> +
> +	/* system wide + evsel cpumask */
> +	evlist = perf_evlist__new();
> +	target.system_wide = true;
> +	err = do_cpumap(&target, evlist, "0");
> +
> +	evsel = perf_evlist__first(evlist);
> +	TEST_ASSERT_VAL("wrong number of cpu", 1 == cpu_map__nr(evsel->cpus));
> +	TEST_ASSERT_VAL("wrong cpu map", 0 == evsel->cpus->map[0]);
> +	perf_evlist__delete(evlist);
> +
> +
> +	/* defined cpu list */
> +	evlist = perf_evlist__new();
> +	target.cpu_list = "0,1,18,26";
> +	err = do_cpumap(&target, evlist, NULL);
> +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> +
> +	evsel = perf_evlist__first(evlist);
> +	TEST_ASSERT_VAL("wrong number of cpu", 4 == cpu_map__nr(evlist->cpus));
> +	TEST_ASSERT_VAL("wrong number of cpu", cpu_map__nr(evlist->cpus) == cpu_map__nr(evsel->cpus));
> +	for (i = 0; i < cpu_map__nr(evsel->cpus); i++)
> +		TEST_ASSERT_VAL("wrong cpu map", evlist->cpus->map[i] == evsel->cpus->map[i]);
> +	perf_evlist__delete(evlist);
> +
> +
> +	/* defined cpu list + evsel cpumask (all match) */
> +	evlist = perf_evlist__new();
> +	target.cpu_list = "0,1,18,26";
> +	err = do_cpumap(&target, evlist, "1,18");
> +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> +
> +	evsel = perf_evlist__first(evlist);
> +	TEST_ASSERT_VAL("wrong number of cpu", 4 == cpu_map__nr(evlist->cpus));
> +	TEST_ASSERT_VAL("wrong number of cpu", 2 == cpu_map__nr(evsel->cpus));
> +	TEST_ASSERT_VAL("wrong cpu map", 1 == evsel->cpus->map[0]);
> +	TEST_ASSERT_VAL("wrong cpu map", 18 == evsel->cpus->map[1]);
> +	perf_evlist__delete(evlist);
> +
> +
> +	/* defined cpu list + evsel cpumask (partial match) */
> +	evlist = perf_evlist__new();
> +	target.cpu_list = "0,1,18,26";
> +	err = do_cpumap(&target, evlist, "2,18,30");
> +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> +
> +	evsel = perf_evlist__first(evlist);
> +	TEST_ASSERT_VAL("wrong number of cpu", 4 == cpu_map__nr(evlist->cpus));
> +	TEST_ASSERT_VAL("wrong number of cpu", 1 == cpu_map__nr(evsel->cpus));
> +	TEST_ASSERT_VAL("wrong cpu map", 18 == evsel->cpus->map[0]);
> +	perf_evlist__delete(evlist);
> +
> +
> +	/* defined cpu list + evsel cpumask (not match) */
> +	evlist = perf_evlist__new();
> +	target.cpu_list = "0,1,18,26";
> +	err = do_cpumap(&target, evlist, "2,9,30");
> +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 != err);
> +	perf_evlist__delete(evlist);
> +
> +	return 0;
> +}
> diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
> index ebb47d9..cb6b994 100644
> --- a/tools/perf/tests/tests.h
> +++ b/tools/perf/tests/tests.h
> @@ -62,6 +62,7 @@ int test__fdarray__filter(void);
>  int test__fdarray__add(void);
>  int test__kmod_path__parse(void);
>  int test__thread_map(void);
> +int test__cpu_map(void);
>  
>  #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__)
>  #ifdef HAVE_DWARF_UNWIND_SUPPORT
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-07-02  7:08 [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr kan.liang
  2015-07-02  7:08 ` [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps kan.liang
  2015-07-02  7:08 ` [PATCH V2 3/3] perf,tests: Add cpu_map tests kan.liang
@ 2015-07-02 16:13 ` Jiri Olsa
  2015-08-28 13:33   ` Liang, Kan
  2015-08-31  8:27 ` [tip:perf/urgent] perf stat: Get " tip-bot for Kan Liang
  3 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2015-07-02 16:13 UTC (permalink / raw)
  To: kan.liang
  Cc: acme, jolsa, ak, namhyung, eranian, adrian.hunter, dsahern,
	a.p.zijlstra, mingo, linux-kernel

On Thu, Jul 02, 2015 at 03:08:43AM -0400, kan.liang@intel.com wrote:
> From: Kan Liang <kan.liang@intel.com>
> 
> print_aggr fails to print per-core/per-socket statistics after commit
> 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
> if events have differnt cpus. Because in print_aggr, aggr_get_id needs
> index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be
> used to get aggregated id.
> 
> Here is an example.
> Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
> cpumask 0,18)
> 
> "perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core
> sleep 2"
> 
> Without this patch, it failes to get CPU 18 result.
>  Performance counter stats for 'CPU(s) 0,18':
> 
> S0-C0           1            7526851      cycles
> S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
> S1-C0           0      <not counted>      cycles
> S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/
> 
> With this patch, it can get both CPU0 and CPU18 result.
>  Performance counter stats for 'CPU(s) 0,18':
> 
> S0-C0           1            6327768      cycles
> S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
> S1-C0           1             330228      cycles
> S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

> ---
> 
> Changes since V1
>  - Modify changelog
>  - Remove perf_evsel__get_cpumap_index 
>  
>  tools/perf/builtin-stat.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 37e301a..47c3c1f 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -694,7 +694,7 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
>  static void print_aggr(char *prefix)
>  {
>  	struct perf_evsel *counter;
> -	int cpu, cpu2, s, s2, id, nr;
> +	int cpu, s, s2, id, nr;
>  	double uval;
>  	u64 ena, run, val;
>  
> @@ -707,8 +707,7 @@ static void print_aggr(char *prefix)
>  			val = ena = run = 0;
>  			nr = 0;
>  			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
> -				cpu2 = perf_evsel__cpus(counter)->map[cpu];
> -				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> +				s2 = aggr_get_id(perf_evsel__cpus(counter), cpu);
>  				if (s2 != id)
>  					continue;
>  				val += perf_counts(counter->counts, cpu, 0)->val;
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2 3/3] perf,tests: Add cpu_map tests
  2015-07-02 16:12   ` Jiri Olsa
@ 2015-07-02 16:58     ` Liang, Kan
  0 siblings, 0 replies; 14+ messages in thread
From: Liang, Kan @ 2015-07-02 16:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, ak, namhyung, eranian, Hunter, Adrian, dsahern,
	a.p.zijlstra, mingo, linux-kernel


> 
> On Thu, Jul 02, 2015 at 03:08:45AM -0400, kan.liang@intel.com wrote:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > Adding cpu_map tests to check evlist and evsel cpu map
> 
> 
> the test needs to be quiet (unless -v is specified)..

The test includes some cases to check error handling, which will print
some warnings. For example, the user setting is not compatible with
cpumask.
Should I remove the cases which checking error handling?

Thanks,
Kan

> 
> 38: Test cpu map                                             :event (null) can only be monitored
> on CPU 1 18. Other CPUs will be discard.
> event (null) can only be monitored on CPU 18. Other CPUs will be discard.
> event (null) cannot be monitored on the given cpus.Please check cpumask
> Ok
> 
> jirka
> 
> >
> > Signed-off-by: Kan Liang <kan.liang@intel.com>
> > ---
> >  tools/perf/tests/Build          |   1 +
> >  tools/perf/tests/builtin-test.c |   4 ++
> >  tools/perf/tests/cpu-map.c      | 113
> ++++++++++++++++++++++++++++++++++++++++
> >  tools/perf/tests/tests.h        |   1 +
> >  4 files changed, 119 insertions(+)
> >  create mode 100644 tools/perf/tests/cpu-map.c
> >
> > diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build index
> > d20d6e6..abc848e 100644
> > --- a/tools/perf/tests/Build
> > +++ b/tools/perf/tests/Build
> > @@ -32,6 +32,7 @@ perf-y += sample-parsing.o  perf-y +=
> > parse-no-sample-id-all.o  perf-y += kmod-path.o  perf-y +=
> > thread-map.o
> > +perf-y += cpu-map.o
> >
> >  perf-$(CONFIG_X86) += perf-time-to-tsc.o
> >
> > diff --git a/tools/perf/tests/builtin-test.c
> > b/tools/perf/tests/builtin-test.c index c1dde73..5f279e1 100644
> > --- a/tools/perf/tests/builtin-test.c
> > +++ b/tools/perf/tests/builtin-test.c
> > @@ -175,6 +175,10 @@ static struct test {
> >  		.func = test__thread_map,
> >  	},
> >  	{
> > +		.desc = "Test cpu map",
> > +		.func = test__cpu_map,
> > +	},
> > +	{
> >  		.func = NULL,
> >  	},
> >  };
> > diff --git a/tools/perf/tests/cpu-map.c b/tools/perf/tests/cpu-map.c
> > new file mode 100644 index 0000000..fa69aca
> > --- /dev/null
> > +++ b/tools/perf/tests/cpu-map.c
> > @@ -0,0 +1,113 @@
> > +#include <sys/types.h>
> > +#include <unistd.h>
> > +#include "evsel.h"
> > +#include "evlist.h"
> > +#include "tests.h"
> > +#include "debug.h"
> > +
> > +static int do_cpumap(struct target *target,
> > +		     struct perf_evlist *evlist,
> > +		      const char *cpu_list)
> > +{
> > +	struct perf_evsel *evsel;
> > +
> > +	evsel = zalloc(sizeof(struct perf_evsel));
> > +	if (evsel == NULL) {
> > +		pr_debug("failed to alloc evsel\n");
> > +		return -1;
> > +	}
> > +	perf_evlist__add(evlist, evsel);
> > +	if (cpu_list != NULL)
> > +		evsel->cpus = cpu_map__new(cpu_list);
> > +	else
> > +		evsel->cpus = NULL;
> > +
> > +	if (perf_evlist__create_maps(evlist, target) < 0)
> > +		return -1;
> > +
> > +	return 0;
> > +}
> > +
> > +int test__cpu_map(void)
> > +{
> > +	struct perf_evlist *evlist;
> > +	struct perf_evsel *evsel;
> > +	struct target target = {
> > +		.uid = UINT_MAX,
> > +	};
> > +	int i, err;
> > +
> > +	/* system wide */
> > +	evlist = perf_evlist__new();
> > +	target.system_wide = true;
> > +	err = do_cpumap(&target, evlist, NULL);
> > +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> > +
> > +	evsel = perf_evlist__first(evlist);
> > +	TEST_ASSERT_VAL("wrong number of cpu", cpu_map__nr(evlist-
> >cpus) == cpu_map__nr(evsel->cpus));
> > +	for (i = 0; i < cpu_map__nr(evsel->cpus); i++)
> > +		TEST_ASSERT_VAL("wrong cpu map", evlist->cpus->map[i]
> == evsel->cpus->map[i]);
> > +	perf_evlist__delete(evlist);
> > +
> > +
> > +	/* system wide + evsel cpumask */
> > +	evlist = perf_evlist__new();
> > +	target.system_wide = true;
> > +	err = do_cpumap(&target, evlist, "0");
> > +
> > +	evsel = perf_evlist__first(evlist);
> > +	TEST_ASSERT_VAL("wrong number of cpu", 1 ==
> cpu_map__nr(evsel->cpus));
> > +	TEST_ASSERT_VAL("wrong cpu map", 0 == evsel->cpus->map[0]);
> > +	perf_evlist__delete(evlist);
> > +
> > +
> > +	/* defined cpu list */
> > +	evlist = perf_evlist__new();
> > +	target.cpu_list = "0,1,18,26";
> > +	err = do_cpumap(&target, evlist, NULL);
> > +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> > +
> > +	evsel = perf_evlist__first(evlist);
> > +	TEST_ASSERT_VAL("wrong number of cpu", 4 ==
> cpu_map__nr(evlist->cpus));
> > +	TEST_ASSERT_VAL("wrong number of cpu", cpu_map__nr(evlist-
> >cpus) == cpu_map__nr(evsel->cpus));
> > +	for (i = 0; i < cpu_map__nr(evsel->cpus); i++)
> > +		TEST_ASSERT_VAL("wrong cpu map", evlist->cpus->map[i]
> == evsel->cpus->map[i]);
> > +	perf_evlist__delete(evlist);
> > +
> > +
> > +	/* defined cpu list + evsel cpumask (all match) */
> > +	evlist = perf_evlist__new();
> > +	target.cpu_list = "0,1,18,26";
> > +	err = do_cpumap(&target, evlist, "1,18");
> > +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> > +
> > +	evsel = perf_evlist__first(evlist);
> > +	TEST_ASSERT_VAL("wrong number of cpu", 4 ==
> cpu_map__nr(evlist->cpus));
> > +	TEST_ASSERT_VAL("wrong number of cpu", 2 ==
> cpu_map__nr(evsel->cpus));
> > +	TEST_ASSERT_VAL("wrong cpu map", 1 == evsel->cpus->map[0]);
> > +	TEST_ASSERT_VAL("wrong cpu map", 18 == evsel->cpus->map[1]);
> > +	perf_evlist__delete(evlist);
> > +
> > +
> > +	/* defined cpu list + evsel cpumask (partial match) */
> > +	evlist = perf_evlist__new();
> > +	target.cpu_list = "0,1,18,26";
> > +	err = do_cpumap(&target, evlist, "2,18,30");
> > +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 == err);
> > +
> > +	evsel = perf_evlist__first(evlist);
> > +	TEST_ASSERT_VAL("wrong number of cpu", 4 ==
> cpu_map__nr(evlist->cpus));
> > +	TEST_ASSERT_VAL("wrong number of cpu", 1 ==
> cpu_map__nr(evsel->cpus));
> > +	TEST_ASSERT_VAL("wrong cpu map", 18 == evsel->cpus->map[0]);
> > +	perf_evlist__delete(evlist);
> > +
> > +
> > +	/* defined cpu list + evsel cpumask (not match) */
> > +	evlist = perf_evlist__new();
> > +	target.cpu_list = "0,1,18,26";
> > +	err = do_cpumap(&target, evlist, "2,9,30");
> > +	TEST_ASSERT_VAL("wrong ret of do_cpumap", 0 != err);
> > +	perf_evlist__delete(evlist);
> > +
> > +	return 0;
> > +}
> > diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h index
> > ebb47d9..cb6b994 100644
> > --- a/tools/perf/tests/tests.h
> > +++ b/tools/perf/tests/tests.h
> > @@ -62,6 +62,7 @@ int test__fdarray__filter(void);  int
> > test__fdarray__add(void);  int test__kmod_path__parse(void);  int
> > test__thread_map(void);
> > +int test__cpu_map(void);
> >
> >  #if defined(__x86_64__) || defined(__i386__) || defined(__arm__) ||
> > defined(__aarch64__)  #ifdef HAVE_DWARF_UNWIND_SUPPORT
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps
  2015-07-02  7:08 ` [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps kan.liang
@ 2015-07-15 21:14   ` Liang, Kan
  0 siblings, 0 replies; 14+ messages in thread
From: Liang, Kan @ 2015-07-15 21:14 UTC (permalink / raw)
  To: eranian, acme, jolsa
  Cc: ak, namhyung, Hunter, Adrian, dsahern, a.p.zijlstra, mingo, linux-kernel

Hi Stephane,

Any comments about this patch?

Currently, perf stat -a and -C don't have consistent results for per-socket
uncore event.
With -a and --per-core, only first cpu of each socket show results. Other
cpus show "<not counted>".
But with -C and --per-core, all cpus have a result.  That's not correct. 
This patch only count the CPU in the cpumask and show "<not counted>"
for other CPUs. It only impacts -C.

Thanks,
Kan
> 
> From: Kan Liang <kan.liang@intel.com>
> 
> Some PMU events have cpumask, e.g uncore events. The cpu list set by
> user may be incompatible with event's cpumask.
> This patch will check the user defined cpu list. If the incompatible cpu is
> found, it will warn the user and discard the incompatible cpu. Only available
> cpu can be stored in evsel->cpus->map. If there is no cpu from cpu list
> compatible with event's cpumask. It will error out.
> 
> Here is an example.
> According to cpumask, uncore should be default on CPU0 and CPU18.
> So the S0-C1 for uncore should show "<not counted>".
> 
> Without this patch
>  $ sudo ./perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18 --per-
> core sleep 2
> 
>  Performance counter stats for 'CPU(s) 0,1,18':
> 
> S0-C0           1            6749638      cycles
> S0-C0           1               0.83 MiB  uncore_imc_0/cas_count_read/
> (100.00%)
> S0-C1           1             232421      cycles
> S0-C1           1               0.83 MiB  uncore_imc_0/cas_count_read/
> S1-C0           1             236997      cycles
> S1-C0           1               0.35 MiB  uncore_imc_0/cas_count_read/
> 
>        2.001094019 seconds time elapsed
> 
> With this patch
>  $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18 --per-core
> sleep 2 event uncore_imc_0/cas_count_read/ can only be monitored on
> CPU 0 18.
> Other CPUs will be discard.
> 
>  Performance counter stats for 'CPU(s) 0,1,18':
> 
> S0-C0           1            5557406      cycles
> S0-C0           1               0.21 MiB  uncore_imc_0/cas_count_read/
> S0-C1           1            1012534      cycles
> S0-C1           0      <not counted> MiB  uncore_imc_0/cas_count_read/
> S1-C0           1             916130      cycles
> S1-C0           1               0.08 MiB  uncore_imc_0/cas_count_read/
> 
>        2.001110843 seconds time elapsed
> 
> Signed-off-by: Kan Liang <kan.liang@intel.com>
> ---
> 
> Changes since V1
>  - Re-implement the way to find the common cpus
>  - Add test case
> 
>  tools/perf/util/evlist.c | 72
> ++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 67 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index
> 6cfdee6..d67414f 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1101,6 +1101,66 @@ int perf_evlist__mmap(struct perf_evlist *evlist,
> unsigned int pages,
>  	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);  }
> 
> +static int cmp_ids(const void *a, const void *b) {
> +	return *(int *)a - *(int *)b;
> +}
> +
> +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist,
> +struct perf_evsel *evsel) {
> +	const int ncpus = cpu_map__nr(evlist->cpus);
> +	int i = 0, j = 0, cpu_nr = 0, tmp;
> +
> +	/* ensure we process id in increasing order */
> +	qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> +
> +	/* find the common cpus between evsel and evlist. */
> +	while ((i < cpu_map__nr(evlist->cpus)) &&
> +	       (j < cpu_map__nr(evsel->cpus))) {
> +
> +		if (evlist->cpus->map[i] == evsel->cpus->map[j]) {
> +			cpu_nr++;
> +			i++;
> +			j++;
> +		} else if (evlist->cpus->map[i] > evsel->cpus->map[j]) {
> +			evsel->cpus->map[j++] = -1;
> +		} else
> +			i++;
> +
> +	}
> +	while (j < cpu_map__nr(evsel->cpus))
> +		evsel->cpus->map[j++] = -1;
> +
> +	if (cpu_nr == 0) {
> +		pr_warning("event %s cannot be monitored on the given
> cpus."
> +			   "Please check cpumask\n", evsel->name);
> +		return -1;
> +	}
> +
> +	if (ncpus > cpu_nr)
> +		pr_warning("event %s can only be monitored on CPU",
> evsel->name);
> +
> +	/* order evsel cpus */
> +	for (i = 0, tmp = 0; i < cpu_nr; i++) {
> +		if (evsel->cpus->map[i] == -1) {
> +			while (evsel->cpus->map[tmp] == -1) {
> +				tmp++;
> +				BUG_ON(tmp >= cpu_map__nr(evsel-
> >cpus));
> +			}
> +			evsel->cpus->map[i] = evsel->cpus->map[tmp];
> +			evsel->cpus->map[tmp] = -1;
> +		}
> +		if (ncpus > cpu_nr)
> +			pr_warning(" %d", evsel->cpus->map[i]);
> +		tmp++;
> +	}
> +	evsel->cpus->nr = cpu_nr;
> +	if (ncpus > cpu_nr)
> +		pr_warning(". Other CPUs will be discard.\n");
> +
> +	return 0;
> +}
> +
>  static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
>  				       struct target *target)
>  {
> @@ -1108,13 +1168,15 @@ static int perf_evlist__propagate_maps(struct
> perf_evlist *evlist,
> 
>  	evlist__for_each(evlist, evsel) {
>  		/*
> -		 * We already have cpus for evsel (via PMU sysfs) so
> -		 * keep it, if there's no target cpu list defined.
> +		 * We already have cpus for evsel (via PMU sysfs)
> +		 * and target cpu list defined, check if they are
> +		 * compatible. If not, discard incompatible cpus.
>  		 */
> -		if (evsel->cpus && target->cpu_list)
> -			cpu_map__put(evsel->cpus);
> +		if (evsel->cpus && target->cpu_list &&
> +		    perf_evlist__check_evsel_cpus(evlist, evsel))
> +			return -EINVAL;
> 
> -		if (!evsel->cpus || target->cpu_list)
> +		if (!evsel->cpus)
>  			evsel->cpus = cpu_map__get(evlist->cpus);
> 
>  		evsel->threads = thread_map__get(evlist->threads);
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-07-02 16:13 ` [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr Jiri Olsa
@ 2015-08-28 13:33   ` Liang, Kan
  2015-08-28 14:23     ` acme
  0 siblings, 1 reply; 14+ messages in thread
From: Liang, Kan @ 2015-08-28 13:33 UTC (permalink / raw)
  To: acme, Jiri Olsa
  Cc: jolsa, ak, namhyung, eranian, Hunter, Adrian, dsahern,
	a.p.zijlstra, mingo, linux-kernel



> On Thu, Jul 02, 2015 at 03:08:43AM -0400, kan.liang@intel.com wrote:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > print_aggr fails to print per-core/per-socket statistics after commit
> > 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore
> > events") if events have differnt cpus. Because in print_aggr,
> > aggr_get_id needs index (not cpu id) to find core/pkg id. Also, evsel
> > cpu maps should be used to get aggregated id.
> >
> > Here is an example.
> > Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event
> has
> > cpumask 0,18)
> >
> > "perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core
> > sleep 2"
> >
> > Without this patch, it failes to get CPU 18 result.
> >  Performance counter stats for 'CPU(s) 0,18':
> >
> > S0-C0           1            7526851      cycles
> > S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
> > S1-C0           0      <not counted>      cycles
> > S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/
> >
> > With this patch, it can get both CPU0 and CPU18 result.
> >  Performance counter stats for 'CPU(s) 0,18':
> >
> > S0-C0           1            6327768      cycles
> > S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
> > S1-C0           1             330228      cycles
> > S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/
> >
> > Signed-off-by: Kan Liang <kan.liang@intel.com>
> 
> Acked-by: Jiri Olsa <jolsa@kernel.org>
> 

Hi Arnaldo,

Could you please merge this patch?
This patch is to fix a bug of perf stat. It doesn't depend on
other patches of the patchset, and can be merged by itself.

Thanks,
Kan

> thanks,
> jirka
> 
> > ---
> >
> > Changes since V1
> >  - Modify changelog
> >  - Remove perf_evsel__get_cpumap_index
> >
> >  tools/perf/builtin-stat.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 37e301a..47c3c1f 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -694,7 +694,7 @@ static void abs_printout(int id, int nr, struct
> > perf_evsel *evsel, double avg)  static void print_aggr(char *prefix)
> > {
> >  	struct perf_evsel *counter;
> > -	int cpu, cpu2, s, s2, id, nr;
> > +	int cpu, s, s2, id, nr;
> >  	double uval;
> >  	u64 ena, run, val;
> >
> > @@ -707,8 +707,7 @@ static void print_aggr(char *prefix)
> >  			val = ena = run = 0;
> >  			nr = 0;
> >  			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> cpu++) {
> > -				cpu2 = perf_evsel__cpus(counter)-
> >map[cpu];
> > -				s2 = aggr_get_id(evsel_list->cpus, cpu2);
> > +				s2 =
> aggr_get_id(perf_evsel__cpus(counter), cpu);
> >  				if (s2 != id)
> >  					continue;
> >  				val += perf_counts(counter->counts, cpu,
> 0)->val;
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-08-28 13:33   ` Liang, Kan
@ 2015-08-28 14:23     ` acme
  2015-08-28 14:39       ` Liang, Kan
  0 siblings, 1 reply; 14+ messages in thread
From: acme @ 2015-08-28 14:23 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Jiri Olsa, jolsa, ak, namhyung, eranian, Hunter, Adrian, dsahern,
	a.p.zijlstra, mingo, linux-kernel

Em Fri, Aug 28, 2015 at 01:33:22PM +0000, Liang, Kan escreveu:
> > On Thu, Jul 02, 2015 at 03:08:43AM -0400, kan.liang@intel.com wrote:
> > > From: Kan Liang <kan.liang@intel.com>
> > > print_aggr fails to print per-core/per-socket statistics after commit
> > > 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore
> > > events") if events have differnt cpus. Because in print_aggr,
> > > aggr_get_id needs index (not cpu id) to find core/pkg id. Also, evsel
> > > cpu maps should be used to get aggregated id.
> > > Signed-off-by: Kan Liang <kan.liang@intel.com>

> > Acked-by: Jiri Olsa <jolsa@kernel.org>
 
> Hi Arnaldo,
 
> Could you please merge this patch?
> This patch is to fix a bug of perf stat. It doesn't depend on
> other patches of the patchset, and can be merged by itself.

Right, in such cases, please, make it clear against which branch this
should be applied, i.e. if this is a longstanding bug that needs to go
to perf/urgent, i.e. to the current merge window, ASAP, or if this
is for something that was introduced in the current development branch,
perf/core.

In this case it needs to go to perf/urgent, where it applies cleanly,
perf/core has extra stuff there that fuzzes a bit.

Also, since you know the cset where this bug was introduced, please
consider adding a "Fixes:" tag, commom everywhere in the kernel:

  [acme@zoo linux]$ git log | grep '^[ \t]\+Fixes:'  | wc -l
  3805

And we use it in tools/, sometime I add it while editing changelogs,
like in this patch:

  [acme@zoo linux]$ git log tools/ | grep '^[ \t]\+Fixes:' | head -5
    Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
    Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol lookup")
    Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking events")
    Fixes: d4957633bf9d ("perf report: Add infrastructure for a cycles histogram")
    Fixes: 75186a9b09e4 ("perf probe: Fix to show lines of sys_ functions correctly")
  [acme@zoo linux]$

- Arnaldo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-08-28 14:23     ` acme
@ 2015-08-28 14:39       ` Liang, Kan
  2015-08-28 14:45         ` Stephane Eranian
  2015-08-28 14:47         ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 14+ messages in thread
From: Liang, Kan @ 2015-08-28 14:39 UTC (permalink / raw)
  To: acme
  Cc: Jiri Olsa, jolsa, ak, namhyung, eranian, Hunter, Adrian, dsahern,
	a.p.zijlstra, mingo, linux-kernel

> 
> Em Fri, Aug 28, 2015 at 01:33:22PM +0000, Liang, Kan escreveu:
> > > On Thu, Jul 02, 2015 at 03:08:43AM -0400, kan.liang@intel.com wrote:
> > > > From: Kan Liang <kan.liang@intel.com> print_aggr fails to print
> > > > per-core/per-socket statistics after commit 582ec0829b3d ("perf
> > > > stat: Fix per-socket output bug for uncore
> > > > events") if events have differnt cpus. Because in print_aggr,
> > > > aggr_get_id needs index (not cpu id) to find core/pkg id. Also,
> > > > evsel cpu maps should be used to get aggregated id.
> > > > Signed-off-by: Kan Liang <kan.liang@intel.com>
> 
> > > Acked-by: Jiri Olsa <jolsa@kernel.org>
> 
> > Hi Arnaldo,
> 
> > Could you please merge this patch?
> > This patch is to fix a bug of perf stat. It doesn't depend on other
> > patches of the patchset, and can be merged by itself.
> 
> Right, in such cases, please, make it clear against which branch this should
> be applied, i.e. if this is a longstanding bug that needs to go to perf/urgent,
> i.e. to the current merge window, ASAP, or if this is for something that was
> introduced in the current development branch, perf/core.
> 
> In this case it needs to go to perf/urgent, where it applies cleanly,
> perf/core has extra stuff there that fuzzes a bit.
> 
> Also, since you know the cset where this bug was introduced, please
> consider adding a "Fixes:" tag, commom everywhere in the kernel:

OK. I think this patch needs to go to perf/urgent.
Should I re-send this patch with "Fixes:" tag and comments? 

Thanks,
Kan
> 
>   [acme@zoo linux]$ git log | grep '^[ \t]\+Fixes:'  | wc -l
>   3805
> 
> And we use it in tools/, sometime I add it while editing changelogs, like in
> this patch:
> 
>   [acme@zoo linux]$ git log tools/ | grep '^[ \t]\+Fixes:' | head -5
>     Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore
> events")
>     Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol
> lookup")
>     Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking
> events")
>     Fixes: d4957633bf9d ("perf report: Add infrastructure for a cycles
> histogram")
>     Fixes: 75186a9b09e4 ("perf probe: Fix to show lines of sys_ functions
> correctly")
>   [acme@zoo linux]$
> 
> - Arnaldo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-08-28 14:39       ` Liang, Kan
@ 2015-08-28 14:45         ` Stephane Eranian
  2015-08-28 14:47         ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2015-08-28 14:45 UTC (permalink / raw)
  To: Liang, Kan
  Cc: acme, Jiri Olsa, jolsa, ak, namhyung, Hunter, Adrian, dsahern,
	a.p.zijlstra, mingo, linux-kernel

On Fri, Aug 28, 2015 at 7:39 AM, Liang, Kan <kan.liang@intel.com> wrote:
>
> >
> > Em Fri, Aug 28, 2015 at 01:33:22PM +0000, Liang, Kan escreveu:
> > > > On Thu, Jul 02, 2015 at 03:08:43AM -0400, kan.liang@intel.com wrote:
> > > > > From: Kan Liang <kan.liang@intel.com> print_aggr fails to print
> > > > > per-core/per-socket statistics after commit 582ec0829b3d ("perf
> > > > > stat: Fix per-socket output bug for uncore
> > > > > events") if events have differnt cpus. Because in print_aggr,
> > > > > aggr_get_id needs index (not cpu id) to find core/pkg id. Also,
> > > > > evsel cpu maps should be used to get aggregated id.
> > > > > Signed-off-by: Kan Liang <kan.liang@intel.com>
> >
> > > > Acked-by: Jiri Olsa <jolsa@kernel.org>
> >
> > > Hi Arnaldo,
> >
> > > Could you please merge this patch?
> > > This patch is to fix a bug of perf stat. It doesn't depend on other
> > > patches of the patchset, and can be merged by itself.
> >
> > Right, in such cases, please, make it clear against which branch this should
> > be applied, i.e. if this is a longstanding bug that needs to go to perf/urgent,
> > i.e. to the current merge window, ASAP, or if this is for something that was
> > introduced in the current development branch, perf/core.
> >
> > In this case it needs to go to perf/urgent, where it applies cleanly,
> > perf/core has extra stuff there that fuzzes a bit.
> >
> > Also, since you know the cset where this bug was introduced, please
> > consider adding a "Fixes:" tag, commom everywhere in the kernel:
>
> OK. I think this patch needs to go to perf/urgent.
> Should I re-send this patch with "Fixes:" tag and comments?
>
Acked-by: Stephane Eranian <eranian@google.com>

>
> Thanks,
> Kan
> >
> >   [acme@zoo linux]$ git log | grep '^[ \t]\+Fixes:'  | wc -l
> >   3805
> >
> > And we use it in tools/, sometime I add it while editing changelogs, like in
> > this patch:
> >
> >   [acme@zoo linux]$ git log tools/ | grep '^[ \t]\+Fixes:' | head -5
> >     Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore
> > events")
> >     Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol
> > lookup")
> >     Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking
> > events")
> >     Fixes: d4957633bf9d ("perf report: Add infrastructure for a cycles
> > histogram")
> >     Fixes: 75186a9b09e4 ("perf probe: Fix to show lines of sys_ functions
> > correctly")
> >   [acme@zoo linux]$
> >
> > - Arnaldo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-08-28 14:39       ` Liang, Kan
  2015-08-28 14:45         ` Stephane Eranian
@ 2015-08-28 14:47         ` Arnaldo Carvalho de Melo
  2015-08-28 14:51           ` Liang, Kan
  1 sibling, 1 reply; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-28 14:47 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Jiri Olsa, jolsa, ak, namhyung, eranian, Hunter, Adrian, dsahern,
	a.p.zijlstra, Ingo Molnar, linux-kernel

Em Fri, Aug 28, 2015 at 02:39:21PM +0000, Liang, Kan escreveu:
> > In this case it needs to go to perf/urgent, where it applies
> > cleanly, perf/core has extra stuff there that fuzzes a bit.

> > Also, since you know the cset where this bug was introduced, please
> > consider adding a "Fixes:" tag, commom everywhere in the kernel:

> OK. I think this patch needs to go to perf/urgent.
> Should I re-send this patch with "Fixes:" tag and comments? 

No need, I did this already, its sitting in my perf/urgent branch now,
will go to Ingo today.

- Arnaldo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
  2015-08-28 14:47         ` Arnaldo Carvalho de Melo
@ 2015-08-28 14:51           ` Liang, Kan
  0 siblings, 0 replies; 14+ messages in thread
From: Liang, Kan @ 2015-08-28 14:51 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, jolsa, ak, namhyung, eranian, Hunter, Adrian, dsahern,
	a.p.zijlstra, Ingo Molnar, linux-kernel



> -----Original Message-----
> From: Arnaldo Carvalho de Melo [mailto:acme@kernel.org]
> Sent: Friday, August 28, 2015 10:47 AM
> To: Liang, Kan
> Cc: Jiri Olsa; jolsa@kernel.org; ak@linux.intel.com; namhyung@kernel.org;
> eranian@google.com; Hunter, Adrian; dsahern@gmail.com;
> a.p.zijlstra@chello.nl; Ingo Molnar; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr
> 
> Em Fri, Aug 28, 2015 at 02:39:21PM +0000, Liang, Kan escreveu:
> > > In this case it needs to go to perf/urgent, where it applies
> > > cleanly, perf/core has extra stuff there that fuzzes a bit.
> 
> > > Also, since you know the cset where this bug was introduced, please
> > > consider adding a "Fixes:" tag, commom everywhere in the kernel:
> 
> > OK. I think this patch needs to go to perf/urgent.
> > Should I re-send this patch with "Fixes:" tag and comments?
> 
> No need, I did this already, its sitting in my perf/urgent branch now, will go
> to Ingo today.
> 

Thanks.

Kan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tip:perf/urgent] perf stat: Get correct cpu id for print_aggr
  2015-07-02  7:08 [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr kan.liang
                   ` (2 preceding siblings ...)
  2015-07-02 16:13 ` [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr Jiri Olsa
@ 2015-08-31  8:27 ` tip-bot for Kan Liang
  3 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Kan Liang @ 2015-08-31  8:27 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: adrian.hunter, hpa, kan.liang, ak, dsahern, linux-kernel, tglx,
	a.p.zijlstra, jolsa, mingo, namhyung, acme, eranian

Commit-ID:  601083cffb7cabdcc55b8195d732f0f7028570fa
Gitweb:     http://git.kernel.org/tip/601083cffb7cabdcc55b8195d732f0f7028570fa
Author:     Kan Liang <kan.liang@intel.com>
AuthorDate: Thu, 2 Jul 2015 03:08:43 -0400
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 28 Aug 2015 11:49:52 -0300

perf stat: Get correct cpu id for print_aggr

print_aggr() fails to print per-core/per-socket statistics after commit
582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
if events have differnt cpus. Because in print_aggr(), aggr_get_id needs
index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be
used to get aggregated id.

Here is an example:

Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
cpumask 0,18)

  $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2

Without this patch, it failes to get CPU 18 result.

   Performance counter stats for 'CPU(s) 0,18':

  S0-C0           1            7526851      cycles
  S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
  S1-C0           0      <not counted>      cycles
  S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/

With this patch, it can get both CPU0 and CPU18 result.

   Performance counter stats for 'CPU(s) 0,18':

  S0-C0           1            6327768      cycles
  S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
  S1-C0           1             330228      cycles
  S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-stat.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d99d850..ef355fc 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -694,7 +694,7 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 static void print_aggr(char *prefix)
 {
 	struct perf_evsel *counter;
-	int cpu, cpu2, s, s2, id, nr;
+	int cpu, s, s2, id, nr;
 	double uval;
 	u64 ena, run, val;
 
@@ -707,8 +707,7 @@ static void print_aggr(char *prefix)
 			val = ena = run = 0;
 			nr = 0;
 			for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
-				cpu2 = perf_evsel__cpus(counter)->map[cpu];
-				s2 = aggr_get_id(evsel_list->cpus, cpu2);
+				s2 = aggr_get_id(perf_evsel__cpus(counter), cpu);
 				if (s2 != id)
 					continue;
 				val += perf_counts(counter->counts, cpu, 0)->val;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-08-31  8:28 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-02  7:08 [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr kan.liang
2015-07-02  7:08 ` [PATCH V2 2/3] perf,tools: check and re-organize evsel cpu maps kan.liang
2015-07-15 21:14   ` Liang, Kan
2015-07-02  7:08 ` [PATCH V2 3/3] perf,tests: Add cpu_map tests kan.liang
2015-07-02 16:12   ` Jiri Olsa
2015-07-02 16:58     ` Liang, Kan
2015-07-02 16:13 ` [PATCH V2 1/3] perf,tools: get correct cpu id for print_aggr Jiri Olsa
2015-08-28 13:33   ` Liang, Kan
2015-08-28 14:23     ` acme
2015-08-28 14:39       ` Liang, Kan
2015-08-28 14:45         ` Stephane Eranian
2015-08-28 14:47         ` Arnaldo Carvalho de Melo
2015-08-28 14:51           ` Liang, Kan
2015-08-31  8:27 ` [tip:perf/urgent] perf stat: Get " tip-bot for Kan Liang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).