All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] perf stat: Enable group read of counters
@ 2017-07-21 12:12 Jiri Olsa
  2017-07-21 12:12 ` [PATCH 1/4] perf tools: Add verbose output for sys_perf_event_open fallback Jiri Olsa
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Jiri Olsa @ 2017-07-21 12:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

hi,
sending changes to enable group read of perf counters
for perf stat command. It allows us to read whole group
of counters within single read syscall.

Also available in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/stat_group

Not sure why we haven't supported yet, but anyway it was
unavailable for some time due to a bug which was fixed
just recently via:
  ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")

thanks,
jirka


---
Jiri Olsa (4):
      perf tools: Add verbose output for sys_perf_event_open fallback
      perf tools: Add perf_evsel__read_size function
      perf tools: Add perf_evsel__read_counter function
      perf stat: Use group read for event groups

 tools/perf/builtin-stat.c |  30 +++++++++++++++++++---
 tools/perf/util/counts.h  |   1 +
 tools/perf/util/evsel.c   | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/evsel.h   |   2 ++
 tools/perf/util/stat.c    |   3 +++
 tools/perf/util/stat.h    |   5 ++--
 6 files changed, 182 insertions(+), 6 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] perf tools: Add verbose output for sys_perf_event_open fallback
  2017-07-21 12:12 [PATCH 0/4] perf stat: Enable group read of counters Jiri Olsa
@ 2017-07-21 12:12 ` Jiri Olsa
  2017-07-26 17:23   ` [tip:perf/core] perf evsel: " tip-bot for Jiri Olsa
  2017-07-21 12:12 ` [PATCH 2/4] perf tools: Add perf_evsel__read_size function Jiri Olsa
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2017-07-21 12:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Adding info about what is being switched off in
the sys_perf_event_open fallback.

New output (notice the 'switching off' lines):
  $ perf stat -e '{cycles,instructions}' -vvv ls
  Using CPUID GenuineIntel-6-3D
  intel_pt default config: tsc
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off cloexec flag
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
  sys_perf_event_open failed, error -22
  switching off exclude_guest, exclude_host
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
  sys_perf_event_open failed, error -22
  switching off sample_id_all
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
  ...

Link: http://lkml.kernel.org/n/tip-jbzet07v499dnc1umzj2ron2@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6dd069a41ac3..450b5fadf8cb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1671,31 +1671,39 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	 */
 	if (!perf_missing_features.write_backward && evsel->attr.write_backward) {
 		perf_missing_features.write_backward = true;
+		pr_debug2("switching off write_backward\n");
 		goto out_close;
 	} else if (!perf_missing_features.clockid_wrong && evsel->attr.use_clockid) {
 		perf_missing_features.clockid_wrong = true;
+		pr_debug2("switching off clockid\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.clockid && evsel->attr.use_clockid) {
 		perf_missing_features.clockid = true;
+		pr_debug2("switching off use_clockid\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
 		perf_missing_features.cloexec = true;
+		pr_debug2("switching off cloexec flag\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
+		pr_debug2("switching off mmap2\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
 		   (evsel->attr.exclude_guest || evsel->attr.exclude_host)) {
 		perf_missing_features.exclude_guest = true;
+		pr_debug2("switching off exclude_guest, exclude_host\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.sample_id_all) {
 		perf_missing_features.sample_id_all = true;
+		pr_debug2("switching off sample_id_all\n");
 		goto retry_sample_id;
 	} else if (!perf_missing_features.lbr_flags &&
 			(evsel->attr.branch_sample_type &
 			 (PERF_SAMPLE_BRANCH_NO_CYCLES |
 			  PERF_SAMPLE_BRANCH_NO_FLAGS))) {
 		perf_missing_features.lbr_flags = true;
+		pr_debug2("switching off branch sample type no (cycles/flags)\n");
 		goto fallback_missing_features;
 	}
 out_close:
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/4] perf tools: Add perf_evsel__read_size function
  2017-07-21 12:12 [PATCH 0/4] perf stat: Enable group read of counters Jiri Olsa
  2017-07-21 12:12 ` [PATCH 1/4] perf tools: Add verbose output for sys_perf_event_open fallback Jiri Olsa
@ 2017-07-21 12:12 ` Jiri Olsa
  2017-07-21 12:12 ` [PATCH 3/4] perf tools: Add perf_evsel__read_counter function Jiri Olsa
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Jiri Olsa @ 2017-07-21 12:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Currently we use the size of struct perf_counts_values
to read the event, which prevents us to put any new
member to the struct.

Adding perf_evsel__read_size to return size of the
buffer needed for event read.

Link: http://lkml.kernel.org/n/tip-cfc3dmil3tlzezzxtyi9f38m@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 450b5fadf8cb..4dd0fcc06db9 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1261,15 +1261,42 @@ void perf_counts_values__scale(struct perf_counts_values *count,
 		*pscaled = scaled;
 }
 
+static int perf_evsel__read_size(struct perf_evsel *evsel)
+{
+	u64 read_format = evsel->attr.read_format;
+	int entry = sizeof(u64); /* value */
+	int size = 0;
+	int nr = 1;
+
+	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+		size += sizeof(u64);
+
+	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+		size += sizeof(u64);
+
+	if (read_format & PERF_FORMAT_ID)
+		entry += sizeof(u64);
+
+	if (read_format & PERF_FORMAT_GROUP) {
+		nr = evsel->nr_members;
+		size += sizeof(u64);
+	}
+
+	size += entry * nr;
+	return size;
+}
+
 int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 		     struct perf_counts_values *count)
 {
+	size_t size = perf_evsel__read_size(evsel);
+
 	memset(count, 0, sizeof(*count));
 
 	if (FD(evsel, cpu, thread) < 0)
 		return -EINVAL;
 
-	if (readn(FD(evsel, cpu, thread), count, sizeof(*count)) <= 0)
+	if (readn(FD(evsel, cpu, thread), count->values, size) <= 0)
 		return -errno;
 
 	return 0;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/4] perf tools: Add perf_evsel__read_counter function
  2017-07-21 12:12 [PATCH 0/4] perf stat: Enable group read of counters Jiri Olsa
  2017-07-21 12:12 ` [PATCH 1/4] perf tools: Add verbose output for sys_perf_event_open fallback Jiri Olsa
  2017-07-21 12:12 ` [PATCH 2/4] perf tools: Add perf_evsel__read_size function Jiri Olsa
@ 2017-07-21 12:12 ` Jiri Olsa
  2017-07-26  1:41   ` Arnaldo Carvalho de Melo
  2017-07-21 12:12 ` [PATCH 4/4] perf stat: Use group read for event groups Jiri Olsa
  2017-07-21 17:07 ` [PATCH 0/4] perf stat: Enable group read of counters Arnaldo Carvalho de Melo
  4 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2017-07-21 12:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Adding perf_evsel__read_counter function to read single or
group counter. After calling this function the counter's
evsel::counts struct is filled with values for the counter
and member of its group if there are any.

Link: http://lkml.kernel.org/n/tip-itsuxdyt7rp4mvij1t6k7bcl@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.h |   2 +
 tools/perf/util/stat.c  |   3 ++
 tools/perf/util/stat.h  |   5 ++-
 4 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4dd0fcc06db9..89aecf3a35c7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1302,6 +1302,106 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 	return 0;
 }
 
+static int
+perf_evsel__read_one(struct perf_evsel *evsel, int cpu, int thread)
+{
+	struct perf_counts_values *count = perf_counts(evsel->counts, cpu, thread);
+
+	return perf_evsel__read(evsel, cpu, thread, count);
+}
+
+static void
+perf_evsel__set_count(struct perf_evsel *counter, int cpu, int thread,
+		      u64 val, u64 ena, u64 run)
+{
+	struct perf_counts_values *count;
+
+	count = perf_counts(counter->counts, cpu, thread);
+
+	count->val    = val;
+	count->ena    = ena;
+	count->run    = run;
+}
+
+static int
+perf_evsel__process_group_data(struct perf_evsel *leader,
+			       int cpu, int thread, u64 *data)
+{
+	u64 read_format = leader->attr.read_format;
+	struct sample_read_value *v;
+	u64 nr, ena = 0, run = 0, i;
+
+	nr = *data++;
+
+	if (nr != (u64) leader->nr_members)
+		return -EINVAL;
+
+	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+		ena = *data++;
+
+	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+		run = *data++;
+
+	v = (struct sample_read_value *) data;
+
+	perf_evsel__set_count(leader, cpu, thread,
+			      v[0].value, ena, run);
+
+	for (i = 1; i < nr; i++) {
+		struct perf_evsel *counter;
+
+		counter = perf_evlist__id2evsel(leader->evlist, v[i].id);
+		if (!counter)
+			return -EINVAL;
+
+		perf_evsel__set_count(counter, cpu, thread,
+				      v[i].value, ena, run);
+	}
+
+	return 0;
+}
+
+static int
+perf_evsel__read_group(struct perf_evsel *leader, int cpu, int thread)
+{
+	struct perf_stat_evsel *ps = leader->priv;
+	u64 read_format = leader->attr.read_format;
+	int size = perf_evsel__read_size(leader);
+	u64 *data = ps->group_data;
+
+	if (!(read_format & PERF_FORMAT_ID))
+		return -EINVAL;
+
+	if (!perf_evsel__is_group_leader(leader))
+		return -EINVAL;
+
+	if (!data) {
+		data = zalloc(size);
+		if (!data)
+			return -ENOMEM;
+
+		ps->group_data = data;
+	}
+
+	if (FD(leader, cpu, thread) < 0)
+		return -EINVAL;
+
+	if (readn(FD(leader, cpu, thread), data, size) <= 0)
+		return -errno;
+
+	return perf_evsel__process_group_data(leader, cpu, thread, data);
+}
+
+int perf_evsel__read_counter(struct perf_evsel *evsel, int cpu, int thread)
+{
+	u64 read_format = evsel->attr.read_format;
+
+	if (read_format & PERF_FORMAT_GROUP)
+		return perf_evsel__read_group(evsel, cpu, thread);
+	else
+		return perf_evsel__read_one(evsel, cpu, thread);
+}
+
 int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
 			      int cpu, int thread, bool scale)
 {
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index fb40ca3c6519..de03c18daaf0 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -299,6 +299,8 @@ static inline bool perf_evsel__match2(struct perf_evsel *e1,
 int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
 		     struct perf_counts_values *count);
 
+int perf_evsel__read_counter(struct perf_evsel *evsel, int cpu, int thread);
+
 int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
 			      int cpu, int thread, bool scale);
 
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 53b9a994a3dc..2c258554f94d 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -128,6 +128,9 @@ static int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel)
 
 static void perf_evsel__free_stat_priv(struct perf_evsel *evsel)
 {
+	struct perf_stat_evsel *ps = evsel->priv;
+
+	free(ps->group_data);
 	zfree(&evsel->priv);
 }
 
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 7522bf10b03e..eacaf958e19d 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -28,8 +28,9 @@ enum perf_stat_evsel_id {
 };
 
 struct perf_stat_evsel {
-	struct stats		res_stats[3];
-	enum perf_stat_evsel_id	id;
+	struct stats		 res_stats[3];
+	enum perf_stat_evsel_id	 id;
+	u64			*group_data;
 };
 
 enum aggr_mode {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/4] perf stat: Use group read for event groups
  2017-07-21 12:12 [PATCH 0/4] perf stat: Enable group read of counters Jiri Olsa
                   ` (2 preceding siblings ...)
  2017-07-21 12:12 ` [PATCH 3/4] perf tools: Add perf_evsel__read_counter function Jiri Olsa
@ 2017-07-21 12:12 ` Jiri Olsa
  2017-07-23  0:53   ` Namhyung Kim
  2017-07-21 17:07 ` [PATCH 0/4] perf stat: Enable group read of counters Arnaldo Carvalho de Melo
  4 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2017-07-21 12:12 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Make perf stat use  group read if there  are groups
defined. The group read will get the values for all
member of groups within a single syscall instead of
calling read syscall for every event.

We can see considerable less amount of kernel cycles
spent on single group read, than reading each event
separately, like for following perf stat command:

  # perf stat -e {cycles,instructions} -I 10 -a sleep 1

Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'"

Before:

        24,325,676      cycles:u
       297,040,775      cycles:k

       1.038554134 seconds time elapsed

After:
        25,034,418      cycles:u
       158,256,395      cycles:k

       1.036864497 seconds time elapsed

The perf_evsel__open fallback changes contributed by Andi Kleen.

Link: http://lkml.kernel.org/n/tip-b6g8qarwvptr81cqdtfst59u@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/builtin-stat.c | 30 +++++++++++++++++++++++++++---
 tools/perf/util/counts.h  |  1 +
 tools/perf/util/evsel.c   | 10 ++++++++++
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 48ac53b199fc..866da7aa54bf 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -213,10 +213,20 @@ static void perf_stat__reset_stats(void)
 static int create_perf_stat_counter(struct perf_evsel *evsel)
 {
 	struct perf_event_attr *attr = &evsel->attr;
+	struct perf_evsel *leader = evsel->leader;
 
-	if (stat_config.scale)
+	if (stat_config.scale) {
 		attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
 				    PERF_FORMAT_TOTAL_TIME_RUNNING;
+	}
+
+	/*
+	 * The event is part of non trivial group, let's enable
+	 * the group read (for leader) and ID retrieval for all
+	 * members.
+	 */
+	if (leader->nr_members > 1)
+		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
 
 	attr->inherit = !no_inherit;
 
@@ -333,13 +343,21 @@ static int read_counter(struct perf_evsel *counter)
 			struct perf_counts_values *count;
 
 			count = perf_counts(counter->counts, cpu, thread);
-			if (perf_evsel__read(counter, cpu, thread, count)) {
+
+			/*
+			 * The leader's group read loads data into its group members
+			 * (via perf_evsel__read_counter) and sets threir count->loaded.
+			 */
+			if (!count->loaded &&
+			    perf_evsel__read_counter(counter, cpu, thread)) {
 				counter->counts->scaled = -1;
 				perf_counts(counter->counts, cpu, thread)->ena = 0;
 				perf_counts(counter->counts, cpu, thread)->run = 0;
 				return -1;
 			}
 
+			count->loaded = false;
+
 			if (STAT_RECORD) {
 				if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
 					pr_err("failed to write stat event\n");
@@ -559,6 +577,11 @@ static int store_counter_ids(struct perf_evsel *counter)
 	return __store_counter_ids(counter, cpus, threads);
 }
 
+static bool perf_evsel__should_store_id(struct perf_evsel *counter)
+{
+	return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
+}
+
 static int __run_perf_stat(int argc, const char **argv)
 {
 	int interval = stat_config.interval;
@@ -631,7 +654,8 @@ static int __run_perf_stat(int argc, const char **argv)
 		if (l > unit_width)
 			unit_width = l;
 
-		if (STAT_RECORD && store_counter_ids(counter))
+		if (perf_evsel__should_store_id(counter) &&
+		    store_counter_ids(counter))
 			return -1;
 	}
 
diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
index 34d8baaf558a..cb45a6aecf9d 100644
--- a/tools/perf/util/counts.h
+++ b/tools/perf/util/counts.h
@@ -12,6 +12,7 @@ struct perf_counts_values {
 		};
 		u64 values[3];
 	};
+	bool	loaded;
 };
 
 struct perf_counts {
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 89aecf3a35c7..3735c9e0080d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -49,6 +49,7 @@ static struct {
 	bool clockid_wrong;
 	bool lbr_flags;
 	bool write_backward;
+	bool group_read;
 } perf_missing_features;
 
 static clockid_t clockid;
@@ -1321,6 +1322,7 @@ perf_evsel__set_count(struct perf_evsel *counter, int cpu, int thread,
 	count->val    = val;
 	count->ena    = ena;
 	count->run    = run;
+	count->loaded = true;
 }
 
 static int
@@ -1677,6 +1679,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 	if (perf_missing_features.lbr_flags)
 		evsel->attr.branch_sample_type &= ~(PERF_SAMPLE_BRANCH_NO_FLAGS |
 				     PERF_SAMPLE_BRANCH_NO_CYCLES);
+	if (perf_missing_features.group_read && evsel->attr.inherit)
+		evsel->attr.read_format &= ~(PERF_FORMAT_GROUP|PERF_FORMAT_ID);
 retry_sample_id:
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
@@ -1832,6 +1836,12 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		perf_missing_features.lbr_flags = true;
 		pr_debug2("switching off branch sample type no (cycles/flags)\n");
 		goto fallback_missing_features;
+	} else if (!perf_missing_features.group_read &&
+		    evsel->attr.inherit &&
+		   (evsel->attr.read_format & PERF_FORMAT_GROUP)) {
+		perf_missing_features.group_read = true;
+		pr_debug2("switching off group read\n");
+		goto fallback_missing_features;
 	}
 out_close:
 	do {
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/4] perf stat: Enable group read of counters
  2017-07-21 12:12 [PATCH 0/4] perf stat: Enable group read of counters Jiri Olsa
                   ` (3 preceding siblings ...)
  2017-07-21 12:12 ` [PATCH 4/4] perf stat: Use group read for event groups Jiri Olsa
@ 2017-07-21 17:07 ` Arnaldo Carvalho de Melo
  4 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-07-21 17:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Em Fri, Jul 21, 2017 at 02:12:08PM +0200, Jiri Olsa escreveu:
> hi,
> sending changes to enable group read of perf counters
> for perf stat command. It allows us to read whole group
> of counters within single read syscall.
> 
> Also available in here:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/stat_group
> 
> Not sure why we haven't supported yet, but anyway it was
> unavailable for some time due to a bug which was fixed
> just recently via:
>   ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")

Thanks, applied.

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] perf stat: Use group read for event groups
  2017-07-21 12:12 ` [PATCH 4/4] perf stat: Use group read for event groups Jiri Olsa
@ 2017-07-23  0:53   ` Namhyung Kim
  2017-07-24  7:31     ` Jiri Olsa
  0 siblings, 1 reply; 12+ messages in thread
From: Namhyung Kim @ 2017-07-23  0:53 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, lkml, Ingo Molnar, Peter Zijlstra,
	Alexander Shishkin, David Ahern, Andi Kleen, kernel-team

On Fri, Jul 21, 2017 at 02:12:12PM +0200, Jiri Olsa wrote:
> Make perf stat use  group read if there  are groups
> defined. The group read will get the values for all
> member of groups within a single syscall instead of
> calling read syscall for every event.
> 
> We can see considerable less amount of kernel cycles
> spent on single group read, than reading each event
> separately, like for following perf stat command:
> 
>   # perf stat -e {cycles,instructions} -I 10 -a sleep 1
> 
> Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'"
> 
> Before:
> 
>         24,325,676      cycles:u
>        297,040,775      cycles:k
> 
>        1.038554134 seconds time elapsed
> 
> After:
>         25,034,418      cycles:u
>        158,256,395      cycles:k
> 
>        1.036864497 seconds time elapsed
> 
> The perf_evsel__open fallback changes contributed by Andi Kleen.
> 
> Link: http://lkml.kernel.org/n/tip-b6g8qarwvptr81cqdtfst59u@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/builtin-stat.c | 30 +++++++++++++++++++++++++++---
>  tools/perf/util/counts.h  |  1 +
>  tools/perf/util/evsel.c   | 10 ++++++++++
>  3 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 48ac53b199fc..866da7aa54bf 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -213,10 +213,20 @@ static void perf_stat__reset_stats(void)
>  static int create_perf_stat_counter(struct perf_evsel *evsel)
>  {
>  	struct perf_event_attr *attr = &evsel->attr;
> +	struct perf_evsel *leader = evsel->leader;
>  
> -	if (stat_config.scale)
> +	if (stat_config.scale) {
>  		attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
>  				    PERF_FORMAT_TOTAL_TIME_RUNNING;
> +	}
> +
> +	/*
> +	 * The event is part of non trivial group, let's enable
> +	 * the group read (for leader) and ID retrieval for all
> +	 * members.
> +	 */
> +	if (leader->nr_members > 1)
> +		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;

I just wonder ID is really necessary.  Doesn't it have same order we
can traverse with the for_each_group_member()?

Thanks,
Namhyung


>  
>  	attr->inherit = !no_inherit;
>  
> @@ -333,13 +343,21 @@ static int read_counter(struct perf_evsel *counter)
>  			struct perf_counts_values *count;
>  
>  			count = perf_counts(counter->counts, cpu, thread);
> -			if (perf_evsel__read(counter, cpu, thread, count)) {
> +
> +			/*
> +			 * The leader's group read loads data into its group members
> +			 * (via perf_evsel__read_counter) and sets threir count->loaded.
> +			 */
> +			if (!count->loaded &&
> +			    perf_evsel__read_counter(counter, cpu, thread)) {
>  				counter->counts->scaled = -1;
>  				perf_counts(counter->counts, cpu, thread)->ena = 0;
>  				perf_counts(counter->counts, cpu, thread)->run = 0;
>  				return -1;
>  			}
>  
> +			count->loaded = false;
> +
>  			if (STAT_RECORD) {
>  				if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
>  					pr_err("failed to write stat event\n");
> @@ -559,6 +577,11 @@ static int store_counter_ids(struct perf_evsel *counter)
>  	return __store_counter_ids(counter, cpus, threads);
>  }
>  
> +static bool perf_evsel__should_store_id(struct perf_evsel *counter)
> +{
> +	return STAT_RECORD || counter->attr.read_format & PERF_FORMAT_ID;
> +}
> +
>  static int __run_perf_stat(int argc, const char **argv)
>  {
>  	int interval = stat_config.interval;
> @@ -631,7 +654,8 @@ static int __run_perf_stat(int argc, const char **argv)
>  		if (l > unit_width)
>  			unit_width = l;
>  
> -		if (STAT_RECORD && store_counter_ids(counter))
> +		if (perf_evsel__should_store_id(counter) &&
> +		    store_counter_ids(counter))
>  			return -1;
>  	}
>  
> diff --git a/tools/perf/util/counts.h b/tools/perf/util/counts.h
> index 34d8baaf558a..cb45a6aecf9d 100644
> --- a/tools/perf/util/counts.h
> +++ b/tools/perf/util/counts.h
> @@ -12,6 +12,7 @@ struct perf_counts_values {
>  		};
>  		u64 values[3];
>  	};
> +	bool	loaded;
>  };
>  
>  struct perf_counts {
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 89aecf3a35c7..3735c9e0080d 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -49,6 +49,7 @@ static struct {
>  	bool clockid_wrong;
>  	bool lbr_flags;
>  	bool write_backward;
> +	bool group_read;
>  } perf_missing_features;
>  
>  static clockid_t clockid;
> @@ -1321,6 +1322,7 @@ perf_evsel__set_count(struct perf_evsel *counter, int cpu, int thread,
>  	count->val    = val;
>  	count->ena    = ena;
>  	count->run    = run;
> +	count->loaded = true;
>  }
>  
>  static int
> @@ -1677,6 +1679,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  	if (perf_missing_features.lbr_flags)
>  		evsel->attr.branch_sample_type &= ~(PERF_SAMPLE_BRANCH_NO_FLAGS |
>  				     PERF_SAMPLE_BRANCH_NO_CYCLES);
> +	if (perf_missing_features.group_read && evsel->attr.inherit)
> +		evsel->attr.read_format &= ~(PERF_FORMAT_GROUP|PERF_FORMAT_ID);
>  retry_sample_id:
>  	if (perf_missing_features.sample_id_all)
>  		evsel->attr.sample_id_all = 0;
> @@ -1832,6 +1836,12 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
>  		perf_missing_features.lbr_flags = true;
>  		pr_debug2("switching off branch sample type no (cycles/flags)\n");
>  		goto fallback_missing_features;
> +	} else if (!perf_missing_features.group_read &&
> +		    evsel->attr.inherit &&
> +		   (evsel->attr.read_format & PERF_FORMAT_GROUP)) {
> +		perf_missing_features.group_read = true;
> +		pr_debug2("switching off group read\n");
> +		goto fallback_missing_features;
>  	}
>  out_close:
>  	do {
> -- 
> 2.9.4
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/4] perf stat: Use group read for event groups
  2017-07-23  0:53   ` Namhyung Kim
@ 2017-07-24  7:31     ` Jiri Olsa
  0 siblings, 0 replies; 12+ messages in thread
From: Jiri Olsa @ 2017-07-24  7:31 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, lkml, Ingo Molnar,
	Peter Zijlstra, Alexander Shishkin, David Ahern, Andi Kleen,
	kernel-team

On Sun, Jul 23, 2017 at 09:53:00AM +0900, Namhyung Kim wrote:

SNIP

> > Link: http://lkml.kernel.org/n/tip-b6g8qarwvptr81cqdtfst59u@git.kernel.org
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/perf/builtin-stat.c | 30 +++++++++++++++++++++++++++---
> >  tools/perf/util/counts.h  |  1 +
> >  tools/perf/util/evsel.c   | 10 ++++++++++
> >  3 files changed, 38 insertions(+), 3 deletions(-)
> > 
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 48ac53b199fc..866da7aa54bf 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -213,10 +213,20 @@ static void perf_stat__reset_stats(void)
> >  static int create_perf_stat_counter(struct perf_evsel *evsel)
> >  {
> >  	struct perf_event_attr *attr = &evsel->attr;
> > +	struct perf_evsel *leader = evsel->leader;
> >  
> > -	if (stat_config.scale)
> > +	if (stat_config.scale) {
> >  		attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
> >  				    PERF_FORMAT_TOTAL_TIME_RUNNING;
> > +	}
> > +
> > +	/*
> > +	 * The event is part of non trivial group, let's enable
> > +	 * the group read (for leader) and ID retrieval for all
> > +	 * members.
> > +	 */
> > +	if (leader->nr_members > 1)
> > +		attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
> 
> I just wonder ID is really necessary.  Doesn't it have same order we
> can traverse with the for_each_group_member()?

right, but I don't like to rely on user and kernel space
order to stay the same.. I don't think it's guaranteed
in uapi

jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] perf tools: Add perf_evsel__read_counter function
  2017-07-21 12:12 ` [PATCH 3/4] perf tools: Add perf_evsel__read_counter function Jiri Olsa
@ 2017-07-26  1:41   ` Arnaldo Carvalho de Melo
  2017-07-26  1:42     ` Arnaldo Carvalho de Melo
  2017-07-26  7:30     ` Jiri Olsa
  0 siblings, 2 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-07-26  1:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Em Fri, Jul 21, 2017 at 02:12:11PM +0200, Jiri Olsa escreveu:
> Adding perf_evsel__read_counter function to read single or
> group counter. After calling this function the counter's
> evsel::counts struct is filled with values for the counter
> and member of its group if there are any.
> 
> Link: http://lkml.kernel.org/n/tip-itsuxdyt7rp4mvij1t6k7bcl@git.kernel.org
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/perf/util/evsel.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/evsel.h |   2 +
>  tools/perf/util/stat.c  |   3 ++
>  tools/perf/util/stat.h  |   5 ++-
>  4 files changed, 108 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 4dd0fcc06db9..89aecf3a35c7 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1302,6 +1302,106 @@ int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>  	return 0;
>  }
>  
> +static int
> +perf_evsel__read_one(struct perf_evsel *evsel, int cpu, int thread)
> +{
> +	struct perf_counts_values *count = perf_counts(evsel->counts, cpu, thread);
> +
> +	return perf_evsel__read(evsel, cpu, thread, count);
> +}
> +
> +static void
> +perf_evsel__set_count(struct perf_evsel *counter, int cpu, int thread,
> +		      u64 val, u64 ena, u64 run)
> +{
> +	struct perf_counts_values *count;
> +
> +	count = perf_counts(counter->counts, cpu, thread);
> +
> +	count->val    = val;
> +	count->ena    = ena;
> +	count->run    = run;
> +}
> +
> +static int
> +perf_evsel__process_group_data(struct perf_evsel *leader,
> +			       int cpu, int thread, u64 *data)
> +{
> +	u64 read_format = leader->attr.read_format;
> +	struct sample_read_value *v;
> +	u64 nr, ena = 0, run = 0, i;
> +
> +	nr = *data++;
> +
> +	if (nr != (u64) leader->nr_members)
> +		return -EINVAL;
> +
> +	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
> +		ena = *data++;
> +
> +	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
> +		run = *data++;
> +
> +	v = (struct sample_read_value *) data;
> +
> +	perf_evsel__set_count(leader, cpu, thread,
> +			      v[0].value, ena, run);
> +
> +	for (i = 1; i < nr; i++) {
> +		struct perf_evsel *counter;
> +
> +		counter = perf_evlist__id2evsel(leader->evlist, v[i].id);
> +		if (!counter)
> +			return -EINVAL;
> +
> +		perf_evsel__set_count(counter, cpu, thread,
> +				      v[i].value, ena, run);
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +perf_evsel__read_group(struct perf_evsel *leader, int cpu, int thread)
> +{
> +	struct perf_stat_evsel *ps = leader->priv;
> +	u64 read_format = leader->attr.read_format;
> +	int size = perf_evsel__read_size(leader);
> +	u64 *data = ps->group_data;
> +
> +	if (!(read_format & PERF_FORMAT_ID))
> +		return -EINVAL;
> +
> +	if (!perf_evsel__is_group_leader(leader))
> +		return -EINVAL;
> +
> +	if (!data) {
> +		data = zalloc(size);
> +		if (!data)
> +			return -ENOMEM;
> +
> +		ps->group_data = data;
> +	}
> +
> +	if (FD(leader, cpu, thread) < 0)
> +		return -EINVAL;
> +
> +	if (readn(FD(leader, cpu, thread), data, size) <= 0)
> +		return -errno;
> +
> +	return perf_evsel__process_group_data(leader, cpu, thread, data);
> +}
> +
> +int perf_evsel__read_counter(struct perf_evsel *evsel, int cpu, int thread)
> +{
> +	u64 read_format = evsel->attr.read_format;
> +
> +	if (read_format & PERF_FORMAT_GROUP)
> +		return perf_evsel__read_group(evsel, cpu, thread);
> +	else
> +		return perf_evsel__read_one(evsel, cpu, thread);
> +}
> +
>  int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
>  			      int cpu, int thread, bool scale)
>  {
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index fb40ca3c6519..de03c18daaf0 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -299,6 +299,8 @@ static inline bool perf_evsel__match2(struct perf_evsel *e1,
>  int perf_evsel__read(struct perf_evsel *evsel, int cpu, int thread,
>  		     struct perf_counts_values *count);
>  
> +int perf_evsel__read_counter(struct perf_evsel *evsel, int cpu, int thread);
> +
>  int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
>  			      int cpu, int thread, bool scale);
>  
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 53b9a994a3dc..2c258554f94d 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -128,6 +128,9 @@ static int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel)
>  
>  static void perf_evsel__free_stat_priv(struct perf_evsel *evsel)
>  {
> +	struct perf_stat_evsel *ps = evsel->priv;
> +
> +	free(ps->group_data);

Humm, are you sure you can always do that, i.e. that evsel->priv is not
NULL?

Program received signal SIGSEGV, Segmentation fault.
0x000000000054ca04 in perf_evsel__free_stat_priv (evsel=0x2454be0) at util/stat.c:133
133		free(ps->group_data);
Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.x86_64 elfutils-libelf-0.169-1.fc25.x86_64 elfutils-libs-0.169-1.fc25.x86_64 libunwind-1.2-1.fc25.x86_64 perl-libs-5.24.1-386.fc25.x86_64 python-libs-2.7.13-2.fc25.x86_64 slang-2.3.0-7.fc25.x86_64
(gdb) bt
#0  0x000000000054ca04 in perf_evsel__free_stat_priv (evsel=0x2454be0) at util/stat.c:133
#1  0x000000000054cc93 in perf_evlist__free_stats (evlist=0x24541a0) at util/stat.c:189
#2  0x0000000000460311 in cmd_script (argc=0, argv=0x7fffffffe170) at builtin-script.c:3075
#3  0x00000000004be56c in run_builtin (p=0xa359f8 <commands+408>, argc=1, argv=0x7fffffffe170) at perf.c:296
#4  0x00000000004be7d9 in handle_internal_command (argc=1, argv=0x7fffffffe170) at perf.c:348
#5  0x00000000004be92b in run_argv (argcp=0x7fffffffdfcc, argv=0x7fffffffdfc0) at perf.c:392
#6  0x00000000004bed05 in main (argc=1, argv=0x7fffffffe170) at perf.c:530
(gdb) 

I'm dropping the series, please retest with 'perf script'.

- Arnaldo

>  	zfree(&evsel->priv);
>  }
>  
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 7522bf10b03e..eacaf958e19d 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -28,8 +28,9 @@ enum perf_stat_evsel_id {
>  };
>  
>  struct perf_stat_evsel {
> -	struct stats		res_stats[3];
> -	enum perf_stat_evsel_id	id;
> +	struct stats		 res_stats[3];
> +	enum perf_stat_evsel_id	 id;
> +	u64			*group_data;
>  };
>  
>  enum aggr_mode {
> -- 
> 2.9.4

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] perf tools: Add perf_evsel__read_counter function
  2017-07-26  1:41   ` Arnaldo Carvalho de Melo
@ 2017-07-26  1:42     ` Arnaldo Carvalho de Melo
  2017-07-26  7:30     ` Jiri Olsa
  1 sibling, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-07-26  1:42 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

Em Tue, Jul 25, 2017 at 10:41:48PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Jul 21, 2017 at 02:12:11PM +0200, Jiri Olsa escreveu:
> > +++ b/tools/perf/util/stat.c
> > @@ -128,6 +128,9 @@ static int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel)

> >  static void perf_evsel__free_stat_priv(struct perf_evsel *evsel)
> >  {
> > +	struct perf_stat_evsel *ps = evsel->priv;
> > +
> > +	free(ps->group_data);
 
> Humm, are you sure you can always do that, i.e. that evsel->priv is not
> NULL?
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000054ca04 in perf_evsel__free_stat_priv (evsel=0x2454be0) at util/stat.c:133
> 133		free(ps->group_data);
> Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.x86_64 elfutils-libelf-0.169-1.fc25.x86_64 elfutils-libs-0.169-1.fc25.x86_64 libunwind-1.2-1.fc25.x86_64 perl-libs-5.24.1-386.fc25.x86_64 python-libs-2.7.13-2.fc25.x86_64 slang-2.3.0-7.fc25.x86_64
> (gdb) bt
> #0  0x000000000054ca04 in perf_evsel__free_stat_priv (evsel=0x2454be0) at util/stat.c:133
> #1  0x000000000054cc93 in perf_evlist__free_stats (evlist=0x24541a0) at util/stat.c:189
> #2  0x0000000000460311 in cmd_script (argc=0, argv=0x7fffffffe170) at builtin-script.c:3075
> #3  0x00000000004be56c in run_builtin (p=0xa359f8 <commands+408>, argc=1, argv=0x7fffffffe170) at perf.c:296
> #4  0x00000000004be7d9 in handle_internal_command (argc=1, argv=0x7fffffffe170) at perf.c:348
> #5  0x00000000004be92b in run_argv (argcp=0x7fffffffdfcc, argv=0x7fffffffdfc0) at perf.c:392
> #6  0x00000000004bed05 in main (argc=1, argv=0x7fffffffe170) at perf.c:530
> (gdb) 
> 
> I'm dropping the series, please retest with 'perf script'.

I'll keep the first one, kinda unrelated :-)
 
> - Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/4] perf tools: Add perf_evsel__read_counter function
  2017-07-26  1:41   ` Arnaldo Carvalho de Melo
  2017-07-26  1:42     ` Arnaldo Carvalho de Melo
@ 2017-07-26  7:30     ` Jiri Olsa
  1 sibling, 0 replies; 12+ messages in thread
From: Jiri Olsa @ 2017-07-26  7:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, lkml, Ingo Molnar, Peter Zijlstra, Alexander Shishkin,
	Namhyung Kim, David Ahern, Andi Kleen

On Tue, Jul 25, 2017 at 10:41:48PM -0300, Arnaldo Carvalho de Melo wrote:

SNIP

> > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> > index 53b9a994a3dc..2c258554f94d 100644
> > --- a/tools/perf/util/stat.c
> > +++ b/tools/perf/util/stat.c
> > @@ -128,6 +128,9 @@ static int perf_evsel__alloc_stat_priv(struct perf_evsel *evsel)
> >  
> >  static void perf_evsel__free_stat_priv(struct perf_evsel *evsel)
> >  {
> > +	struct perf_stat_evsel *ps = evsel->priv;
> > +
> > +	free(ps->group_data);
> 
> Humm, are you sure you can always do that, i.e. that evsel->priv is not
> NULL?
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x000000000054ca04 in perf_evsel__free_stat_priv (evsel=0x2454be0) at util/stat.c:133
> 133		free(ps->group_data);
> Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.x86_64 elfutils-libelf-0.169-1.fc25.x86_64 elfutils-libs-0.169-1.fc25.x86_64 libunwind-1.2-1.fc25.x86_64 perl-libs-5.24.1-386.fc25.x86_64 python-libs-2.7.13-2.fc25.x86_64 slang-2.3.0-7.fc25.x86_64
> (gdb) bt
> #0  0x000000000054ca04 in perf_evsel__free_stat_priv (evsel=0x2454be0) at util/stat.c:133
> #1  0x000000000054cc93 in perf_evlist__free_stats (evlist=0x24541a0) at util/stat.c:189
> #2  0x0000000000460311 in cmd_script (argc=0, argv=0x7fffffffe170) at builtin-script.c:3075
> #3  0x00000000004be56c in run_builtin (p=0xa359f8 <commands+408>, argc=1, argv=0x7fffffffe170) at perf.c:296
> #4  0x00000000004be7d9 in handle_internal_command (argc=1, argv=0x7fffffffe170) at perf.c:348
> #5  0x00000000004be92b in run_argv (argcp=0x7fffffffdfcc, argv=0x7fffffffdfc0) at perf.c:392
> #6  0x00000000004bed05 in main (argc=1, argv=0x7fffffffe170) at perf.c:530
> (gdb) 
> 
> I'm dropping the series, please retest with 'perf script'.

hum, will check

thanks,
jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:perf/core] perf evsel: Add verbose output for sys_perf_event_open fallback
  2017-07-21 12:12 ` [PATCH 1/4] perf tools: Add verbose output for sys_perf_event_open fallback Jiri Olsa
@ 2017-07-26 17:23   ` tip-bot for Jiri Olsa
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Jiri Olsa @ 2017-07-26 17:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, dsahern, andi, hpa, jolsa, namhyung, a.p.zijlstra, tglx,
	linux-kernel, alexander.shishkin, acme

Commit-ID:  2b04e0f88291c0dc7e12459bd3c3661d42209b4e
Gitweb:     http://git.kernel.org/tip/2b04e0f88291c0dc7e12459bd3c3661d42209b4e
Author:     Jiri Olsa <jolsa@kernel.org>
AuthorDate: Fri, 21 Jul 2017 14:12:09 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 25 Jul 2017 11:23:53 -0300

perf evsel: Add verbose output for sys_perf_event_open fallback

Adding info about what is being switched off in the sys_perf_event_open
fallback.

New output (notice the 'switching off' lines):

  $ perf stat -e '{cycles,instructions}' -vvv ls
  Using CPUID GenuineIntel-6-3D
  intel_pt default config: tsc
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off cloexec flag
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
  sys_perf_event_open failed, error -22
  switching off exclude_guest, exclude_host
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
  sys_perf_event_open failed, error -22
  switching off sample_id_all
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170721121212.21414-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evsel.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6dd069a..450b5fa 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1671,31 +1671,39 @@ try_fallback:
 	 */
 	if (!perf_missing_features.write_backward && evsel->attr.write_backward) {
 		perf_missing_features.write_backward = true;
+		pr_debug2("switching off write_backward\n");
 		goto out_close;
 	} else if (!perf_missing_features.clockid_wrong && evsel->attr.use_clockid) {
 		perf_missing_features.clockid_wrong = true;
+		pr_debug2("switching off clockid\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.clockid && evsel->attr.use_clockid) {
 		perf_missing_features.clockid = true;
+		pr_debug2("switching off use_clockid\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.cloexec && (flags & PERF_FLAG_FD_CLOEXEC)) {
 		perf_missing_features.cloexec = true;
+		pr_debug2("switching off cloexec flag\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.mmap2 && evsel->attr.mmap2) {
 		perf_missing_features.mmap2 = true;
+		pr_debug2("switching off mmap2\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.exclude_guest &&
 		   (evsel->attr.exclude_guest || evsel->attr.exclude_host)) {
 		perf_missing_features.exclude_guest = true;
+		pr_debug2("switching off exclude_guest, exclude_host\n");
 		goto fallback_missing_features;
 	} else if (!perf_missing_features.sample_id_all) {
 		perf_missing_features.sample_id_all = true;
+		pr_debug2("switching off sample_id_all\n");
 		goto retry_sample_id;
 	} else if (!perf_missing_features.lbr_flags &&
 			(evsel->attr.branch_sample_type &
 			 (PERF_SAMPLE_BRANCH_NO_CYCLES |
 			  PERF_SAMPLE_BRANCH_NO_FLAGS))) {
 		perf_missing_features.lbr_flags = true;
+		pr_debug2("switching off branch sample type no (cycles/flags)\n");
 		goto fallback_missing_features;
 	}
 out_close:

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-07-26 17:25 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-21 12:12 [PATCH 0/4] perf stat: Enable group read of counters Jiri Olsa
2017-07-21 12:12 ` [PATCH 1/4] perf tools: Add verbose output for sys_perf_event_open fallback Jiri Olsa
2017-07-26 17:23   ` [tip:perf/core] perf evsel: " tip-bot for Jiri Olsa
2017-07-21 12:12 ` [PATCH 2/4] perf tools: Add perf_evsel__read_size function Jiri Olsa
2017-07-21 12:12 ` [PATCH 3/4] perf tools: Add perf_evsel__read_counter function Jiri Olsa
2017-07-26  1:41   ` Arnaldo Carvalho de Melo
2017-07-26  1:42     ` Arnaldo Carvalho de Melo
2017-07-26  7:30     ` Jiri Olsa
2017-07-21 12:12 ` [PATCH 4/4] perf stat: Use group read for event groups Jiri Olsa
2017-07-23  0:53   ` Namhyung Kim
2017-07-24  7:31     ` Jiri Olsa
2017-07-21 17:07 ` [PATCH 0/4] perf stat: Enable group read of counters Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.