* [RFC 0/5] perf: Allow leader sampling on inherited events
@ 2014-08-22 13:05 Jiri Olsa
  2014-08-22 13:05 ` [PATCH 1/5] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford, David Ahern,
	Frederic Weisbecker, Ingo Molnar, Jen-Cheng(Tommy) Huang,
	Namhyung Kim, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Jiri Olsa

hi,
Jen-Cheng(Tommy) Huang reported that leader sampling does not work
on child processes:
  http://www.mail-archive.com/linux-perf-users@vger.kernel.org/msg01644.html

The leader sampling (example below) lets the group leader event (cycles)
do the sampling and reads the rest of the group (cache-misses) via
PERF_FORMAT_GROUP format.

Example:
  $ perf record -e '{cycles,cache-misses}:S' <workload>
  $ perf report --group

  perf report --group shows the data of all events in the group
  in a single view.
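
As a rough illustration of what a PERF_FORMAT_GROUP read block looks
like to a consumer, here is a minimal C sketch of a decoder. The layout
follows the perf_event_open ABI description; the FMT_* names,
struct group_value and decode_group_read() are hypothetical names made
up for this example and are not part of the kernel or the perf tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* read_format bits, same values as the PERF_FORMAT_* ABI constants */
#define FMT_TOTAL_TIME_ENABLED (1U << 0)
#define FMT_TOTAL_TIME_RUNNING (1U << 1)
#define FMT_ID                 (1U << 2)
#define FMT_GROUP              (1U << 3)

/* Hypothetical holder for one decoded group member. */
struct group_value {
	uint64_t value;
	uint64_t id;	/* left at 0 when FMT_ID is not set */
};

/*
 * Decode a group read block laid out as:
 *   u64 nr; [u64 time_enabled;] [u64 time_running;]
 *   { u64 value; [u64 id;] } cntr[nr];
 * Returns the number of members decoded, or -1 if the block
 * does not fit into the buffer or into 'out'.
 */
static int decode_group_read(const uint64_t *buf, size_t len_u64,
			     unsigned int read_format,
			     struct group_value *out, size_t max_out)
{
	size_t per = (read_format & FMT_ID) ? 2 : 1;
	size_t pos = 0;
	uint64_t nr, i;

	assert(read_format & FMT_GROUP);

	if (len_u64 < 1)
		return -1;
	nr = buf[pos++];

	/* The optional time fields are skipped, not returned. */
	if (read_format & FMT_TOTAL_TIME_ENABLED)
		pos++;
	if (read_format & FMT_TOTAL_TIME_RUNNING)
		pos++;

	if (nr > max_out || pos > len_u64 || nr > (len_u64 - pos) / per)
		return -1;

	for (i = 0; i < nr; i++) {
		out[i].value = buf[pos++];
		out[i].id = (read_format & FMT_ID) ? buf[pos++] : 0;
	}
	return (int)nr;
}
```

With '{cycles,cache-misses}:S' the first decoded member would be the
leader (cycles) and the second one cache-misses, assuming the leader is
reported first as in the kernel output routine.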

Leader sampling is switched off for inherited events because
the kernel does not allow the PERF_FORMAT_GROUP format (which
leader sampling uses) on inherited events.

I switched on the PERF_FORMAT_GROUP format for inherited events,
with a few other fixes, in these patches:
  perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  perf: Allow PERF_FORMAT_GROUP format on inherited events

And I fixed perf tool code to be able to process data from
children processes.

Anyway, I might have missed some other reason why this was
never switched on in kernel, so sending this as RFC.

thanks for comments,
jirka


Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
Jiri Olsa (5):
      perf: Deny optimized switch for events read by PERF_SAMPLE_READ
      perf: Allow PERF_FORMAT_GROUP format on inherited events
      perf tools: Add support to traverse xyarrays
      perf tools: Add hash of periods for struct perf_sample_id
      perf tools: Allow PERF_FORMAT_GROUP for inherited events

 kernel/events/core.c            | 25 ++++++++++++++-----------
 tools/perf/Makefile.perf        |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/tests.h        |  1 +
 tools/perf/tests/xyarray.c      | 33 +++++++++++++++++++++++++++++++++
 tools/perf/util/evsel.c         | 17 ++++++++++++++---
 tools/perf/util/evsel.h         |  5 ++++-
 tools/perf/util/session.c       | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 tools/perf/util/xyarray.c       |  4 +++-
 tools/perf/util/xyarray.h       |  6 ++++++
 10 files changed, 167 insertions(+), 22 deletions(-)
 create mode 100644 tools/perf/tests/xyarray.c


* [PATCH 1/5] perf: Deny optimized switch for events read by PERF_SAMPLE_READ
  2014-08-22 13:05 [RFC 0/5] perf: Allow leader sampling on inherited events Jiri Olsa
@ 2014-08-22 13:05 ` Jiri Olsa
  2014-08-22 13:05 ` [PATCH 2/5] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

The optimized task context switch for cloned perf events just
swaps whole perf event contexts (of current and next process)
if it finds them suitable. Events from the 'current' context
will now measure data of the 'next' context and vice versa.

This is ok for cases where we are not directly interested in
the event->count value of separate child events, like:
  - standard sampling, where we take 'period' value for the
    event count
  - counting, where we accumulate all events (children)
    into a single count value

But when we read an event using the PERF_SAMPLE_READ sample
type, we are interested in the direct event->count value measured
in a specific task. Switching events between tasks for this kind
of measurement corrupts the data.

Fix this by incrementing/decrementing pin_count of the perf event
context whenever an inherited event with PERF_SAMPLE_READ is
added/removed. A pin_count value != 0 makes the context unsuitable
for the optimized switch.
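
A toy userspace model of that rule, only to make the idea concrete.
struct toy_ctx, struct toy_task and toy_try_swap() are hypothetical
names for this sketch; the real check in the kernel's context switch
path looks at more than pin_count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a perf event context; only pin_count is modelled. */
struct toy_ctx {
	int pin_count;
};

/* Stand-in for a task that owns one context. */
struct toy_task {
	struct toy_ctx *ctx;
};

/*
 * Swap the contexts of two tasks if neither context is pinned.
 * Returns true when the swap happened, false when the caller
 * has to fall back to the slow (schedule out/in) path.
 */
static bool toy_try_swap(struct toy_task *prev, struct toy_task *next)
{
	struct toy_ctx *tmp;

	assert(prev->ctx != NULL && next->ctx != NULL);

	if (prev->ctx->pin_count || next->ctx->pin_count)
		return false;

	tmp = prev->ctx;
	prev->ctx = next->ctx;
	next->ctx = tmp;
	return true;
}
```

In this model, adding an inherited PERF_SAMPLE_READ event corresponds
to pin_count++ on the context, so every later toy_try_swap() involving
that context returns false until the event is removed again.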

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2d7363adf678..a1d220cf739b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1117,6 +1117,12 @@ ctx_group_list(struct perf_event *event, struct perf_event_context *ctx)
 		return &ctx->flexible_groups;
 }
 
+static bool has_inherit_read(struct perf_event *event)
+{
+	return event->attr.inherit &&
+	       (event->attr.sample_type & PERF_SAMPLE_READ);
+}
+
 /*
  * Add a event from the lists for its context.
  * Must be called with ctx->mutex and ctx->lock held.
@@ -1148,6 +1154,9 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (has_branch_stack(event))
 		ctx->nr_branch_stack++;
 
+	if (has_inherit_read(event))
+		ctx->pin_count++;
+
 	list_add_rcu(&event->event_entry, &ctx->event_list);
 	if (!ctx->nr_events)
 		perf_pmu_rotate_start(ctx->pmu);
@@ -1313,6 +1322,9 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx)
 	if (has_branch_stack(event))
 		ctx->nr_branch_stack--;
 
+	if (has_inherit_read(event))
+		ctx->pin_count--;
+
 	ctx->nr_events--;
 	if (event->attr.inherit_stat)
 		ctx->nr_stat--;
-- 
1.8.3.1



* [PATCH 2/5] perf: Allow PERF_FORMAT_GROUP format on inherited events
  2014-08-22 13:05 [RFC 0/5] perf: Allow leader sampling on inherited events Jiri Olsa
  2014-08-22 13:05 ` [PATCH 1/5] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
@ 2014-08-22 13:05 ` Jiri Olsa
  2014-08-22 13:05 ` [PATCH 3/5] perf tools: Add support to traverse xyarrays Jiri Olsa
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

I assume this was disabled because it is difficult to read
child events from the perf overflow routine, i.e. in the
perf_output_read_group function. The read syscall path,
perf_event_read_group, seems to handle this nicely.

My goal is to be able to read all events in a group on the
leader's sample by using PERF_SAMPLE_READ with the
PERF_FORMAT_GROUP format. Once the monitored process forks, I
need the child processes/events to do the same and store samples
into the parent's ring buffer.

So I need each event in a sample to report just its own value
(without child events being included). Thus the perf_event_count
call is switched to a simple read of event->count.
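
The difference between the two reads can be sketched with a toy
structure. This assumes the aggregated read adds the counts accumulated
from child events on top of the event's own count; struct toy_event and
both helpers are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a perf event with its own and inherited counts. */
struct toy_event {
	uint64_t count;		/* counted by this event itself */
	uint64_t child_count;	/* accumulated from child events */
};

/* Aggregated value: own count plus everything from children. */
static uint64_t toy_event_total(const struct toy_event *e)
{
	return e->count + e->child_count;
}

/* Value of this event only, what the sample should carry here. */
static uint64_t toy_event_own(const struct toy_event *e)
{
	return e->count;
}
```

Each child writes its own samples into the parent's ring buffer, so
reporting the aggregated value from the parent would count the
children's contribution a second time on the tool side.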

Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/core.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a1d220cf739b..315502bf733b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4611,9 +4611,6 @@ static void perf_output_read_one(struct perf_output_handle *handle,
 	__output_copy(handle, values, n * sizeof(u64));
 }
 
-/*
- * XXX PERF_FORMAT_GROUP vs inherited events seems difficult.
- */
 static void perf_output_read_group(struct perf_output_handle *handle,
 			    struct perf_event *event,
 			    u64 enabled, u64 running)
@@ -4634,7 +4631,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 	if (leader != event)
 		leader->pmu->read(leader);
 
-	values[n++] = perf_event_count(leader);
+	values[n++] = local64_read(&leader->count);
 	if (read_format & PERF_FORMAT_ID)
 		values[n++] = primary_event_id(leader);
 
@@ -4647,7 +4644,7 @@ static void perf_output_read_group(struct perf_output_handle *handle,
 		    (sub->state == PERF_EVENT_STATE_ACTIVE))
 			sub->pmu->read(sub);
 
-		values[n++] = perf_event_count(sub);
+		values[n++] = local64_read(&sub->count);
 		if (read_format & PERF_FORMAT_ID)
 			values[n++] = primary_event_id(sub);
 
@@ -6956,12 +6953,6 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 
 	local64_set(&hwc->period_left, hwc->sample_period);
 
-	/*
-	 * we currently do not support PERF_FORMAT_GROUP on inherited events
-	 */
-	if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
-		goto err_ns;
-
 	pmu = perf_init_event(event);
 	if (!pmu)
 		goto err_ns;
-- 
1.8.3.1



* [PATCH 3/5] perf tools: Add support to traverse xyarrays
  2014-08-22 13:05 [RFC 0/5] perf: Allow leader sampling on inherited events Jiri Olsa
  2014-08-22 13:05 ` [PATCH 1/5] perf: Deny optimized switch for events read by PERF_SAMPLE_READ Jiri Olsa
  2014-08-22 13:05 ` [PATCH 2/5] perf: Allow PERF_FORMAT_GROUP format on inherited events Jiri Olsa
@ 2014-08-22 13:05 ` Jiri Olsa
  2014-08-22 13:05 ` [PATCH 4/5] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Add the xyarray__for_each define to allow sequential traversal
of xyarrays. It will be handy in the following patch.
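
The traversal relies on the array being one flat, row-major buffer.
A minimal userspace sketch of the same idea follows; struct grid and
the grid_* helpers are hypothetical names that mirror the layout
without being the perf xyarray code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Flat 2D array: header followed by xlen * ylen entries. */
struct grid {
	size_t row_size;	/* bytes per row */
	size_t entry_size;	/* bytes per entry */
	size_t size;		/* bytes in contents[] */
	char contents[];
};

static struct grid *grid_new(size_t xlen, size_t ylen, size_t entry_size)
{
	size_t row_size = ylen * entry_size;
	size_t size = xlen * row_size;
	struct grid *g = calloc(1, sizeof(*g) + size);

	if (g != NULL) {
		g->entry_size = entry_size;
		g->row_size = row_size;
		g->size = size;
	}
	return g;
}

static void *grid_entry(struct grid *g, size_t x, size_t y)
{
	return &g->contents[x * g->row_size + y * g->entry_size];
}

/*
 * Walk all entries in memory order and store the running index
 * in each one. Returns the number of entries visited.
 */
static size_t grid_number_entries(struct grid *g)
{
	char *p, *end = g->contents + g->size;
	size_t i = 0;

	assert(g->entry_size == sizeof(int));

	for (p = g->contents; p < end; p += g->entry_size)
		*(int *)p = (int)i++;
	return i;
}
```

After numbering a grid with ylen columns, the entry at (x, y) holds
x * ylen + y, which is the order a sequential walk visits entries in.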

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/Makefile.perf        |  1 +
 tools/perf/tests/builtin-test.c |  4 ++++
 tools/perf/tests/tests.h        |  1 +
 tools/perf/tests/xyarray.c      | 33 +++++++++++++++++++++++++++++++++
 tools/perf/util/xyarray.c       |  4 +++-
 tools/perf/util/xyarray.h       |  6 ++++++
 6 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/tests/xyarray.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 1ea31e275b4d..555acea22f20 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -425,6 +425,7 @@ endif
 endif
 LIB_OBJS += $(OUTPUT)tests/mmap-thread-lookup.o
 LIB_OBJS += $(OUTPUT)tests/thread-mg-share.o
+LIB_OBJS += $(OUTPUT)tests/xyarray.o
 
 BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
 BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 99481361b19f..43d40273565f 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -154,6 +154,10 @@ static struct test {
 		.func = test__hists_cumulate,
 	},
 	{
+		.desc = "Test xyarray",
+		.func = test__xyarray,
+	},
+	{
 		.func = NULL,
 	},
 };
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index ed64790a395f..d11378f04126 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -48,6 +48,7 @@ int test__mmap_thread_lookup(void);
 int test__thread_mg_share(void);
 int test__hists_output(void);
 int test__hists_cumulate(void);
+int test__xyarray(void);
 
 #if defined(__x86_64__) || defined(__i386__) || defined(__arm__)
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
diff --git a/tools/perf/tests/xyarray.c b/tools/perf/tests/xyarray.c
new file mode 100644
index 000000000000..e1a1d6a45106
--- /dev/null
+++ b/tools/perf/tests/xyarray.c
@@ -0,0 +1,33 @@
+#include "tests.h"
+#include "xyarray.h"
+#include "debug.h"
+
+struct krava {
+	int a;
+};
+
+#define X 100
+#define Y 100
+
+int test__xyarray(void)
+{
+	struct xyarray *a;
+	struct krava *k;
+	int x, y;
+
+	a = xyarray__new(X, Y, sizeof(struct krava));
+	TEST_ASSERT_VAL("failed to allocate xyarray", a);
+
+	for (x = 0; x < X; x++) {
+		for (y = 0; y < Y; y++) {
+			k = xyarray__entry(a, x, y);
+			k->a = x * Y + y;
+		}
+	}
+
+	y = 0;
+	xyarray__for_each(a, k)
+		TEST_ASSERT_VAL("wrong array value", k->a == y++);
+
+	return 0;
+}
diff --git a/tools/perf/util/xyarray.c b/tools/perf/util/xyarray.c
index 22afbf6c536a..077e8240fe98 100644
--- a/tools/perf/util/xyarray.c
+++ b/tools/perf/util/xyarray.c
@@ -4,11 +4,13 @@
 struct xyarray *xyarray__new(int xlen, int ylen, size_t entry_size)
 {
 	size_t row_size = ylen * entry_size;
-	struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size);
+	size_t size = xlen * row_size;
+	struct xyarray *xy = zalloc(sizeof(*xy) + size);
 
 	if (xy != NULL) {
 		xy->entry_size = entry_size;
 		xy->row_size   = row_size;
+		xy->size       = size;
 	}
 
 	return xy;
diff --git a/tools/perf/util/xyarray.h b/tools/perf/util/xyarray.h
index c488a07275dd..e4efa075fd76 100644
--- a/tools/perf/util/xyarray.h
+++ b/tools/perf/util/xyarray.h
@@ -6,6 +6,7 @@
 struct xyarray {
 	size_t row_size;
 	size_t entry_size;
+	size_t size;
 	char contents[];
 };
 
@@ -17,4 +18,9 @@ static inline void *xyarray__entry(struct xyarray *xy, int x, int y)
 	return &xy->contents[x * xy->row_size + y * xy->entry_size];
 }
 
+#define xyarray__for_each(array, entry)					\
+	for (entry = (void *) &array->contents[0];			\
+	     (void *) entry < ((void *) array->contents + array->size);	\
+	     entry++)
+
 #endif /* _PERF_XYARRAY_H_ */
-- 
1.8.3.1



* [PATCH 4/5] perf tools: Add hash of periods for struct perf_sample_id
  2014-08-22 13:05 [RFC 0/5] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (2 preceding siblings ...)
  2014-08-22 13:05 ` [PATCH 3/5] perf tools: Add support to traverse xyarrays Jiri Olsa
@ 2014-08-22 13:05 ` Jiri Olsa
  2014-08-22 13:05 ` [PATCH 5/5] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
  2014-08-22 13:30 ` [RFC 0/5] perf: Allow leader sampling on " Jiri Olsa
  5 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

With PERF_FORMAT_GROUP format on inherited events being allowed
in kernel, we can now allow leader sampling on inherited events.

But before we actually switch it on, we need to change how
PERF_SAMPLE_READ sample data is tracked. Currently PERF_SAMPLE_READ
values are keyed by event ID only. Now that we get data from all
child processes, we need to add TID as another key.

Add a hash of TIDs into each 'struct perf_sample_id' to
hold event values for different TIDs.
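
To show why a single stored value per ID is not enough, here is a small
standalone sketch of per-TID delta tracking. The toy_* names are
hypothetical, and the bucket is picked with a plain mask instead of the
hash function the patch uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_BITS 8
#define TOY_SIZE (1U << TOY_BITS)

/* Last value seen for one TID, chained per bucket. */
struct toy_period {
	struct toy_period *next;
	uint64_t value;
	int tid;
};

struct toy_hash {
	struct toy_period *heads[TOY_SIZE];
};

/* Find the entry for 'tid', creating a zeroed one if missing. */
static struct toy_period *toy_findnew(struct toy_hash *h, int tid)
{
	unsigned int b = (unsigned int)tid & (TOY_SIZE - 1);
	struct toy_period *p;

	for (p = h->heads[b]; p != NULL; p = p->next)
		if (p->tid == tid)
			return p;

	p = calloc(1, sizeof(*p));
	if (p != NULL) {
		p->tid = tid;
		p->next = h->heads[b];
		h->heads[b] = p;
	}
	return p;
}

/*
 * Turn a running counter value into a period for this TID.
 * Returns 0 on success, -1 on allocation failure.
 */
static int toy_delta(struct toy_hash *h, int tid, uint64_t value,
		     uint64_t *delta)
{
	struct toy_period *p = toy_findnew(h, tid);

	if (p == NULL)
		return -1;
	*delta = value - p->value;
	p->value = value;
	return 0;
}

/* Free all entries; the toy_hash itself belongs to the caller. */
static void toy_hash_free(struct toy_hash *h)
{
	unsigned int b;

	for (b = 0; b < TOY_SIZE; b++) {
		while (h->heads[b] != NULL) {
			struct toy_period *p = h->heads[b];

			h->heads[b] = p->next;
			free(p);
		}
	}
}
```

With one shared value, a sample with value 300 from a child arriving
after a sample with value 500 from the parent would yield a bogus
(wrapped) period; with per-TID entries each task's deltas stay
independent.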

Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c   | 13 +++++++
 tools/perf/util/evsel.h   |  5 ++-
 tools/perf/util/session.c | 93 ++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 104 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index b38de5819323..507d458ded2c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -787,8 +787,21 @@ void perf_evsel__free_fd(struct perf_evsel *evsel)
 	evsel->fd = NULL;
 }
 
+static void free_sample_id(struct perf_evsel *evsel)
+{
+	struct perf_sample_id *sid;
+
+	if (evsel->sample_id) {
+		xyarray__for_each(evsel->sample_id, sid) {
+			if (sid->periods)
+				perf_sample_hash__delete(sid->periods);
+		}
+	}
+}
+
 void perf_evsel__free_id(struct perf_evsel *evsel)
 {
+	free_sample_id(evsel);
 	xyarray__delete(evsel->sample_id);
 	evsel->sample_id = NULL;
 	zfree(&evsel->id);
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 7bc314be6a7b..41c000fc018b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -29,6 +29,7 @@ struct perf_counts {
 };
 
 struct perf_evsel;
+struct perf_sample_hash;
 
 /*
  * Per fd, to map back from PERF_SAMPLE_ID to evsel, only used when there are
@@ -40,9 +41,11 @@ struct perf_sample_id {
 	struct perf_evsel	*evsel;
 
 	/* Holds total ID period value for PERF_SAMPLE_READ processing. */
-	u64			period;
+	struct perf_sample_hash	*periods;
 };
 
+void perf_sample_hash__delete(struct perf_sample_hash *hash);
+
 /** struct perf_evsel - event selector
  *
  * @name - Can be set to retain the original event name passed by the user,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 6d2d50dea1d8..dcd2662b3b2e 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1,4 +1,6 @@
 #include <linux/kernel.h>
+#include <linux/bitops.h>
+#include <linux/hash.h>
 #include <traceevent/event-parse.h>
 
 #include <byteswap.h>
@@ -733,6 +735,82 @@ static struct machine *
 	return &session->machines.host;
 }
 
+struct perf_sample_period {
+	struct hlist_node	node;
+	u64			value;
+	pid_t			tid;
+};
+
+#define PERF_SAMPLE__HLIST_BITS 8
+#define PERF_SAMPLE__HLIST_SIZE (1 << PERF_SAMPLE__HLIST_BITS)
+
+struct perf_sample_hash {
+	struct hlist_head heads[PERF_SAMPLE__HLIST_SIZE];
+};
+
+void perf_sample_hash__delete(struct perf_sample_hash *hash)
+{
+	int h;
+
+	for (h = 0; h < PERF_SAMPLE__HLIST_SIZE; h++) {
+		struct perf_sample_period *period;
+		struct hlist_head *head;
+		struct hlist_node *n;
+
+		head = &hash->heads[h];
+		hlist_for_each_entry_safe(period, n, head, node) {
+			hlist_del(&period->node);
+			free(period);
+		}
+	}
+
+	free(hash);
+}
+
+static struct perf_sample_period*
+findnew_hash_period(struct perf_sample_hash *hash, pid_t tid)
+{
+	struct perf_sample_period *period;
+	struct hlist_head *head;
+	int hash_val;
+
+	hash_val = hash_64(tid, PERF_SAMPLE__HLIST_BITS);
+	head = &hash->heads[hash_val];
+
+	hlist_for_each_entry(period, head, node) {
+		if (period->tid == tid)
+			return period;
+	}
+
+	period = zalloc(sizeof(*period));
+	if (period) {
+		period->tid = tid;
+		hlist_add_head(&period->node, &hash->heads[hash_val]);
+	}
+
+	return period;
+}
+
+static struct perf_sample_period*
+get_sample_period(struct perf_sample_id *sid, pid_t tid)
+{
+	struct perf_sample_hash *hash = sid->periods;
+	int i;
+
+	if (hash == NULL) {
+		hash = zalloc(sizeof(*hash));
+		if (hash == NULL)
+			return NULL;
+
+		for (i = 0; i < PERF_SAMPLE__HLIST_SIZE; ++i)
+			INIT_HLIST_HEAD(&hash->heads[i]);
+
+		sid->periods = hash;
+	}
+
+	return findnew_hash_period(hash, tid);
+}
+
 static int deliver_sample_value(struct perf_session *session,
 				struct perf_tool *tool,
 				union perf_event *event,
@@ -741,19 +819,22 @@ static int deliver_sample_value(struct perf_session *session,
 				struct machine *machine)
 {
 	struct perf_sample_id *sid;
+	struct perf_sample_period *period;
 
 	sid = perf_evlist__id2sid(session->evlist, v->id);
-	if (sid) {
-		sample->id     = v->id;
-		sample->period = v->value - sid->period;
-		sid->period    = v->value;
-	}
-
 	if (!sid || sid->evsel == NULL) {
 		++session->stats.nr_unknown_id;
 		return 0;
 	}
 
+	period = get_sample_period(sid, sample->tid);
+	if (period == NULL)
+		return -ENOMEM;
+
+	sample->id     = v->id;
+	sample->period = v->value - period->value;
+	period->value  = v->value;
+
 	return tool->sample(tool, event, sample, sid->evsel, machine);
 }
 
-- 
1.8.3.1



* [PATCH 5/5] perf tools: Allow PERF_FORMAT_GROUP for inherited events
  2014-08-22 13:05 [RFC 0/5] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (3 preceding siblings ...)
  2014-08-22 13:05 ` [PATCH 4/5] perf tools: Add hash of periods for struct perf_sample_id Jiri Olsa
@ 2014-08-22 13:05 ` Jiri Olsa
  2014-08-22 13:30 ` [RFC 0/5] perf: Allow leader sampling on " Jiri Olsa
  5 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jiri Olsa, Andi Kleen, Arnaldo Carvalho de Melo, Corey Ashford,
	David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

Switch on leader sampling on inherited events.

The following command will now get data from all child processes:
  $ perf record -e '{cycles,cache-misses}:S' <workload>

Use the following command to display the data:
  $ perf report --group

Reported-by: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/perf/util/evsel.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 507d458ded2c..29e32b59d762 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -584,10 +584,8 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 		 * Apply group format only if we belong to group
 		 * with more than one members.
 		 */
-		if (leader->nr_members > 1) {
+		if (leader->nr_members > 1)
 			attr->read_format |= PERF_FORMAT_GROUP;
-			attr->inherit = 0;
-		}
 	}
 
 	/*
-- 
1.8.3.1



* Re: [RFC 0/5] perf: Allow leader sampling on inherited events
  2014-08-22 13:05 [RFC 0/5] perf: Allow leader sampling on inherited events Jiri Olsa
                   ` (4 preceding siblings ...)
  2014-08-22 13:05 ` [PATCH 5/5] perf tools: Allow PERF_FORMAT_GROUP for inherited events Jiri Olsa
@ 2014-08-22 13:30 ` Jiri Olsa
       [not found]   ` <CABooUW0qEpo2YhXfxHsf48mw1acuZ63bq=Fot3kH1eHOfryU-A@mail.gmail.com>
  5 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2014-08-22 13:30 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linux-kernel, Andi Kleen, Arnaldo Carvalho de Melo,
	Corey Ashford, David Ahern, Frederic Weisbecker, Ingo Molnar,
	Jen-Cheng(Tommy) Huang, Namhyung Kim, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian

On Fri, Aug 22, 2014 at 03:05:13PM +0200, Jiri Olsa wrote:
> hi,
> Jen-Cheng(Tommy) Huang reported the leader sampling not working
> on children processes:
>   http://www.mail-archive.com/linux-perf-users@vger.kernel.org/msg01644.html
> 
> The leader sampling (example below) lets the group leader event (cycles)
> do the sampling and reads the rest of the group (cache-misses) via
> PERF_FORMAT_GROUP format.
> 
> Example:
>   $ perf record -e '{cycles,cache-misses}:S' <workload>
>   $ perf report --group
> 
>   The perf report --group allows to see all events group
>   data in single view.
> 
> The reason for leader sampling being switched off for inherited
> events, is that the kernel does no allow PERF_FORMAT_GROUP format
> on inherited events (which is used by leader sampling).
> 
> I switched on the PERF_FORMAT_GROUP format for inherited events
> with few other fixies in patches:
>   perf: Deny optimized switch for events read by PERF_SAMPLE_READ
>   perf: Allow PERF_FORMAT_GROUP format on inherited events
> 
> And I fixed perf tool code to be able to process data from
> children processes.
> 
> Anyway, I might have missed some other reason why this was
> never switched on in kernel, so sending this as RFC.

Also reachable in here:
  git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  perf/core_format_group

jirka


* Re: [RFC 0/5] perf: Allow leader sampling on inherited events
       [not found]   ` <CABooUW0qEpo2YhXfxHsf48mw1acuZ63bq=Fot3kH1eHOfryU-A@mail.gmail.com>
@ 2014-08-23 20:07     ` Jiri Olsa
  0 siblings, 0 replies; 8+ messages in thread
From: Jiri Olsa @ 2014-08-23 20:07 UTC (permalink / raw)
  To: Jen-Cheng(Tommy) Huang; +Cc: Jiri Olsa, linux-kernel

On Sat, Aug 23, 2014 at 02:55:32PM -0400, Jen-Cheng(Tommy) Huang wrote:
> Hi Jiri,
> 
> 1. Thank you so much for providing the patch.
> I am trying to test out the patch.
> (I got the source using "git checkout -b perf
> remotes/origin/perf/core_format_group"
> after clone)
> However, the perf in your repo seems having issues with sample_read :S.
> (:S works fine with my original perf that comes with the kernel.)
> When I do
> perf record -e '{instructions,cycles}:S' /bin/ls
> 
> The following error is shown
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
> event (instructions).
> /bin/dmesg may provide additional information.
> No CONFIG_PERF_EVENTS=y kernel support configured?
> 
> This error does not occur without :S.
> perf record -e '{instructions,cycles}' /bin/ls
> This shows no errors.

have you booted the new kernel? there are 2 kernel patches
within the patchset

but right, the perf code should detect the kernel change
and keep the old behaviour if it's not detected..

> 
> 2. Another issue I have is to show the sample value using 'perf script'.
> Currently, I am using perf report -D to show the sample read values, the
> format is not what I need. Could you give me some hints on where to modify
> to show the sample values using perf script?

there's no support in perf script to display the sample
read values.. I'll check if that could be added

thanks,
jirka

